litebird / litebird_sim

Simulation tools for LiteBIRD
GNU General Public License v3.0

Singularity/litebird_sim installation error on JSS3 #159

Open ebikengo opened 2 years ago

ebikengo commented 2 years ago

Hi, this is Ken EBISAWA at ISAS/JAXA. I am trying to install the Singularity/litebird_sim package on JAXA's supercomputer JSS3. This is what I did:

git clone https://github.com/litebird/litebird_sim litebird_sim

Then I followed the instructions at https://litebird-sim.readthedocs.io/en/latest/installation.html:

% cd singularity/
% ./create-singularity-file.sh 20.04 openmpi
% module load singularity   <-- this is to enable singularity on JSS3
% which singularity
/opt/JX/oss/x86_64/singularity/3.6.4/bin/singularity
% singularity build --fakeroot litebird_sim.img Singularity

Then I got the following errors:

test/test_destriper.py::test_destriper FAILED [ 3%]
test/test_dipole.py::test_solar_dipole_fit FAILED [ 20%]

I already reported this problem to Maurizio, who says this is due to a difference in Python version (these errors should not happen with v3.6, but the Python version used by the Singularity container is < 3.4).

I am posting this here for the record, rather than only e-mailing Maurizio and the developers. If a new version of the package is made, please let me know and I will try it.
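The Python-version hypothesis above could be checked directly with a few lines of Python. This is only a minimal sketch: the helper name `python_is_at_least` is hypothetical, and the 3.6 threshold is simply the version quoted in the comment above, not a requirement taken from the litebird_sim documentation.

```python
import sys


def python_is_at_least(required, version_info=None):
    """Return True if the interpreter version is >= `required` (major, minor).

    `version_info` defaults to the running interpreter; passing a tuple
    explicitly makes the check easy to try against other versions.
    """
    source = version_info if version_info is not None else sys.version_info
    current = tuple(source)[:2]  # keep only (major, minor)
    return current >= tuple(required)


if __name__ == "__main__":
    # 3.6 is the threshold mentioned in the comment above (an assumption here).
    print("Running Python", ".".join(str(n) for n in sys.version_info[:3]))
    print("At least 3.6:", python_is_at_least((3, 6)))
```

To be meaningful, the script would have to be run with the same `python3` that the container uses to run the tests.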

ziotom78 commented 2 years ago

I tested this on the latest master, and the two tests pass. Here is the result of tail -n 50 of the output; could you please post the same in your case?

Python 3.8.10
+ gcc --version
gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Copyright (C) 2019 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

+ python3 -c import litebird_sim as lbs; print('Litebird_sim version: ', lbs.__version__)
TOAST INFO: mpi4py not found- using serial operations only
Litebird_sim version:  0.4.0
INFO:    Adding help info
INFO:    Adding environment to container
INFO:    Adding runscript
INFO:    Adding testscript
INFO:    Running testscript
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-5.4.3, py-1.11.0, pluggy-0.13.1
rootdir: /opt/litebird_sim
collected 64 items

test/test_compress.py .                                                  [  1%]
test/test_destriper.py .                                                 [  3%]
test/test_detectors.py .........                                         [ 17%]
test/test_dipole.py ..                                                   [ 20%]
test/test_io.py .......                                                  [ 31%]
test/test_mapping.py ...                                                 [ 35%]
test/test_mbs.py .                                                       [ 37%]
test/test_mock_imo.py .....                                              [ 45%]
test/test_mpi.py ........                                                [ 57%]
test/test_noise.py .                                                     [ 59%]
test/test_observations.py ..                                             [ 62%]
test/test_quaternions.py .....                                           [ 70%]
test/test_scan_map.py .                                                  [ 71%]
test/test_scanning.py .........                                          [ 85%]
test/test_simulations.py ........                                        [ 98%]
test/test_spacecraft.py .                                                [100%]

=============================== warnings summary ===============================
/usr/local/lib/python3.8/dist-packages/_pytest/stepwise.py:108
  /usr/local/lib/python3.8/dist-packages/_pytest/stepwise.py:108: PytestCacheWarning: cache could not write path /opt/litebird_sim/.pytest_cache/v/cache/stepwise
    self.config.cache.set("cache/stepwise", [])

/usr/local/lib/python3.8/dist-packages/_pytest/cacheprovider.py:366
  /usr/local/lib/python3.8/dist-packages/_pytest/cacheprovider.py:366: PytestCacheWarning: cache could not write path /opt/litebird_sim/.pytest_cache/v/cache/nodeids
    config.cache.set("cache/nodeids", self.cached_nodeids)

-- Docs: https://docs.pytest.org/en/latest/warnings.html
================== 64 passed, 2 warnings in 110.84s (0:01:50) ==================
INFO:    Creating SIF file...
INFO:    Build complete: litebird_sim.img
ebikengo commented 2 years ago

Here it is;

[e578@toki singularity]$ tail -n 50 singularity.txt
copying images... [ 55%] images/polarization-direction.svg
copying images... [ 61%] images/scanning-strategy-example.png
copying images... [ 66%] images/right-handed-coordinates.svg
copying images... [ 72%] images/simple-scanning-strategy.png
copying images... [ 77%] images/jupiter-angular-distance.svg
copying images... [ 83%] images/report_example.png
copying images... [ 88%] images/mbs_i.png
copying images... [ 94%] images/tutorial-bare-report.png
copying images... [100%] images/tutorial-coverage-map.png

copying static files... done
copying extra files... done
dumping search index in English (code: en)... done
dumping object inventory... done
build succeeded, 1 warning.

The HTML pages are in build/html.
Copying contentui stylesheet/javascript... done
make: Leaving directory '/opt/litebird_sim/docs'
Running the tests...
============================= test session starts ==============================
platform linux -- Python 3.8.10, pytest-5.4.3, py-1.11.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /opt/litebird_sim
collecting ... collected 64 items

test/test_compress.py::test_rle_compression PASSED [ 1%]
test/test_destriper.py::test_destriper FAILED [ 3%]
test/test_detectors.py::test_detector_from_dict PASSED [ 4%]
test/test_detectors.py::test_detector_from_toml PASSED [ 6%]
test/test_detectors.py::test_detector_from_imo PASSED [ 7%]
test/test_detectors.py::test_freq_channel_creation PASSED [ 9%]
test/test_detectors.py::test_freq_channel_from_imo PASSED [ 10%]
test/test_detectors.py::test_freq_channel_noise PASSED [ 12%]
test/test_detectors.py::test_instrument_creation PASSED [ 14%]
test/test_detectors.py::test_instrument_from_imo PASSED [ 15%]
test/test_detectors.py::test_det_list_from_imo PASSED [ 17%]
test/test_dipole.py::test_dipole_models PASSED [ 18%]
test/test_dipole.py::test_solar_dipole_fit FAILED [ 20%]
test/test_io.py::test_write_healpix_map_to_hdu PASSED [ 21%]
test/test_io.py::test_write_healpix_map PASSED [ 23%]
test/test_io.py::test_write_simple_observation PASSED [ 25%]
test/test_io.py::test_write_complex_observation_mjd PASSED [ 26%]
test/test_io.py::test_write_complex_observation_no_mjd PASSED [ 28%]
test/test_io.py::test_read_complex_observation_mjd PASSED [ 29%]
test/test_io.py::test_read_complex_observation_no_mjd PASSED [ 31%]
test/test_mapping.py::test_accumulate_map_and_info PASSED [ 32%]
test/test_mapping.py::test_make_bin_map_api_simulation PASSED [ 34%]
test/test_mapping.py::test_make_bin_map_basic_mpi PASSED [ 35%]
test/test_mbs.py::test_mbs
[e578@toki singularity]$

ziotom78 commented 2 years ago

That's really weird… When a test fails, the system should print additional information about why it failed and on which line the failure occurred. In this case, it seems that the output was truncated after test/test_mbs.py::test_mbs.

Could you please check that all the output was included?
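One way to check whether a saved log is complete is to look for the final pytest summary bar (the line of the form "N passed ... in X.XXs") and to list the tests reported as FAILED. The sketch below does this with two regular expressions; the function names are hypothetical, and the patterns are only assumed to match verbose output like the logs quoted in this thread.

```python
import re

# Matches entries such as
# "test/test_dipole.py::test_solar_dipole_fit FAILED [ 20%]"
RESULT_RE = re.compile(r"(\S+::\S+)\s+(PASSED|FAILED|ERROR|SKIPPED)\b")

# Matches the closing summary bar, e.g.
# "===== 64 passed, 2 warnings in 110.84s (0:01:50) ====="
SUMMARY_RE = re.compile(r"=+ .*\b(passed|failed|errors?)\b.* in [\d.]+s.*=+")


def summarize_pytest_log(text):
    """Return (results, complete) for the text of a verbose pytest log.

    `results` maps each test id to its reported outcome; `complete` is
    True only if the closing summary bar is present in the text.
    """
    results = {}
    for test_id, outcome in RESULT_RE.findall(text):
        results[test_id] = outcome
    complete = SUMMARY_RE.search(text) is not None
    return results, complete


def failed_tests(results):
    """List the ids of the tests whose outcome is FAILED."""
    return [name for name, outcome in results.items() if outcome == "FAILED"]
```

A test that started but never reported an outcome (as test/test_mbs.py::test_mbs does above) does not appear in `results`, and a missing summary bar gives `complete == False`; both are hints that the run was cut short.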

ebikengo commented 2 years ago

This time, I ran without "tee". This is what I got:

platform linux -- Python 3.8.10, pytest-5.4.3, py-1.11.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /opt/litebird_sim
collected 64 items

test/test_compress.py::test_rle_compression PASSED [ 1%]
test/test_destriper.py::test_destriper FAILED [ 3%]
test/test_detectors.py::test_detector_from_dict PASSED [ 4%]
test/test_detectors.py::test_detector_from_toml PASSED [ 6%]
test/test_detectors.py::test_detector_from_imo PASSED [ 7%]
test/test_detectors.py::test_freq_channel_creation PASSED [ 9%]
test/test_detectors.py::test_freq_channel_from_imo PASSED [ 10%]
test/test_detectors.py::test_freq_channel_noise PASSED [ 12%]
test/test_detectors.py::test_instrument_creation PASSED [ 14%]
test/test_detectors.py::test_instrument_from_imo PASSED [ 15%]
test/test_detectors.py::test_det_list_from_imo PASSED [ 17%]
test/test_dipole.py::test_dipole_models PASSED [ 18%]
test/test_dipole.py::test_solar_dipole_fit FAILED [ 20%]
test/test_io.py::test_write_healpix_map_to_hdu PASSED [ 21%]
test/test_io.py::test_write_healpix_map PASSED [ 23%]
test/test_io.py::test_write_simple_observation PASSED [ 25%]
test/test_io.py::test_write_complex_observation_mjd PASSED [ 26%]
test/test_io.py::test_write_complex_observation_no_mjd PASSED [ 28%]
test/test_io.py::test_read_complex_observation_mjd PASSED [ 29%]
test/test_io.py::test_read_complex_observation_no_mjd PASSED [ 31%]
test/test_mapping.py::test_accumulate_map_and_info PASSED [ 32%]
test/test_mapping.py::test_make_bin_map_api_simulation PASSED [ 34%]
test/test_mapping.py::test_make_bin_map_basic_mpi PASSED [ 35%]
test/test_mbs.py::test_mbs
FATAL: While performing build: while running engine: exit status 1
[e578@toki singularity]$ ./

ziotom78 commented 2 years ago

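Judging from the last message printed by your terminal, it seems that we have two problems here:

1. Two tests (test_destriper and test_solar_dipole_fit) fail.
2. The build aborts with a fatal error while running test/test_mbs.py::test_mbs.

The latter unfortunately makes the code crash and prevents it from printing useful output to investigate the former.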

I have applied a blind patch to the MBS test; could you please run git pull and redo the test? If it still fails, I have added some more options to the Singularity scripts that we might use in that case.

ebikengo commented 2 years ago

Thanks, Maurizio. A problem with JAXA's JSS is that it cannot directly access GitHub! What I did was run git clone on a local machine and copy the result to JSS. This is tedious, so I will submit a request to JSS so that it can access GitHub directly. In any case, for the time being, I will run git pull on my local machine, copy the result to JSS, and redo the test.

ebikengo commented 2 years ago

No good news yet (BTW, I was wrong to say that GitHub cannot be reached from JSS3. It CAN be reached, with no problem).

Somehow MPI gives a strange error, so I tried without MPI:

% ./create-singularity-file.sh 20.04 none
% singularity build --fakeroot litebird_sim.img Singularity
...
============================================================================================ test session starts ============================================================================================
platform linux -- Python 3.8.10, pytest-5.4.3, py-1.11.0, pluggy-0.13.1 -- /usr/bin/python3
cachedir: .pytest_cache
rootdir: /opt/litebird_sim
collected 64 items

test/test_compress.py::test_rle_compression PASSED [ 1%]
test/test_destriper.py::test_destriper PASSED [ 3%]
test/test_detectors.py::test_detector_from_dict PASSED [ 4%]
test/test_detectors.py::test_detector_from_toml PASSED [ 6%]
test/test_detectors.py::test_detector_from_imo PASSED [ 7%]
test/test_detectors.py::test_freq_channel_creation PASSED [ 9%]
test/test_detectors.py::test_freq_channel_from_imo PASSED [ 10%]
test/test_detectors.py::test_freq_channel_noise PASSED [ 12%]
test/test_detectors.py::test_instrument_creation PASSED [ 14%]
test/test_detectors.py::test_instrument_from_imo PASSED [ 15%]
test/test_detectors.py::test_det_list_from_imo PASSED [ 17%]
test/test_dipole.py::test_dipole_models PASSED [ 18%]
test/test_dipole.py::test_solar_dipole_fit FAILED [ 20%]
test/test_io.py::test_write_healpix_map_to_hdu PASSED [ 21%]
test/test_io.py::test_write_healpix_map PASSED [ 23%]
test/test_io.py::test_write_simple_observation PASSED [ 25%]
test/test_io.py::test_write_complex_observation_mjd PASSED [ 26%]
test/test_io.py::test_write_complex_observation_no_mjd PASSED [ 28%]
test/test_io.py::test_read_complex_observation_mjd PASSED [ 29%]
test/test_io.py::test_read_complex_observation_no_mjd PASSED [ 31%]
test/test_mapping.py::test_accumulate_map_and_info PASSED [ 32%]
test/test_mapping.py::test_make_bin_map_api_simulation PASSED [ 34%]
test/test_mapping.py::test_make_bin_map_basic_mpi PASSED [ 35%]
test/test_mbs.py::test_mbs
FATAL: While performing build: while running engine: exit status 1
[e578@toki singularity]$

ebikengo commented 2 years ago

I will attach the entire log. I found the following message in the log. Could this be the problem?

[2022-04-20 17:48:11,006 WARNING MPI#0000] IMO config file "/root/.config/litebird_imo/imo.toml" not found.
--- Logging error ---
Traceback (most recent call last):
  File "/opt/litebird_sim/litebird_sim/imo/imo.py", line 23, in __init__
    with CONFIG_FILE_PATH.open("rt") as inpf:
  File "/usr/lib/python3.8/pathlib.py", line 1222, in open
    return io.open(self, mode, buffering, encoding, errors, newline,
  File "/usr/lib/python3.8/pathlib.py", line 1078, in _opener
    return self._accessor.open(self, flags, mode)
FileNotFoundError: [Errno 2] No such file or directory: '/root/.config/litebird_imo/imo.toml'

singularity_nompi.log .
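Whether the IMO configuration file is actually present for a given user could be checked with a short script such as the one below. It is a minimal sketch that assumes the file lives at `.config/litebird_imo/imo.toml` under the home directory, as the path in the log above suggests. The helper names are hypothetical, and litebird_sim itself may compute the location differently.

```python
from pathlib import Path


def imo_config_path(home=None):
    """Build the assumed location of imo.toml under a home directory.

    The layout (.config/litebird_imo/imo.toml) is inferred from the log
    message quoted above, not from the litebird_sim source code.
    """
    base = Path(home) if home is not None else Path.home()
    return base / ".config" / "litebird_imo" / "imo.toml"


def describe_imo_config(home=None):
    """Return a one-line report saying whether the file exists."""
    path = imo_config_path(home)
    if path.is_file():
        return f"found: {path}"
    return f"missing: {path}"


if __name__ == "__main__":
    print(describe_imo_config())
```

Note that the log reports the missing file at WARNING level, so on its own this check does not show that the missing file is what aborts the build.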

ziotom78 commented 2 years ago

Hi @ebikengo, I have tried to reproduce your problem again and again, but with little success.

However, I realized that there might be limitations in the way singularity is supported on HPC machines. (For instance, I believe this is the case for kekcc, which recommends users to build containers on their own machines.)

Have you tried to build the Singularity image (i.e., the litebird_sim.img) on your own laptop and then just copy that file to JSS3? (Of course, to do this you have to install Singularity on your machine; this should not be a big deal if you are running Linux.)