seshnadathur / Revolver

Real-space void locations from survey reconstruction
GNU General Public License v3.0

Error running revolver.py #16

Open Chanuntorn opened 2 years ago

Chanuntorn commented 2 years ago

An error occurs when running python revolver.py --par parameters/params.py:

==== Starting the void-finding with ZOBOV ==== 
Loading galaxy data from file...
Cutting galaxies outside the redshift limits provided
WARNING: AstropyDeprecationWarning: "verbose" was deprecated in version 1.15.0 and will be removed in a future version.  [python_tools.zobov]
Traceback (most recent call last):
  File "revolver.py", line 163, in <module>
    voidcat = ZobovVoids(parms)
  File "/global/u2/c/chanun/Revolver/python_tools/zobov.py", line 107, in __init__
    mask = hp.read_map(parms.mask_file, verbose=False)
  File "/global/homes/c/chanun/.conda/envs/revenv/lib/python3.8/site-packages/astropy/utils/decorators.py", line 547, in wrapper
    return function(*args, **kwargs)
  File "/global/homes/c/chanun/.conda/envs/revenv/lib/python3.8/site-packages/healpy/fitsfunc.py", line 379, in read_map
    nside = pixelfunc.npix2nside(pix.size)
  File "/global/homes/c/chanun/.conda/envs/revenv/lib/python3.8/site-packages/healpy/pixelfunc.py", line 1121, in npix2nside
    raise ValueError("Wrong pixel number (it is not 12*nside**2)")
ValueError: Wrong pixel number (it is not 12*nside**2)
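
The ValueError above is raised by healpy's check that the map's pixel count has the form 12 * nside**2 for an integer nside. The thread does not establish why the mask read here fails that check. As a minimal sketch in pure Python, the same condition can be tested on any pixel count; the helper name `nside_from_npix` is hypothetical and is not part of healpy or Revolver:

```python
import math
from typing import Optional


def nside_from_npix(npix: int) -> Optional[int]:
    """Return nside if npix == 12 * nside**2 for a positive integer nside, else None."""
    if npix <= 0 or npix % 12 != 0:
        return None
    nside = math.isqrt(npix // 12)
    return nside if 12 * nside * nside == npix else None


# 12 * 64**2 = 49152 is a valid full-sky map size; 49153 is not.
for npix in (49152, 49153):
    print(npix, nside_from_npix(npix))
```

If the mask array can be loaded by other means, a `None` result for its size would suggest that the file on disk does not hold a complete HEALPix map.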

Steps I completed after logging into jupyter.nersc:

git clone https://github.com/seshnadathur/Revolver.git 
module load python
conda create --name revenv python=3.8
source activate revenv
conda install astropy scipy cython
pip install pyfftw healpy
conda install ipython
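
After an install sequence like the one above, it can help to confirm that the active environment actually sees each package before building Revolver. The following is a minimal sketch, assuming the import names listed in `REQUIRED`; that list mirrors the install commands in this comment and is not a list defined by Revolver, and `missing_modules` is a hypothetical helper:

```python
import importlib.util
import sys

# Import names for the packages installed above (an assumption, not Revolver's own list).
REQUIRED = ["astropy", "scipy", "Cython", "pyfftw", "healpy"]


def missing_modules(names):
    """Return the top-level module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    missing = missing_modules(REQUIRED)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Running it inside the activated environment also prints the interpreter version, which makes it easy to spot a case where a different Python is being picked up.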

Then I edited parameters/params.py to match my file paths after downloading the SDSS files. In the Makefile, I changed the line make -C src all to make -C src all_nompi. I then proceeded as follows:

export PYTHONPATH=/global/homes/c/chanun/Revolver:$PYTHONPATH
cd Revolver
make clean
make
python revolver.py --par parameters/params.py
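
Because params.py is edited by hand, a quick check of its paths before a run can rule out simple mistakes. The sketch below assumes the parameter file is a plain Python file of assignments using the keys `tracer_file`, `random_file`, `mask_file` and `output_folder` (the names that appear in the example params.py later in this thread); `check_params` is a hypothetical helper, not part of Revolver:

```python
import os
import runpy


def check_params(path):
    """Return a list of human-readable problems found in a params-style Python file."""
    params = runpy.run_path(path)  # executes the file and returns its top-level names
    problems = []
    # Relative paths are resolved against the current working directory.
    for key in ("tracer_file", "random_file", "mask_file"):
        value = params.get(key)
        if value and not os.path.isfile(value):  # unset keys are skipped
            problems.append(f"{key} not found: {value}")
    out = params.get("output_folder", "")
    if not out.endswith("/"):
        problems.append(f"output_folder has no trailing slash: {out!r}")
    return problems
```

Calling `check_params("parameters/params.py")` from the Revolver directory returns an empty list when every listed file exists and `output_folder` ends with a slash. It only tests that the files exist, so it would not catch a mask file that is present but has the wrong contents.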

moustakas commented 2 years ago

@Chanuntorn the following procedure worked for me just now, at least at NERSC:

  1. Log into NERSC. Following the instructions here, create a dedicated (local) conda environment and install Revolver's dependencies (plus a couple more handy libraries like matplotlib):

    module load python
    conda create -n desivoids python=3.8
    conda activate desivoids
    conda install -y numpy scipy astropy cython ipython matplotlib
    pip install healpy pyfftw
  2. A few notes regarding the preceding step:

    • I called my environment desivoids but you can call it whatever you want.
    • pyfftw doesn't work with python v3.9, so you have to use v3.8 at least for now.
    • After you've created your conda environment (or if you have one from previous projects or work) then you just need to type conda activate desivoids into your terminal without doing module load python. (In detail, conda modifies your .bashrc to run conda init every time you log in, which is not great practice, but here we are.)
  3. Next, create a working directory (obviously choose your own path) and then clone the Revolver code base into that directory. Now, before running make I had to hand-edit line 14 of the Makefile to change make -C src all to make -C src all_nompi, as instructed in the README if MPI wasn't available. I did try module load openmpi as instructed here first, but Revolver couldn't find the appropriate library and header files. It would be great to try to figure this out in future. In any case, do:

    mkdir /global/cscratch1/sd/ioannis/voids
    cd /global/cscratch1/sd/ioannis/voids
    git clone https://github.com/seshnadathur/Revolver.git
    cd Revolver
    [edit Makefile]
    make
  4. Next, to reproduce the SDSS/CMASS results, download the large-scale structure galaxy and random catalogs:

    cd /global/cscratch1/sd/ioannis/voids
    wget https://data.sdss.org/sas/dr12/boss/lss/galaxy_DR12v5_CMASS_South.fits.gz
    wget https://data.sdss.org/sas/dr12/boss/lss/random0_DR12v5_CMASS_South.fits.gz
    gunzip *.fits.gz
  5. Next, you need to modify the /global/cscratch1/sd/ioannis/voids/Revolver/parameters/params.py file to point to these catalogs. Use a text editor to update the paths; you might as well set nthreads to 32 while you're at it. Note that the trailing slash on the output_folder directory name is crucial! Here's what my params.py file looks like:

    handle = 'DR12_CMASS_South'
    output_folder = '/global/cscratch1/sd/ioannis/voids/out/'
    tracer_file = '/global/cscratch1/sd/ioannis/voids/galaxy_DR12v5_CMASS_South.fits'
    random_file = '/global/cscratch1/sd/ioannis/voids/random0_DR12v5_CMASS_South.fits'
    mask_file = 'masks/DR12v5_CMASS_South_mask.fits'
    void_prefix = 'recon-voids'
    verbose = True
    do_recon = True
    use_barycentres = False
    run_voxelvoids = True
    do_tessellation = True
    nthreads = 32
  6. Finally you can run Revolver, which for me took about 4 minutes on a login node (with 32 threads)!

    
    cd /global/cscratch1/sd/ioannis/voids/Revolver
    time python revolver.py --par parameters/params.py
    Loading parameters from parameters/params.py
    
    ==== Running reconstruction for real-space positions ====
    Loading galaxy data from file...
    Loading randoms data from file...
    
    ==== Starting the reconstruction ====
    Using values of growth rate f = 0.780 and bias b = 2.000
    Smoothing scale [Mpc/h]: 10.0
    Number of bins: 512
    Box size [Mpc/h]: 2904.55
    Bin size [Mpc/h]: 5.67
    Loop 0
    Reading wisdom from  wisdom.512.32
    Creating FFTW objects...
    Allocating randoms in cells...
    Smoothing...
    Allocating galaxies in cells...
    Smoothing galaxy density field ...
    Computing density fluctuations, delta...
    Fourier transforming delta field...
    Inverse Fourier transforming to get psi...
    Calculating shifts...
    Loop 1
    Allocating galaxies in cells...
    Smoothing galaxy density field ...
    Computing density fluctuations, delta...
    Fourier transforming delta field...
    Inverse Fourier transforming to get psi...
    Calculating shifts...
    Loop 2
    Allocating galaxies in cells...
    Smoothing galaxy density field ...
    Computing density fluctuations, delta...
    Fourier transforming delta field...
    Inverse Fourier transforming to get psi...
    Calculating shifts...
    ==== Done reconstruction ====

    Reconstruction took 44.655 seconds
    Loading galaxy data from file...

    ==== Starting the void-finding with voxel-based method ====
    217780 tracers found
    Box size [Mpc/h]: 2544.546
    Initial bin size [Mpc/h]: 12.79, nbins = 199
    Final bin size [Mpc/h]: 5.68, nbins = 448
    Smoothing scale [Mpc/h]: 18.29
    Allocating galaxies in cells...
    Allocating randoms in cells...
    Smoothing density fields ...
    Finding density minima
    Total number of voxels: 89915392
    Reading in the dens data from file ...
    Setting the voxel adjacencies ...
    Finding jumper for each voxel
    About to jump ...
    Post-jump ...
    10979 zones found...
    Finding zone borders
    Allocating zone adjacencies and links
    Finding weakest links
    Found zone adjacencies
    delta ranges from -9.945806e-01 to 4.009290e+00.
    Writing the zone memberships
    Post-processing voids
    Total 2613 voids pass all cuts
    ==== Finished with voxel-based method ====
    Voxel voids took 39.823 seconds

    ==== Starting the void-finding with ZOBOV ====
    Loading galaxy data from file...
    Cutting galaxies outside the redshift limits provided
    WARNING: AstropyDeprecationWarning: "verbose" was deprecated in version 1.15.0 and will be removed in a future version. [python_tools.zobov]
    Kept 208491 tracers after all cuts
    Determining survey redshift selection function ...
    Generating buffer mocks around survey edges ...
    buffer mocks will have 10.0 x the galaxy number density
    WARNING: AstropyDeprecationWarning: "verbose" was deprecated in version 1.15.0 and will be removed in a future version. [python_tools.zobov]
    placed 135214 buffer mocks at high-redshift cap
    placed 55543 buffer mocks at low-redshift cap
    WARNING: AstropyDeprecationWarning: "verbose" was deprecated in version 1.15.0 and will be removed in a future version. [python_tools.zobov]
    placed 354366 buffer mocks along the survey boundary edges
    Using box length 2774.22
    added 24292 guards to stabilize the tessellation
    Buffer mocks written to file /global/cscratch1/sd/ioannis/voids/out/DR12_CMASS_South_mocks.npy
    Calling vozisol to do the tessellation...
    Tessellation done.
    Post-processing voids ...
    Identified 2138 potential voids. Now extracting circumcentres ...
    WARNING: AstropyDeprecationWarning: "verbose" was deprecated in version 1.15.0 and will be removed in a future version. [python_tools.zobov]
    Removed 0 edge-contaminated voids
    ==== Finished with ZOBOV-based method ====
    ZOBOV took 132.518 seconds

    real 4m8.975s
    user 6m8.724s
    sys 0m58.498s
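
To see where the roughly four minutes go, the per-stage timings printed in the log can be compared with the wall-clock time reported by `time`. This is a small arithmetic sketch using only numbers copied from the output above; the function name `stage_fractions` is hypothetical:

```python
# Timings copied from the log above, in seconds.
stage_seconds = {
    "reconstruction": 44.655,
    "voxel voids": 39.823,
    "ZOBOV": 132.518,
}
wall_seconds = 4 * 60 + 8.975  # "real 4m8.975s"


def stage_fractions(stages, wall):
    """Return each stage's share of the wall-clock time."""
    return {name: seconds / wall for name, seconds in stages.items()}


for name, frac in stage_fractions(stage_seconds, wall_seconds).items():
    print(f"{name:15s} {frac:6.1%}")

# Whatever is left over is start-up, I/O and other untimed work.
other = 1 - sum(stage_seconds.values()) / wall_seconds
print(f"{'other':15s} {other:6.1%}")
```

On these numbers the ZOBOV step accounts for a little over half of the wall-clock time, while reconstruction takes under a fifth.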

moustakas commented 2 years ago

For the record I did try to do all of this in a Docker/shifter container (including MPI), but the way the Revolver package is set up made it challenging to add the Python code to PYTHONPATH and to also use a parameter file in a non-standard location. But that's a different ticket!

seshnadathur commented 2 years ago

Thanks for looking into this and suggesting the workaround – I have not had time to test Revolver out on NERSC yet. A few questions/comments:

  1. What is the error you get when trying to compile the MPI version? When running this on a (non-NERSC) HPC system I have only needed to do the equivalent of your module load openmpi and things have worked fine.
  2. If you are not compiling with MPI then the only part of the code that is not running single-threaded is the reconstruction prior to void-finding. As you can see from the timings above, reconstruction is not the slowest step. So lots of threads make for a rather inefficient use of resources, and in the interest of budgeting your NERSC allocation it would be better to use fewer.
  3. Having said that, some parts of the code (especially reconstruction and the voxel void-finding) can require a lot of memory because of needing an N^3 grid. On some HPC systems it might be necessary to increase the number of threads to get sufficient memory. This is an efficiency issue for Revolver that needs work.
  4. I don't understand the issue with using a parameter file in a non-standard location. Could you elaborate (maybe on another ticket)? This should be a quick and easy fix.
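
The memory point in item 3 can be made concrete with a rough estimate of the size of a single N^3 array. This is a sketch only: the grid sizes come from the log earlier in this thread, but the bytes per cell are assumptions (8 for float64, 16 for complex128), and the thread does not say which dtypes Revolver uses or how many such arrays it holds at once. `grid_bytes` is a hypothetical helper:

```python
def grid_bytes(nbins: int, bytes_per_cell: int) -> int:
    """Memory in bytes for one cubic grid with nbins cells per side."""
    return nbins ** 3 * bytes_per_cell


# Grid sizes from the log: 512 bins for reconstruction, 448 for the voxel voids.
# bytes_per_cell values are assumptions, not taken from the Revolver source.
for nbins in (448, 512):
    for label, size in (("float64", 8), ("complex128", 16)):
        gib = grid_bytes(nbins, size) / 2**30
        print(f"nbins={nbins} {label:10s} {gib:.2f} GiB per array")
```

Under these assumptions a 512^3 grid costs 1 GiB per float64 array and 2 GiB per complex128 array, and doubling the number of bins per side multiplies that by eight.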

seshnadathur commented 2 years ago

I should also mention that as soon as I have the time, I am planning a major rewrite of all the Revolver code, including transitioning to pyrecon for the reconstruction and using YAML input in place of the current system. This will change some of the workflow, but hopefully in a positive way, and it should make Revolver more consistent with other DESI code.