exgalsky / xgfield

Generation of mocks from a field representation of LSS on the observer's past light cone.
GNU General Public License v3.0

High level interface and refactor #5

Closed marcelo-alvarez closed 1 year ago

marcelo-alvarez commented 1 year ago

Summary:

I have tested with a single GPU (there is an in-place install of this branch currently in the xgsmenv environment loaded below) with:

# on Perlmutter at NERSC with one GPU node:
% module use /global/cfs/cdirs/mp107/exgal/env/xgsmenv/20231013-0.0.0/modulefiles/
% module load xgsmenv
% salloc -N 1 -C gpu
% export LPT_DISPLACEMENTS_PATH=/pscratch/sd/m/malvarez/websky-displacements/
% xgfield "field test-768" --N 768 --Nside 1024

which produces a kappa map (note that xgfield was not preceded by srun).

Still to do before merging is testing on multiple nodes and processors.

@1cosmologist please review.

marcelo-alvarez commented 1 year ago

@1cosmologist note that the most recent commit has not been tested. It is a WIP towards minimal changes to allow sharded displacements to be passed in on GPUs, in place. Commit 7dfb3e8 should be stable for testing of previous changes.

marcelo-alvarez commented 1 year ago

@1cosmologist I have added a commit that allows sharded displacements to be passed in externally via an lpt.cube object, and tested it in serial on a Perlmutter login node with a CPU (see the README for more details on the test).

Perlmutter non-login nodes were down at the time of writing, and I will test for GPUs on multiple nodes once they are back up. I don't anticipate needing to add any additional features in this PR, so we can focus on review and testing now.

marcelo-alvarez commented 1 year ago

I have tested sharded displacements passed in externally via an lpt.cube object on multiple GPUs. The resulting kappa maps agree to near rounding error with those from the previous method (reading the displacements directly from files), e.g.:

% module use /global/cfs/cdirs/mp107/exgal/env/xgsmenv/20231013-0.0.0/modulefiles/
% module load xgsmenv
% export LPT_DISPLACEMENTS_PATH=/pscratch/sd/m/malvarez/websky-displacements/
% mkdir -p output 

% script=xgfield/scripts/cube_example.py
% cubecoms="python $script"
% filecoms="xgfield"
% filecomp="srun -n 4 --gpus-per-task=1 $filecoms"
% cubecomp="srun -n 4 --gpus-per-task=1 $cubecoms"
%
% $filecoms fieldsky-test-files --no-mpi # displacements from external file processed in serial
% $cubecoms fieldsky-test-cubes   serial # displacements from external cube processed in serial
% $filecomp fieldsky-test-filep          # displacements from external file processed in parallel
% $cubecomp fieldsky-test-cubep parallel # displacements from external cube processed in parallel

% stat -c "%n,%s" output/kappa_fieldsky-test-* | column -t -s,
output/kappa_fieldsky-test-cubep-768_nside-1024.fits  100670400
output/kappa_fieldsky-test-cubes-768_nside-1024.fits  100670400
output/kappa_fieldsky-test-filep-768_nside-1024.fits  100670400
output/kappa_fieldsky-test-files-768_nside-1024.fits  100670400
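
The identical sizes are what one would expect if each file holds a single full-sky HEALPix map at Nside = 1024. A minimal sketch of that arithmetic follows. The 8 bytes per pixel and the two 2880-byte header blocks are assumptions that happen to reproduce the number above; they are not stated in this PR, and `expected_fits_size` is a hypothetical helper, not part of xgfield.

```python
FITS_BLOCK = 2880  # FITS headers and data are padded to 2880-byte blocks


def expected_fits_size(nside, bytes_per_pixel=8, header_blocks=2):
    """Estimate the size in bytes of a FITS file holding one HEALPix map.

    Assumptions (not taken from the PR): one value of `bytes_per_pixel`
    bytes per pixel, and `header_blocks` header blocks in total
    (e.g. one primary header plus one table-extension header).
    """
    npix = 12 * nside**2                    # number of HEALPix pixels
    data_bytes = npix * bytes_per_pixel
    data_blocks = -(-data_bytes // FITS_BLOCK)  # integer ceiling division
    return (header_blocks + data_blocks) * FITS_BLOCK


print(expected_fits_size(1024))  # 100670400
```

Under these assumptions the estimate matches the 100670400 bytes reported by `stat`. Equal sizes only show that the four maps share the same layout; they say nothing about whether the pixel values agree.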

% fitsdiff -q -a 1e-5 output/kappa_fieldsky-test-files-768_nside-1024.fits output/kappa_fieldsky-test-cubep-768_nside-1024.fits ; echo $?
0
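
The exit status of 0 means that fitsdiff found no differences larger than the absolute tolerance given with `-a`. As a rough analogue of the pixel comparison only (it ignores FITS headers and is not how fitsdiff is implemented), the check could be sketched in NumPy as below. The function name `maps_agree` and the synthetic arrays are hypothetical stand-ins for the kappa maps, which are not read from disk here.

```python
import numpy as np


def maps_agree(map_a, map_b, atol=1e-5):
    """Return True if two maps have the same shape and every pixel
    differs by at most `atol` in absolute value."""
    a = np.asarray(map_a, dtype=np.float64)
    b = np.asarray(map_b, dtype=np.float64)
    if a.shape != b.shape:
        return False
    return bool(np.all(np.abs(a - b) <= atol))


# Synthetic stand-ins for two kappa maps (Nside = 16 to keep this small).
rng = np.random.default_rng(0)
kappa_files = rng.normal(scale=0.01, size=12 * 16**2)
kappa_cubes = kappa_files + 1e-7   # small offset, well below the tolerance

print(maps_agree(kappa_files, kappa_cubes))         # True
print(maps_agree(kappa_files, kappa_files + 1e-3))  # False
```

An absolute tolerance is only meaningful relative to the typical amplitude of the map, so it is worth checking the tolerance against the map's RMS when reusing a check like this.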

This was the final validation test for this PR. Thanks @1cosmologist for your help on this. Merging now.