Scaling up differentiable PM simulations on Perlmutter for Cosmological Forward Modeling
Improving the framework for running differentiable lensing and clustering simulations so that it scales to larger volumes (i.e. multi-GPU parallelisation; pretty technical), and adding currently missing components (models for intrinsic alignments (IA), photometric redshifts (pz), galaxy bias, etc.; new-contributor friendly!)
Goals and deliverable
[ ] Implement multi-GPU distribution of the JaxPM N-body model (single node)
[ ] [Stretch goal] Implement multi-GPU distribution across multiple nodes!
[ ] Merge missing modeling components (IA, pz, bias) into the main branch
[ ] Implement validation tests for the simulated fields
Resources and skills needed
This sprint calls for two different skill sets:
For the multi-GPU parallelisation work (!!! very technical !!!): Enthusiasm for parallel computing :-) Ideally, previous experience with distributed ML, with the NCCL primitives underlying TensorFlow/JAX distributed collectives, or with JAX pmap or xmap.
For the model extension work (adding missing components like IA): Just enthusiasm! Ideally some existing knowledge of the particular effect to be implemented, but this is also a good opportunity to learn!
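For anyone who has not used JAX collectives before, here is a minimal sketch of the kind of pmap pattern the parallelisation work relies on. It is not JaxPM code: the function name `overdensity_slab`, the axis name "dev", and the mesh sizes are all made up for illustration. A density mesh is split into slabs along its first axis, one slab per device, and an all-reduce (`jax.lax.psum`) gives every device the global mean.

```python
import jax
import jax.numpy as jnp

# Number of devices visible to this process (1 on a CPU-only machine).
n_dev = jax.local_device_count()

def overdensity_slab(slab):
    """Hypothetical helper: turn one device's density slab into overdensity."""
    # Sum over the local slab, then all-reduce across every device.
    local_sum = jnp.sum(slab)
    global_sum = jax.lax.psum(local_sum, axis_name="dev")
    # psum of the constant 1 returns the number of participating devices.
    n_cells = slab.size * jax.lax.psum(1, axis_name="dev")
    mean = global_sum / n_cells
    return slab / mean - 1.0

# Toy density mesh (illustrative sizes), split into one slab per device.
mesh_size = 8 * n_dev
key = jax.random.PRNGKey(0)
density = 1.0 + 0.1 * jax.random.uniform(key, (mesh_size, 8, 8))
slabs = density.reshape(n_dev, mesh_size // n_dev, 8, 8)

# pmap maps the function over the leading axis, one slab per device.
delta = jax.pmap(overdensity_slab, axis_name="dev")(slabs)
```

The same script runs unchanged on one device or several, which makes it easy to try on a laptop before moving to a GPU node. A distributed N-body step would need more than a reduction (for example, exchanging data between neighbouring slabs), which this sketch does not attempt.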
Contacts: @EiffL
Day/Time: All week
Main communication channel: #desc-sprint-bayescosmo
GitHub repo: https://github.com/LSSTDESC/bayesian-pipelines-cosmology
Zoom room (if applicable): DESC #19
Detailed description
(coming soon)