nuclear-multimessenger-astronomy / nmma

A pythonic library for probing nuclear physics and cosmology with multimessenger analysis
https://nuclear-multimessenger-astronomy.github.io/nmma/
GNU General Public License v3.0

DOI not retrievable with too many processes #263

Open haukekoehn opened 1 year ago

haukekoehn commented 1 year ago

Describe the bug When starting a multimessenger analysis with nmma-analysis using many processes (>100), the function get_latest_zenodo_doi in nmma/utils/models.py is called by every process. Instead of returning the latest Zenodo DOI, it fails with a 429 Too Many Requests error.

To Reproduce srun -n 128 nmma-analysis --args

Expected behavior The function should simply return the latest DOI from Zenodo. In principle, only one process is needed for this, just as only one process should download everything. Instead, every process currently downloads everything.

Platform information:

Additional context In my recent pull request, I added a workaround to load_models_list that sets the DOI to PERMANENT_DOI if get_latest_zenodo_doi fails to retrieve it.
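
The fallback idea could look roughly like the sketch below. This is not the code from the pull request: `fetch_latest_doi`, `resolve_doi`, and the placeholder DOI string are hypothetical stand-ins for illustration, and only the names get_latest_zenodo_doi, load_models_list, and PERMANENT_DOI come from nmma itself.

```python
# Placeholder value (assumption); the real PERMANENT_DOI is defined in nmma
# and is not reproduced here.
PERMANENT_DOI = "placeholder-permanent-doi"


def fetch_latest_doi():
    """Hypothetical stand-in for get_latest_zenodo_doi; simulates a 429 failure."""
    raise RuntimeError("429 Too Many Requests")


def resolve_doi(fetch=fetch_latest_doi, fallback=PERMANENT_DOI):
    """Return the DOI from fetch(), or the fallback if the lookup fails."""
    try:
        return fetch()
    except Exception as err:
        # Any lookup failure (e.g. rate limiting) falls back to the fixed DOI.
        print(f"Could not retrieve latest DOI ({err}); using fallback.")
        return fallback


doi = resolve_doi()
```

Note that this only hides the error: every process still sends a request to Zenodo, so the rate limit is still hit.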

mcoughlin commented 1 year ago

@haukekoehn yeah we saw this early on and our recommendation is to just run the download directly first and then to trigger the MPI script. It's tricky to work around.

haukekoehn commented 1 year ago

Perhaps the easiest fix would be to restrict the initialization of the AnalysisRun instance in nmma-analysis to a single process (which would be perfectly efficient, as far as I understand it).
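
A minimal sketch of the "only one process" idea, assuming an mpi4py-style communicator is available: rank 0 does the lookup and broadcasts the result to the other ranks. `resolve_once` and `SingleProcessComm` are hypothetical names for illustration and not part of nmma. The sketch uses only the communicator methods `Get_rank` and `bcast`, so it runs here without MPI via a stand-in class.

```python
def resolve_once(comm, fetch):
    """Call fetch() on rank 0 only and broadcast the result to all ranks."""
    value = fetch() if comm.Get_rank() == 0 else None
    # bcast returns the root's object on every rank.
    return comm.bcast(value, root=0)


class SingleProcessComm:
    """Minimal stand-in for a communicator so the sketch runs without MPI."""

    def Get_rank(self):
        return 0

    def bcast(self, obj, root=0):
        return obj


# "example-doi" is a dummy value, not a real DOI.
doi = resolve_once(SingleProcessComm(), lambda: "example-doi")
```

In a real MPI run, `comm` would be `MPI.COMM_WORLD` from mpi4py and `fetch` would be the DOI lookup. If `fetch` raises on rank 0, the other ranks would wait in `bcast`, so the lookup should handle its own failures (for example with a fallback DOI) before broadcasting.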