DOI not retrievable with too many processes

haukekoehn commented 1 year ago

Describe the bug: When starting a multimessenger analysis with nmma-analysis using many processes (>100), the function get_latest_zenodo_doi in nmma/utils/models.py is called by every process. Instead of returning the latest Zenodo DOI, it fails with a 429 Too Many Requests error.

To Reproduce: srun -n 128 nmma-analysis --args

Expected behavior: The latest DOI should simply be returned from Zenodo. In principle, only one process needs to do this, just as only one process should download everything. At the moment, every process downloads everything.

Platform information:

NMMA version: 0.1.0

Additional context: In my recent pull request, I added a workaround to load_models_list that sets the DOI to PERMANENT_DOI if get_latest_zenodo_doi fails to retrieve it.
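
For illustration, the fallback could look roughly like the sketch below. This is not the code from the pull request: resolve_doi is a hypothetical helper, the PERMANENT_DOI value is a placeholder, and the fetch function is passed in as an argument so the example does not depend on the real signature of get_latest_zenodo_doi.

```python
import logging

logger = logging.getLogger(__name__)

# Placeholder value for this sketch; the real constant is defined in nmma.
PERMANENT_DOI = "0000000"


def resolve_doi(fetch_latest, fallback=PERMANENT_DOI):
    """Return the DOI from fetch_latest(), or fallback if the lookup fails.

    fetch_latest is any zero-argument callable (hypothetical stand-in for
    the Zenodo lookup). A failure is either an exception, such as an HTTP
    429 error, or an empty result.
    """
    try:
        doi = fetch_latest()
    except Exception as exc:
        logger.warning("Could not retrieve latest DOI (%s); using fallback", exc)
        return fallback
    return doi or fallback
```

Note that this only hides the symptom: every process still sends a request to Zenodo, it just no longer crashes when the request is rejected.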

mcoughlin commented 1 year ago

@haukekoehn Yeah, we saw this early on, and our recommendation is to run the download directly first and then trigger the MPI script. It's tricky to work around.

haukekoehn commented 1 year ago

Perhaps the easiest fix would be to restrict the initialization of the AnalysisRun instance in nmma-analysis to a single process (which, as I understand it, would be entirely sufficient).
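
A minimal sketch of the single-process idea, assuming the ranks share an mpi4py-style communicator (an object with Get_rank() and bcast(), such as mpi4py's MPI.COMM_WORLD). Whether nmma-analysis has such a communicator available at that point is an assumption; fetch_on_root and SerialComm are hypothetical names used only for this example.

```python
class SerialComm:
    """Single-process stand-in exposing the two communicator methods used
    below, so the sketch also runs without MPI."""

    def Get_rank(self):
        return 0

    def bcast(self, obj, root=0):
        return obj


def fetch_on_root(comm, fetch, fallback=None, root=0):
    """Call fetch() on the root rank only and share the result with all ranks.

    Errors on the root are caught and replaced by fallback, so the other
    ranks, which are waiting in bcast, are never left hanging.
    """
    value = None
    if comm.Get_rank() == root:
        try:
            value = fetch()
        except Exception:
            value = fallback
    # Collective call: every rank must reach this line.
    return comm.bcast(value, root=root)


# Serial usage; under MPI, pass the shared communicator instead of SerialComm().
doi = fetch_on_root(SerialComm(), lambda: "example-doi")
```

With this pattern only one request is sent regardless of the number of ranks. The same guard could wrap the model download, as long as the other ranks wait for the root to finish before they read the files.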

nuclear-multimessenger-astronomy / nmma, issue #263