Closed haukekoehn closed 11 months ago
@haukekoehn I suspect this is a Python version issue. Are you on 3.11 perhaps?
@mcoughlin No, Python 3.8.
@haukekoehn does moving to 3.9 make any difference?
no
@sahiljhawar @bfhealy can anyone reproduce?
@mcoughlin I am unaware of GW+ analyses. Can't help with this.
However, @haukekoehn can you check the pickle file by loading it and checking its keys? \x0a is an LF character, if it helps.
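A minimal sketch of the check suggested above, assuming the top-level pickled object is a dict (the thread does not confirm this). The helper name `summarize_pickle` is hypothetical and not part of NMMA. Note that `pickle.load` can execute arbitrary code, so this should only be run on files from a trusted source.

```python
import pickle


def summarize_pickle(path):
    """Load a pickle file and return (type name, sorted top-level keys).

    The key list is empty when the top-level object is not a dict.
    """
    with open(path, "rb") as fh:
        obj = pickle.load(fh)  # only safe for trusted files
    keys = sorted(str(k) for k in obj) if isinstance(obj, dict) else []
    return type(obj).__name__, keys
```

Calling `summarize_pickle("Bu2022Ye.pkl")` on a healthy file should return without raising; an `UnpicklingError` at this point suggests the file itself is not a valid pickle.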
@sahiljhawar I was more asking you to confirm whether you can load the file.
@mcoughlin Ahh okay. Let me check.
If Bu2022Ye.pkl is the same as Bu2022Ye_tf.pkl from Zenodo: 8164628, then yes, I am able to load it using numpy.
@haukekoehn Can you download that file separately and check the same?
I am currently trying to see whether it works on another machine.
But yes, downloading Bu2022Ye_tf.pkl and reading it in works.
So I want to start a GW+KN+GRB inference from scratch and only provide an empty directory for the svd models. Then it does its thing and starts downloading Bu2022Ye.pkl (no tf!). I just looked into this file and it actually looks like an HTML file, not a binary file.
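A quick way to tell an HTML error page from a real pickle is to look at the first bytes of the download. The sketch below is illustrative only; `sniff_download` and `unpickle_error` are hypothetical helper names, not NMMA functions. It relies on the fact that pickles written with protocol 2 or newer start with the byte 0x80, and it shows that feeding the unpickler a payload that begins with a newline raises `pickle.UnpicklingError`, which is consistent with the '\x0a' load key reported in this issue.

```python
import pickle


def sniff_download(path, nbytes=64):
    """Classify a file as 'pickle', 'html', or 'unknown' from its first bytes."""
    with open(path, "rb") as fh:
        head = fh.read(nbytes)
    if head[:1] == b"\x80":  # PROTO opcode: pickle protocol 2 or newer
        return "pickle"
    text = head.lstrip().lower()
    if text.startswith((b"<!doctype", b"<html")):
        return "html"
    return "unknown"


def unpickle_error(payload):
    """Return the UnpicklingError message for a bytes payload, or None on success."""
    try:
        pickle.loads(payload)
    except pickle.UnpicklingError as exc:
        return str(exc)
    return None
```

If `sniff_download` reports `"html"`, the download most likely returned a web page (for example an error or redirect page) rather than the model file, and re-downloading or copying the file manually is the next thing to try.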
Ok so that means maybe the download is failing... maybe lack of Zenodo access for that machine? What's the error?
@haukekoehn In that case you can get the model from /home/enlil/jhawar/svdmodels and start your run with local files.
alright, thanks, i will try!
On the other machine it fails even earlier and raises a ValueError: model_name Bu2022Ye_tf not found in models list
I think you need the models.yaml file as well.
I now managed to get it running (or at least initializing the live points :D), but interestingly enough, it only downloaded the models when it was run with a single process, i.e. only nmma-analysis, not srun nmma-analysis. Then the models appeared in the directory and I could restart with srun -n 1024 nmma-analysis.
@tsunhopang maybe we need a line to check whether the model is there and indicate to first run the download if it isn't and then the user can rerun?
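One possible shape for such a check, as a sketch only. It assumes a local layout of `{model-name}.pkl` in the svd directory plus a model-specific subdirectory holding `{filter}.pkl` or `{filter}.h5`; the function names, arguments, and the error message are all hypothetical and not taken from the NMMA code.

```python
from pathlib import Path


def missing_model_files(svd_path, model_name, filters, suffixes=(".pkl", ".h5")):
    """Return the expected model files (as strings) that are absent on disk."""
    root = Path(svd_path)
    missing = []
    top_level = root / f"{model_name}.pkl"
    if not top_level.is_file():
        missing.append(str(top_level))
    for filt in filters:
        # A filter counts as present if any of the accepted suffixes exists.
        candidates = [root / model_name / f"{filt}{s}" for s in suffixes]
        if not any(c.is_file() for c in candidates):
            missing.append(str(root / model_name / filt))
    return missing


def require_models(svd_path, model_name, filters):
    """Raise a descriptive error before a parallel run if files are missing."""
    missing = missing_model_files(svd_path, model_name, filters)
    if missing:
        raise FileNotFoundError(
            "Missing model files: " + ", ".join(missing)
            + ". Run once with a single process to download them, then rerun."
        )
```

Running a check like this before sampling starts would turn the silent failure under srun into an explicit instruction to do the single-process download first.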
@mcoughlin @bfhealy Part of the problem arises from here https://github.com/nuclear-multimessenger-astronomy/nmma/blob/4ba91655a509bce6286574d3c5ad4b764788686c/nmma/em/model.py#L251
Even the tests with Bu2022Ye fail (here)
@sahiljhawar @bfhealy Maybe it's time to deprecate that pathway?
So now, we just need {model-name}.pkl and {model-name_filter}.pkl or {model-name_filter}.h5, right?
@sahiljhawar That's correct for files on Zenodo! When they get downloaded via NMMA code, the filter-specific files get renamed to {filter}.pkl or {filter}.h5 and put in a model-specific directory.
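The renaming described above can be pictured with a small sketch. It assumes the Zenodo file name is the model name, an underscore, then the filter name and extension, and that the local directory is named after the model; the helper `local_model_path` and the filter name used in the example are purely illustrative and are not taken from the NMMA code.

```python
from pathlib import PurePosixPath


def local_model_path(zenodo_name, model_name):
    """Map a Zenodo file name to its assumed location under the svd directory."""
    if zenodo_name == f"{model_name}.pkl":
        # The top-level SVD file keeps its name.
        return PurePosixPath(zenodo_name)
    prefix = f"{model_name}_"
    if not zenodo_name.startswith(prefix):
        raise ValueError(f"{zenodo_name!r} does not belong to model {model_name!r}")
    # Strip the model prefix and place the file in the model-specific directory.
    return PurePosixPath(model_name) / zenodo_name[len(prefix):]
```

For example, under these assumptions `local_model_path("Bu2022Ye_tf_sdssu.h5", "Bu2022Ye_tf")` gives `Bu2022Ye_tf/sdssu.h5`.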
@bfhealy For that, in one of the previous meetings it was finalised that a mirror of the Zenodo models will be stored on the Potsdam server for faster and easier access. This way the directory structure can be preserved.
But in any case, the support for *_mag.pkl and *_lbol.pkl should be removed, since we do not have those files anymore.
Also, do the model.pkl and filter.h5 (stored inside the model folder) serve different purposes? Or do they contain the same data with a different file structure?
They serve different purposes - each model.pkl file contains basic SVD information while the filter.h5 files contain the trained neural network weights/biases mapping merger parameters to light curve eigenvalues (in tensorflow format).
Describe the bug: When loading the svd models from Bu2022Ye.pkl, the unpickling fails and raises an error: pickle.UnpicklingError: invalid load key, '\x0a'. This happens when different protocols are used for pickling and unpickling.
To Reproduce: Start a multimessenger analysis with GW, KN and GRB, using the current NMMA commit.