dingo-gw / dingo

Dingo: Deep inference for gravitational-wave observations
MIT License

Toy example: Inference crashes #259

Open marvinschmitt opened 1 day ago

marvinschmitt commented 1 day ago

Hi,

In an effort to implement a new generative model backbone in Dingo, I tried to run the toy example in a fresh environment to take it from there. The data generation and training stages run without any issues, but the inference process crashes with an error when instantiating the GenerationNode in dingo.pipe.dag_creator:generate_dag.

Here's my full environment setup and all CLI calls (conda/mamba installation on a Mac M1 FWIW):

conda create --name dingo_new 
conda activate dingo_new
mamba install dingo-gw

cd toy_npe_model
dingo_generate_dataset --settings waveform_dataset_settings.yaml --out_file training_data/waveform_dataset.hdf5
dingo_generate_asd_dataset --settings_file asd_dataset_settings.yaml --data_dir training_data/asd_dataset

dingo_train --settings_file train_settings.yaml --train_dir training
dingo_pipe GW150914.ini

The full error trace:

(dingo_new) marvin@marvin-work toy_npe_model % dingo_pipe GW150914.ini                                             

09:53 dingo_pipe INFO    : Loading dingo model from training/model_latest.pt in order to access settings.
/Users/marvin/miniforge3/envs/dingo_new/lib/python3.11/site-packages/dingo/core/models/posterior_model.py:256: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  d = torch.load(model_filename, map_location=device)
09:53 dingo_pipe INFO    : Setting analysis request_cpus_importance_sampling = 2
09:53 dingo_pipe INFO    : PSD duration set to 512.0s, 128x the duration 4.0s
09:53 dingo_pipe INFO    : Setting segment trigger-times [1126259462.4]
Traceback (most recent call last):
  File "/Users/marvin/miniforge3/envs/dingo_new/bin/dingo_pipe", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/marvin/miniforge3/envs/dingo_new/lib/python3.11/site-packages/dingo/pipe/main.py", line 338, in main
    generate_dag(inputs, model_args)
  File "/Users/marvin/miniforge3/envs/dingo_new/lib/python3.11/site-packages/dingo/pipe/dag_creator.py", line 67, in generate_dag
    generation_node = GenerationNode(inputs, **kwargs)
                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/marvin/miniforge3/envs/dingo_new/lib/python3.11/site-packages/dingo/pipe/nodes/generation_node.py", line 12, in __init__
    super().__init__(inputs, **kwargs)
  File "/Users/marvin/miniforge3/envs/dingo_new/lib/python3.11/site-packages/bilby_pipe/job_creation/nodes/generation_node.py", line 28, in __init__
    if not inputs.osg and inputs.generation_pool == "igwn-pool":
                          ^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'MainInput' object has no attribute 'generation_pool'
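The last frame of the trace suggests that the parent class in bilby_pipe reads an attribute that the inputs object passed in by Dingo does not define. The sketch below reproduces that failure mode in isolation. `MainInputStandIn` and both helper functions are hypothetical names made up for illustration; this is not the code of Dingo or bilby_pipe, and the `getattr` guard is only one possible way to tolerate a missing attribute, not necessarily how the project addresses it.

```python
class MainInputStandIn:
    """Hypothetical stand-in for an inputs object built against an older interface."""

    def __init__(self):
        # Only `osg` is defined; `generation_pool` is deliberately missing.
        self.osg = False


def uses_igwn_pool(inputs):
    # Same shape as the failing line in the traceback: direct attribute access.
    return not inputs.osg and inputs.generation_pool == "igwn-pool"


def uses_igwn_pool_guarded(inputs):
    # getattr with a default avoids the AttributeError when the attribute is absent.
    return not inputs.osg and getattr(inputs, "generation_pool", None) == "igwn-pool"


inputs = MainInputStandIn()

try:
    uses_igwn_pool(inputs)
    raised = False
    message = ""
except AttributeError as err:
    raised = True
    message = str(err)

guarded_result = uses_igwn_pool_guarded(inputs)
print(raised, message)
print(guarded_result)
```

Because `inputs.osg` is false, the second operand is evaluated and the unguarded version raises, which matches the shape of the error above.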

I'd appreciate it if you could look into this. Thanks!

Best, Marvin

stephengreen commented 1 day ago

Hi Marvin,

I believe this is fixed in #257. I have now made a new release v0.6.1 that incorporates this PR. This release is on PyPI, but it will take a few hours to appear on conda-forge. The problem is that when the dependency bilby_pipe gets updated, it often breaks our inference code, and we have to make an update to Dingo to fix it.

Please let me know if this fixes the issue.
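To confirm which versions actually ended up in the environment after updating, a small check along these lines may help. It is a minimal sketch using only the standard library; the distribution names `dingo-gw` and `bilby_pipe` are taken from this thread and are assumed to match the names the packages were installed under, and `installed_versions` is a hypothetical helper.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The distribution is not installed in the active environment.
            found[name] = None
    return found


# Distribution names as mentioned in this thread (assumed, not verified here).
report = installed_versions(["dingo-gw", "bilby_pipe"])
for name, ver in report.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```

Running this in the activated environment before and after the update shows whether the new release has been picked up.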

Also note that we will very soon update Dingo with a refactoring of the PosteriorModel class, including FMPE. It may be easier to adapt this when implementing your new backbone.

Best regards, Stephen

marvinschmitt commented 1 day ago

Hi Stephen,

Great, thanks a lot for your prompt reply, and for shipping the new release right away. I'll let you know whether it fixes the issue when conda-forge updates (conda installs are less error-prone on my Mac).

> Also note that we will very soon update Dingo with a refactoring of the PosteriorModel class, including FMPE. It may be easier to adapt this when implementing your new backbone.

Thanks for the heads-up! After talking to @max-dax earlier this week, I'm building my implementation on the FMPE branch and will adjust it to your upcoming refactor once that's released. I can't delay my initial implementation because I need a working version by the NeurIPS camera-ready deadline at the end of October :)

Thanks again, and let's stay in touch!

Best, Marvin