lazaratan / dyn-gfn

DynGFN: Bayesian Dynamic Causal Discovery using Generative Flow Networks
MIT License

Error about running code #8

Closed oyxy2019 closed 1 year ago

oyxy2019 commented 1 year ago

Hello, when I run the code with python train.py trainer.gpus=0, the terminal output is:

Could not override 'trainer.gpus'.
To append to your config use +trainer.gpus=0
Key 'gpus' is not in struct
    full_key: trainer.gpus
    object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
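For context, the message above comes from Hydra's strict config handling: a plain key=value override may only change a key that already exists, while a + prefix appends a new key. The following is a toy, standard-library-only imitation of that rule, not Hydra's actual implementation; apply_override and the sample config are hypothetical names used only for illustration.

```python
# Toy imitation of Hydra-style overrides on a nested dict.
# NOT Hydra's implementation; function name and config are hypothetical.

def apply_override(cfg: dict, override: str) -> dict:
    """Apply 'a.b=value' (key must exist) or '+a.b=value' (append a new key)."""
    append = override.startswith("+")
    body = override[1:] if append else override
    dotted_key, _, raw_value = body.partition("=")
    *parents, leaf = dotted_key.split(".")

    node = cfg
    for name in parents:
        node = node[name]  # walk down to the parent dict

    if not append and leaf not in node:
        raise KeyError(
            f"Could not override '{dotted_key}'. "
            f"To append to your config use +{dotted_key}={raw_value}"
        )
    node[leaf] = raw_value  # values are kept as strings in this sketch
    return cfg


cfg = {"trainer": {"max_epochs": "10"}}  # hypothetical config with no 'gpus' key

try:
    apply_override(cfg, "trainer.gpus=0")  # rejected: key is not in the config
except KeyError as err:
    print(err)

apply_override(cfg, "+trainer.gpus=0")  # accepted: '+' appends the key
print(cfg["trainer"])
```

In this sketch the first call is rejected and the second succeeds, which mirrors why +trainer.gpus=0 gets past the override check.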

So I tried running python train.py +trainer.gpus=0, and the terminal output is:

...
[2023-03-07 17:35:43,726][src.training_pipeline][INFO] - Instantiating datamodule
[2023-03-07 17:35:43,938][src.datamodules.simulated_datamodule][INFO] - Loading data from /home/itcast/Desktop/dyn-gfn-main/dyn-gfn-main/data/UnidentifiableSimulatedVelocityDataModule/linear-7-1000-100-0.0-0.99-0.05-0.0-{}-0-0.pt
A [[-0.01  0.    0.    0.    0.   -0.   -0.  ]
 [ 0.   -0.01  0.    0.    0.    0.    0.  ]
 [ 0.    0.   -0.01 -0.    0.    0.   -0.  ]
 [ 0.    0.   -0.   -0.01 -0.   -0.   -0.  ]
 [ 0.    0.    0.   -0.   -0.01 -0.   -0.  ]
 [-0.    0.    0.   -0.   -0.   -0.01 -0.  ]
 [-0.    0.   -0.   -0.   -0.   -0.   -0.01]]
Error executing job with overrides: ['+trainer.gpus=0']
Traceback (most recent call last):
  File "train.py", line 22, in main
    return train(config)
  File "/home/itcast/Desktop/dyn-gfn-main/dyn-gfn-main/src/training_pipeline.py", line 48, in train
    if config.model["c"] == "src.models.velocity_module.TrueGraphLitModule":
omegaconf.errors.ConfigKeyError: Key 'c' is not in struct
    full_key: model.c
    object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

In addition, I want to know how to use my own dataset.

Looking forward to your reply! Thanks!

oyxy2019 commented 1 year ago

I use Ubuntu 18.04, PyCharm, conda, and Python 3.7. I also tried another platform, Google Colab, and got the same error.

oyxy2019 commented 1 year ago

After setting HYDRA_FULL_ERROR=1, the output is:

(notebook) itcast@ubuntu:~/Desktop/dyn-gfn-main/dyn-gfn-main$ python train.py trainer.gpus=0
[2023-03-07 18:01:18,267][torch.distributed.nn.jit.instantiator][INFO] - Created a temporary directory at /tmp/tmp7xmccy95
[2023-03-07 18:01:18,268][torch.distributed.nn.jit.instantiator][INFO] - Writing /tmp/tmp7xmccy95/_remote_module_non_scriptable.py
[2023-03-07 18:01:18,314][src.utils][INFO] - Printing config tree with Rich!
CONFIG
├── datamodule
...
[2023-03-07 18:01:18,367][src.training_pipeline][INFO] - Instantiating datamodule
[2023-03-07 18:01:18,445][src.datamodules.simulated_datamodule][INFO] - Loading data from /home/itcast/Desktop/dyn-gfn-main/dyn-gfn-main/data/UnidentifiableSimulatedVelocityDataModule/linear-7-1000-100-0.0-0.99-0.05-0.0-{}-0-0.pt
A [[-0.01  0.    0.    0.    0.   -0.   -0.  ]
 [ 0.   -0.01  0.    0.    0.    0.    0.  ]
 [ 0.    0.   -0.01 -0.    0.    0.   -0.  ]
 [ 0.    0.   -0.   -0.01 -0.   -0.   -0.  ]
 [ 0.    0.    0.   -0.   -0.01 -0.   -0.  ]
 [-0.    0.    0.   -0.   -0.   -0.01 -0.  ]
 [-0.    0.   -0.   -0.   -0.   -0.   -0.01]]
Error executing job with overrides: ['trainer.gpus=0']
Traceback (most recent call last):
  File "train.py", line 26, in <module>
    main()
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/main.py", line 95, in decorated_main
    config_name=config_name,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 396, in _run_hydra
    overrides=overrides,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 453, in _run_app
    lambda: hydra.run(
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
    raise ex
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
    return func()
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 456, in <lambda>
    overrides=overrides,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "train.py", line 22, in main
    return train(config)
  File "/home/itcast/Desktop/dyn-gfn-main/dyn-gfn-main/src/training_pipeline.py", line 48, in train
    if config.model["c"] == "src.models.velocity_module.TrueGraphLitModule":
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 372, in __getitem__
    key=key, value=None, cause=e, type_override=ConfigKeyError
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/base.py", line 237, in _format_and_raise
    type_override=type_override,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/_utils.py", line 819, in format_and_raise
    _raise(ex, cause)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 369, in __getitem__
    return self._get_impl(key=key, default_value=_DEFAULT_MARKER_)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 443, in _get_impl
    key=key, throw_on_missing_key=True, validate_key=validate_key
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/basecontainer.py", line 78, in _get_child
    throw_on_missing_key=throw_on_missing_key,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 475, in _get_node
    self._validate_get(key)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 165, in _validate_get
    key=key, value=value, cause=ConfigAttributeError(msg)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/base.py", line 237, in _format_and_raise
    type_override=type_override,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
omegaconf.errors.ConfigKeyError: Key 'c' is not in struct
    full_key: model.c
    object_type=dict

lazaratan commented 1 year ago

Thanks for bringing this up.

Re item 1 - Running code.

Re item 2 - Using your own dataset.

oyxy2019 commented 1 year ago

Thank you very much for your reply! Re item 1 - Running code: I had already tried running python train.py (without any parameters), but it still fails. I think the problem may be in the config file. The error message is as follows:

Error executing job with overrides: []
Traceback (most recent call last):
  File "train.py", line 22, in main
    return train(config)
  File "/home/itcast/Desktop/dyn-gfn-main/dyn-gfn-main/src/training_pipeline.py", line 48, in train
    if config.model["c"] == "src.models.velocity_module.TrueGraphLitModule":
omegaconf.errors.ConfigKeyError: Key 'c' is not in struct
    full_key: model.c
    object_type=dict

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

I'm sorry, I'm a beginner and know little about Hydra.

oyxy2019 commented 1 year ago

Also, I don't know why I got a different error yesterday:

[2023-03-08 04:06:52,780][src.training_pipeline][INFO] - Instantiating model
Error executing job with overrides: []
Error locating target 'src.models.parallel_energy_gfn_module.PerNodeParallelLinearTrainableCausalGraphGFlowNetModule', see chained exception above.
full_key: model

Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.

lazaratan commented 1 year ago

In "dyn-gfn-main/dyn-gfn-main/src/training_pipeline.py", line 48 it should be:

if config.model["_target_"] == "src.models.velocity_module.TrueGraphLitModule":

"_target_" in the model config YAML specifies which module to use.

As an example, for config/model/linear_tcg.yaml:

_target_: src.models.parallel_energy_gfn_module.ParallelLinearTrainableCausalGraphGFlowNetModule

env_batch_size: 64
eval_batch_size: 5000
uniform_backwards: False
hidden_dim: 64
embed_dim: 128
lr: 0.001
full_posterior_eval: False
energy_freq: 10
loss_fn: "detailed_balance"
arch: "mlp" # options - ['mlp', 'transformer']
confidence: 0.0
n_steps: 0
bias: true
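To illustrate how a "_target_" entry turns into an object, here is a minimal standard-library sketch of the idea: import the module named by the dotted path, fetch the class, and pass the remaining keys as keyword arguments. This is an assumption-laden imitation, not Hydra's instantiate; locate and instantiate are hypothetical helpers, and fractions.Fraction stands in for a model class so the example runs without the repository.

```python
import importlib

# Hypothetical helpers that imitate the idea behind "_target_";
# this is NOT Hydra's implementation.

def locate(path: str):
    """Import 'pkg.module.Name' and return the attribute Name."""
    module_name, _, attr = path.rpartition(".")
    # import_module raises ModuleNotFoundError if the module, or anything
    # it imports at load time, is not installed.
    module = importlib.import_module(module_name)
    return getattr(module, attr)


def instantiate(config: dict, **extra):
    """Build the object named by config['_target_'] from the other keys."""
    kwargs = {k: v for k, v in config.items() if k != "_target_"}
    cls = locate(config["_target_"])
    return cls(**kwargs, **extra)


# Stand-in config: a stdlib class replaces the model class for illustration.
model_cfg = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}

# The corrected check reads the "_target_" key; a key such as "c" does not
# exist in the config, so looking it up fails.
if model_cfg["_target_"] == "fractions.Fraction":
    obj = instantiate(model_cfg)
    print(obj)
```

Because the class is imported by name at run time, a missing dependency inside the target module only shows up when the model is instantiated.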

oyxy2019 commented 1 year ago

Thank you for your reply and I see you updated the code.

So it looks like there are seven models:

_target_: src.models.velocity_module.LinearLitModule
_target_: src.models.node_module.HyperNodeLitModule
_target_: src.models.velocity_module.HyperLitModule
_target_: src.models.parallel_energy_gfn_module.ParallelHyperTrainableCausalGraphGFlowNetModule
_target_: src.models.parallel_energy_gfn_module.ParallelLinearTrainableCausalGraphGFlowNetModule
_target_: src.models.parallel_energy_gfn_module.PerNodeParallelHyperTrainableCausalGraphGFlowNetModule
_target_: src.models.parallel_energy_gfn_module.PerNodeParallelLinearTrainableCausalGraphGFlowNetModule

I want to know the difference between these seven models. Is it mentioned in the paper DynGFN: Bayesian Dynamic Causal Discovery using Generative Flow Networks(2023)?

oyxy2019 commented 1 year ago

And the second problem is:

Now, the default configuration is

But when I run python train.py, the terminal still reports an error. The error message is as follows:

[2023-03-20 06:40:09,418][src.training_pipeline][INFO] - Instantiating model
Error executing job with overrides: []
Traceback (most recent call last):
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 639, in _locate
    obj = getattr(obj, part)
AttributeError: module 'src.models' has no attribute 'parallel_energy_gfn_module'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 645, in _locate
    obj = import_module(mod)
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 967, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 728, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/newsgrid/linyy/dyn-gfn/dyn-gfn-main/src/models/parallel_energy_gfn_module.py", line 26, in <module>
    from .components.energy import (
  File "/home/newsgrid/linyy/dyn-gfn/dyn-gfn-main/src/models/components/energy.py", line 7, in <module>
    from torchdyn.core import NeuralODE
ModuleNotFoundError: No module named 'torchdyn'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 134, in _resolve_target
    target = _locate(target)
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 651, in _locate
    ) from exc_import
ImportError: Error loading 'src.models.parallel_energy_gfn_module.PerNodeParallelLinearTrainableCausalGraphGFlowNetModule':
ModuleNotFoundError("No module named 'torchdyn'")
Are you sure that 'parallel_energy_gfn_module' is importable from module 'src.models'?

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "train.py", line 26, in <module>
    main()
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/main.py", line 95, in decorated_main
    config_name=config_name,
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 396, in _run_hydra
    overrides=overrides,
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 453, in _run_app
    lambda: hydra.run(
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 216, in run_and_report
    raise ex
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 213, in run_and_report
    return func()
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 456, in <lambda>
    overrides=overrides,
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 132, in run
    _ = ret.return_value
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/core/utils.py", line 260, in return_value
    raise self._return_value
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/core/utils.py", line 186, in run_job
    ret.return_value = task_function(task_cfg)
  File "train.py", line 22, in main
    return train(config)
  File "/home/newsgrid/linyy/dyn-gfn/dyn-gfn-main/src/training_pipeline.py", line 90, in train
    config.model, dm_conf=config.datamodule, recursive=False
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 223, in instantiate
    config, *args, recursive=recursive, convert=convert, partial=partial
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 325, in instantiate_node
    target = _resolve_target(node.get(_Keys.TARGET), full_key)
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 139, in _resolve_target
    raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error locating target 'src.models.parallel_energy_gfn_module.PerNodeParallelLinearTrainableCausalGraphGFlowNetModule', see chained exception above.
full_key: model

Looking forward to your reply! Thanks a lot!
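The output above is a chain of exceptions linked by "The above exception was the direct cause of the following exception", so the last message printed is only the outermost wrapper. A small sketch of how the underlying error can be pulled out by following __cause__ links; root_cause and load_model are hypothetical names, and the raised errors only mimic the messages in this thread.

```python
def root_cause(exc: BaseException) -> BaseException:
    """Follow explicit 'raise ... from ...' links back to the first error."""
    while exc.__cause__ is not None:
        exc = exc.__cause__
    return exc


def load_model():
    """Hypothetical stand-in that re-raises twice, like the trace above."""
    try:
        try:
            raise ModuleNotFoundError("No module named 'torchdyn'")
        except ModuleNotFoundError as exc_import:
            raise ImportError("Error loading target") from exc_import
    except ImportError as e:
        raise RuntimeError("Error locating target") from e


try:
    load_model()
except RuntimeError as err:
    cause = root_cause(err)
    print(type(cause).__name__, cause)
```

Only __cause__ is followed here; the implicit __context__ link ("During handling of the above exception, another exception occurred") is ignored on purpose, since in this trace it points at the less informative AttributeError.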

lazaratan commented 1 year ago

I want to know the difference between these seven models. Is it mentioned in the paper DynGFN: Bayesian Dynamic Causal Discovery using Generative Flow Networks(2023)?

Yes, in general the paper outlines the differences between the models.

... when I run python train.py, the terminal still reports an error. The error message is as follows:

This may be a torchdyn version issue. See #9.

oyxy2019 commented 1 year ago

Thanks! Running pip install torchdyn==1.0.3 fixed it!
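Since the failure came down to torchdyn not being importable, a quick preflight check before launching train.py can surface that earlier. A minimal sketch, assuming only top-level module names are checked; missing_modules is a hypothetical helper and the list of names is illustrative, not the repository's actual requirements.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Illustrative list; adjust it to the packages the project really needs.
required = ["torchdyn", "hydra", "omegaconf"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All listed modules can be found.")
```

This only checks that a module can be found, not which version is installed, so a version mismatch would still need the pin mentioned above.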