Closed oyxy2019 closed 1 year ago
I use Ubuntu 18.04, PyCharm, conda, and Python 3.7. I also tried other platforms such as Google Colab, and they output the same error.
After setting HYDRA_FULL_ERROR=1, the output is:
(notebook) itcast@ubuntu:~/Desktop/dyn-gfn-main/dyn-gfn-main$ python train.py trainer.gpus=0
[2023-03-07 18:01:18,267][torch.distributed.nn.jit.instantiator][INFO] - Created a temporary directory at /tmp/tmp7xmccy95
[2023-03-07 18:01:18,268][torch.distributed.nn.jit.instantiator][INFO] - Writing /tmp/tmp7xmccy95/_remote_module_non_scriptable.py
[2023-03-07 18:01:18,314][src.utils][INFO] - Printing config tree with Rich!
CONFIG
├── datamodule
...
[2023-03-07 18:01:18,367][src.training_pipeline][INFO] - Instantiating datamodule
[2023-03-07 18:01:18,445][src.datamodules.simulated_datamodule][INFO] - Loading data from /home/itcast/Desktop/dyn-gfn-main/dyn-gfn-main/data/UnidentifiableSimulatedVelocityDataModule/linear-7-1000-100-0.0-0.99-0.05-0.0-{}-0-0.pt
A [[-0.01 0. 0. 0. 0. -0. -0. ]
 [ 0. -0.01 0. 0. 0. 0. 0. ]
 [ 0. 0. -0.01 -0. 0. 0. -0. ]
 [ 0. 0. -0. -0.01 -0. -0. -0. ]
 [ 0. 0. 0. -0. -0.01 -0. -0. ]
 [-0. 0. 0. -0. -0. -0.01 -0. ]
 [-0. 0. -0. -0. -0. -0. -0.01]]
Error executing job with overrides: ['trainer.gpus=0']
Traceback (most recent call last):
  File "train.py", line 26, in main()
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/main.py", line 95, in decorated_main config_name=config_name,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 396, in _run_hydra overrides=overrides,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 453, in _run_app lambda: hydra.run(
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 216, in run_and_report raise ex
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 213, in run_and_report return func()
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/_internal/utils.py", line 456, in overrides=overrides,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/internal/hydra.py", line 132, in run = ret.return_value
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/core/utils.py", line 260, in return_value raise self._return_value
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/hydra/core/utils.py", line 186, in run_job ret.return_value = task_function(task_cfg)
  File "train.py", line 22, in main return train(config)
  File "/home/itcast/Desktop/dyn-gfn-main/dyn-gfn-main/src/training_pipeline.py", line 48, in train if config.model["c"] == "src.models.velocity_module.TrueGraphLitModule":
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 372, in getitem key=key, value=None, cause=e, type_override=ConfigKeyError
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/base.py", line 237, in _format_and_raise type_override=type_override,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/_utils.py", line 819, in format_and_raise _raise(ex, cause)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/_utils.py", line 797, in _raise raise ex.with_traceback(sys.exc_info()[2]) # set env var OC_CAUSE=1 for full trace
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 369, in getitem return self._get_impl(key=key, default_value=_DEFAULTMARKER)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 443, in _get_impl key=key, throw_on_missing_key=True, validate_key=validate_key
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/basecontainer.py", line 78, in _get_child throw_on_missing_key=throw_on_missing_key,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 475, in _get_node self._validate_get(key)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/dictconfig.py", line 165, in _validate_get key=key, value=value, cause=ConfigAttributeError(msg)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/base.py", line 237, in _format_and_raise type_override=type_override,
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/_utils.py", line 899, in format_and_raise _raise(ex, cause)
  File "/home/itcast/anaconda3/envs/notebook/lib/python3.7/site-packages/omegaconf/_utils.py", line 797, in _raise raise ex.with_traceback(sys.exc_info()[2]) # set env var OC_CAUSE=1 for full trace
omegaconf.errors.ConfigKeyError: Key 'c' is not in struct
    full_key: model.c
    object_type=dict
Thanks for bringing this up.
Re item 1 - Running code.
For CPU usage use: python train.py
For GPU usage use: python train.py trainer=gpu
This selects the GPU settings in configs/trainer/gpu.yaml. I will update the repo README to reflect this change.
Re item 2 - Using your own dataset.
Thank you very much for your reply!!!
ReRe item 1 - Running code.
I had already tried running python train.py (without any parameters), but the error is still there. I think it may be a problem with the config file.
The error message is as follows:
Error executing job with overrides: []
Traceback (most recent call last):
  File "train.py", line 22, in main return train(config)
  File "/home/itcast/Desktop/dyn-gfn-main/dyn-gfn-main/src/training_pipeline.py", line 48, in train if config.model["c"] == "src.models.velocity_module.TrueGraphLitModule":
omegaconf.errors.ConfigKeyError: Key 'c' is not in struct
    full_key: model.c
    object_type=dict
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
I'm sorry, I'm a beginner and know little about Hydra.
Also, I don't know why I got a different error yesterday:
[2023-03-08 04:06:52,780][src.training_pipeline][INFO] - Instantiating model
Error executing job with overrides: []
Error locating target 'src.models.parallel_energy_gfn_module.PerNodeParallelLinearTrainableCausalGraphGFlowNetModule', see chained exception above.
full_key: model
Set the environment variable HYDRA_FULL_ERROR=1 for a complete stack trace.
In "dyn-gfn-main/dyn-gfn-main/src/training_pipeline.py", line 48 it should be:
if config.model["_target_"] == "src.models.velocity_module.TrueGraphLitModule":
The "_target_" key in the model config YAML specifies which module to use.
As an example, for config/model/linear_tcg.yaml:
_target_: src.models.parallel_energy_gfn_module.ParallelLinearTrainableCausalGraphGFlowNetModule
env_batch_size: 64
eval_batch_size: 5000
uniform_backwards: False
hidden_dim: 64
embed_dim: 128
lr: 0.001
full_posterior_eval: False
energy_freq: 10
loss_fn: "detailed_balance"
arch: "mlp" # options - ['mlp', 'transformer']
confidence: 0.0
n_steps: 0
bias: true
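To illustrate what the "_target_" key does, here is a minimal sketch of resolving a dotted path to a class and calling it with the remaining keys as keyword arguments. This is not Hydra's actual implementation (Hydra's instantiate handles much more). The `locate` and `instantiate` helpers are hypothetical, the config is a plain dict rather than an OmegaConf object, and a standard-library class stands in for the repo's model modules so that the example runs on its own.

```python
import importlib


def locate(target: str):
    """Resolve a dotted path such as 'package.module.ClassName' to the object it names."""
    module_path, _, attr = target.rpartition(".")
    if not module_path:
        raise ValueError(f"expected a dotted path, got {target!r}")
    # Importing the module is the step that fails if the module itself,
    # or anything it imports, is not installed.
    module = importlib.import_module(module_path)
    return getattr(module, attr)


def instantiate(config: dict):
    """Call the object named by config['_target_'] with the other keys as kwargs."""
    kwargs = {key: value for key, value in config.items() if key != "_target_"}
    return locate(config["_target_"])(**kwargs)


# Stand-in config (assumption): a stdlib class replaces the repo's model classes.
model_cfg = {"_target_": "fractions.Fraction", "numerator": 3, "denominator": 4}

# The kind of comparison made in training_pipeline.py, done on the "_target_" key.
if model_cfg["_target_"] == "fractions.Fraction":
    model = instantiate(model_cfg)
    print(type(model).__name__, model)  # Fraction 3/4
```

If the module named in the path cannot be imported, for example because one of its own dependencies is missing, the lookup fails at the import_module call. That is the kind of failure Hydra reports as "Error locating target".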
Thank you for your reply and I see you updated the code.
So it looks like there are seven models:
_target_: src.models.velocity_module.LinearLitModule
_target_: src.models.node_module.HyperNodeLitModule
_target_: src.models.velocity_module.HyperLitModule
_target_: src.models.parallel_energy_gfn_module.ParallelHyperTrainableCausalGraphGFlowNetModule
_target_: src.models.parallel_energy_gfn_module.ParallelLinearTrainableCausalGraphGFlowNetModule
_target_: src.models.parallel_energy_gfn_module.PerNodeParallelHyperTrainableCausalGraphGFlowNetModule
_target_: src.models.parallel_energy_gfn_module.PerNodeParallelLinearTrainableCausalGraphGFlowNetModule
I would like to know the differences between these seven models. Are they described in the paper DynGFN: Bayesian Dynamic Causal Discovery using Generative Flow Networks (2023)?
And the second problem is:
Now, the default configuration is
But when I run python train.py, the terminal still reports an error. The error message is as follows:
[2023-03-20 06:40:09,418][src.training_pipeline][INFO] - Instantiating model
Error executing job with overrides: []
Traceback (most recent call last):
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 639, in _locate obj = getattr(obj, part)
AttributeError: module 'src.models' has no attribute 'parallel_energy_gfn_module'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 645, in _locate obj = import_module(mod)
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/importlib/init.py", line 127, in import_module return _bootstrap._gcd_import(name[level:], package, level)
  File " ", line 1006, in _gcd_import
  File " ", line 983, in _find_and_load
  File " ", line 967, in _find_and_load_unlocked
  File " ", line 677, in _load_unlocked
  File " ", line 728, in exec_module
  File " ", line 219, in _call_with_frames_removed
  File "/home/newsgrid/linyy/dyn-gfn/dyn-gfn-main/src/models/parallel_energy_gfn_module.py", line 26, in from .components.energy import (
  File "/home/newsgrid/linyy/dyn-gfn/dyn-gfn-main/src/models/components/energy.py", line 7, in from torchdyn.core import NeuralODE
ModuleNotFoundError: No module named 'torchdyn'
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 134, in _resolve_target target = _locate(target)
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 651, in _locate ) from exc_import
ImportError: Error loading 'src.models.parallel_energy_gfn_module.PerNodeParallelLinearTrainableCausalGraphGFlowNetModule': ModuleNotFoundError("No module named 'torchdyn'")
Are you sure that 'parallel_energy_gfn_module' is importable from module 'src.models'?
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "train.py", line 26, in main()
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/main.py", line 95, in decorated_main config_name=config_name,
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 396, in _run_hydra overrides=overrides,
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 453, in _run_app lambda: hydra.run(
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 216, in run_and_report raise ex
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 213, in run_and_report return func()
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/utils.py", line 456, in overrides=overrides,
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/internal/hydra.py", line 132, in run = ret.return_value
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/core/utils.py", line 260, in return_value raise self._return_value
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/core/utils.py", line 186, in run_job ret.return_value = task_function(task_cfg)
  File "train.py", line 22, in main return train(config)
  File "/home/newsgrid/linyy/dyn-gfn/dyn-gfn-main/src/training_pipeline.py", line 90, in train config.model, dm_conf=config.datamodule, recursive=False
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 223, in instantiate config, *args, recursive=recursive, convert=convert, partial=partial
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 325, in instantiate_node target = _resolve_target(node.get(_Keys.TARGET), full_key)
  File "/home/newsgrid/anaconda3/envs/lyy/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 139, in _resolve_target raise InstantiationException(msg) from e
hydra.errors.InstantiationException: Error locating target 'src.models.parallel_energy_gfn_module.PerNodeParallelLinearTrainableCausalGraphGFlowNetModule', see chained exception above.
full_key: model
Looking forward to your reply! Thanks a lot!
I want to know the difference between these seven models. Is it mentioned in the paper DynGFN: Bayesian Dynamic Causal Discovery using Generative Flow Networks(2023)?
Yes, in general the paper outlines the differences between the models.
... when I run python train.py, The terminal is still reporting error, error message as is as follows:
This may be a torchdyn version issue. See #9.
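Since the traceback ends in a ModuleNotFoundError for torchdyn, a quick check before launching training is to confirm that each dependency can be found in the active environment. A minimal sketch using only the standard library follows. The `missing_modules` helper and the list of names are assumptions for illustration, not part of the repo.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list; adjust it to whatever the project's requirements name.
required = ["torch", "torchdyn"]

missing = missing_modules(required)
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All required modules were found.")
```

This only checks that a module can be located, not which version is installed, so a version mismatch would still need to be checked separately.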
Thanks!
pip install torchdyn==1.0.3
fixed it for me!
Hello, when I run the code with:
python train.py trainer.gpus=0
Terminal output:
So I tried to run:
python train.py +trainer.gpus=0
Terminal output:
In addition, I would like to know how to use my own dataset.
Looking forward to your reply! Thanks!