ai4co / rl4co

A PyTorch library for all things Reinforcement Learning (RL) for Combinatorial Optimization (CO)
https://rl4.co
MIT License

RL4CO\rl4co-main\notebooks\tutorials\3-change-encoder.ipynb: the file is not functioning properly #121

Closed lihaoya5 closed 8 months ago

lihaoya5 commented 9 months ago

Describe the bug

I used Python 3.10 and installed the packages with pip install rl4co and pip install torch_geometric. The following error occurred:

Change the Encoder: Error

AttributeError                            Traceback (most recent call last)
Cell In[6], line 12
      3 from rl4co.models.nn.graph.mpnn import MessagePassingEncoder
      5 gcn_encoder = GCNEncoder(
      6     env_name='cvrp',
      7     embedding_dim=128,
      8     num_nodes=20,
      9     num_layers=3,
     10 )
---> 12 mpnn_encoder = MessagePassingEncoder(
     13     env_name='cvrp',
     14     embedding_dim=128,
     15     num_nodes=20,
     16     num_layers=3,
     17 )
     19 model = AttentionModel(
     20     env,
     21     baseline='rollout',
   (...)
     26     }
     27 )
     29 trainer = RL4COTrainer(
     30     max_epochs=3, # few epochs for demo
     31     accelerator='gpu',
     32     devices=1,
     33     logger=False,
     34 )

File ~\.conda\envs\rlco\lib\site-packages\rl4co\models\nn\graph\mpnn.py:100, in MessagePassingEncoder.__init__(self, env_name, embedding_dim, num_nodes, num_layers, init_embedding, aggregation, self_loop, residual)
     96 self.edge_index = torch.permute(torch.nonzero(adj_matrix), (1, 0))
     98 # Init message passing models
     99 self.mpnn_layers = nn.ModuleList(
--> 100     [
    101         MessagePassingLayer(
    102             node_indim=embedding_dim,
    103             node_outdim=embedding_dim,
    104             edge_indim=1,
    105             edge_outdim=1,
    106             aggregation=aggregation,
    107             residual=residual,
    108         )
    109         for _ in range(num_layers)
    110     ]
    111 )
    113 # Record parameters
    114 self.self_loop = self_loop

File ~\.conda\envs\rlco\lib\site-packages\rl4co\models\nn\graph\mpnn.py:101, in <listcomp>(.0)
     96 self.edge_index = torch.permute(torch.nonzero(adj_matrix), (1, 0))
     98 # Init message passing models
     99 self.mpnn_layers = nn.ModuleList(
    100     [
--> 101         MessagePassingLayer(
    102             node_indim=embedding_dim,
    103             node_outdim=embedding_dim,
    104             edge_indim=1,
    105             edge_outdim=1,
    106             aggregation=aggregation,
    107             residual=residual,
    108         )
    109         for _ in range(num_layers)
    110     ]
    111 )
    113 # Record parameters
    114 self.self_loop = self_loop

File ~\.conda\envs\rlco\lib\site-packages\rl4co\models\nn\graph\mpnn.py:29, in MessagePassingLayer.__init__(self, node_indim, node_outdim, edge_indim, edge_outdim, aggregation, residual, mlp_params)
     19 def __init__(
     20     self,
     21     node_indim,
   (...)
     27     mlp_params,
     28 ):
---> 29     super(MessagePassingLayer, self).__init__(aggr=aggregation)
     30     # Init message passing models
     31     self.edge_model = MLP(
     32         input_dim=edge_indim + 2 * node_indim, output_dim=edge_outdim, **mlp_params
     33     )

File ~\.conda\envs\rlco\lib\site-packages\torch_geometric\nn\conv\message_passing.py:170, in MessagePassing.__init__(self, aggr, aggr_kwargs, flow, node_dim, decomposed_layers)
    168 if not self.propagate.__module__.startswith(jinja_prefix):
    169     if self.inspector.can_read_source:
--> 170         module = module_from_template(
    171             module_name=f'{jinja_prefix}_propagate',
    172             template_path=osp.join(root_dir, 'propagate.jinja'),
    173             tmp_dirname='message_passing',
    174             # Keyword arguments:
    175             module=self.__module__,
    176             collect_name='collect',
    177             signature=self._get_propagate_signature(),
    178             collect_param_dict=self.inspector.get_flat_param_dict(
    179                 ['message', 'aggregate', 'update']),
    180             message_args=self.inspector.get_param_names('message'),
    181             aggregate_args=self.inspector.get_param_names('aggregate'),
    182             message_and_aggregate_args=self.inspector.get_param_names(
    183                 'message_and_aggregate'),
    184             update_args=self.inspector.get_param_names('update'),
    185             fuse=self.fuse,
    186         )
    188         # Cache to potentially disable later on:
    189         self.__class__._orig_propagate = self.__class__.propagate

File ~\.conda\envs\rlco\lib\site-packages\torch_geometric\template.py:37, in module_from_template(module_name, template_path, tmp_dirname, **kwargs)
     35 sys.modules[module_name] = module
     36 assert spec.loader is not None
---> 37 spec.loader.exec_module(module)
     38 return module

File <frozen importlib._bootstrap_external>:883, in exec_module(self, module)

File <frozen importlib._bootstrap>:241, in _call_with_frames_removed(f, *args, **kwds)

File ~\.cache\pyg\message_passing\rl4co.models.nn.graph.mpnn_MessagePassingLayer_propagate.py:25
     21 from torch_geometric.utils.sparse import ptr2index
     22 from torch_geometric.typing import SparseTensor
---> 25 class CollectArgs(NamedTuple):
     26     edge_features: torch._VariableFunctionsClass.tensor
     27     index: Tensor

File ~\.cache\pyg\message_passing\rl4co.models.nn.graph.mpnn_MessagePassingLayer_propagate.py:26, in CollectArgs()
     25 class CollectArgs(NamedTuple):
---> 26     edge_features: torch._VariableFunctionsClass.tensor
     27     index: Tensor
     28     ptr: typing.Optional[Tensor]

File ~\AppData\Roaming\Python\Python310\site-packages\torch\__init__.py:1833, in __getattr__(name)
   1830     import importlib
   1831     return importlib.import_module(f".{name}", __name__)
-> 1833 raise AttributeError(f"module '{__name__}' has no attribute '{name}'")

AttributeError: module 'torch' has no attribute '_VariableFunctionsClass'

How should I fix this error?

cbhua commented 9 months ago

Hi @lihaoya5, could you share your Python, RL4CO, PyTorch, and PyG versions? I tested the notebooks/tutorials/3-change-encoder.ipynb with

and it passed. Could you also provide a minimal code example that reproduces the bug? I suspect this error is caused by a mismatch between package versions.
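
One way to look for such a mismatch is to print, for each package, both the installed version and the location Python would load it from. The sketch below is only an illustration using the standard library; the helper name `describe_package` is hypothetical and not part of RL4CO. It is motivated by the traceback above, where torch is loaded from a different site-packages directory than rl4co and torch_geometric.

```python
import importlib.metadata
import importlib.util
import sys


def describe_package(dist_name, module_name):
    """Return (version, location) for a package; None for anything not found.

    `describe_package` is a hypothetical helper for this thread, not an RL4CO API.
    """
    try:
        version = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        version = None
    try:
        # find_spec locates a top-level module without importing it
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        spec = None
    location = spec.origin if spec is not None else None
    return version, location


print("Python:", sys.version.split()[0], "at", sys.executable)
for dist_name, module_name in [
    ("rl4co", "rl4co"),
    ("torch", "torch"),
    ("torch_geometric", "torch_geometric"),
]:
    version, location = describe_package(dist_name, module_name)
    print(f"{dist_name}: version={version}, location={location}")
```

If the printed locations point to different site-packages directories (for example, one inside the conda environment and one under the user profile), the environment is probably mixing installations.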

lihaoya5 commented 9 months ago

Thanks for the reply, I'll try your versions. Here is my configuration. I tested notebooks/tutorials/3-change-encoder.ipynb with:

Python 3.10.13
RL4CO 0.3.0
torch 2.2.1+cu118
torch-geometric 2.5.0

I have a couple of questions:

  1. When I run pip install rl4co==0.3.0, torch 2.2.1 is installed by default.
  2. I looked at the Geometric library (https://github.com/lgray/pytorch_geometric). Before installing it with pip, four packages need to be installed: torch-scatter, torch-sparse, torch-cluster, and torch-spline. Are these four packages required, or can I run pip install torch_geometric directly?
  3. When I run 1-quickstart.ipynb and 1-training-loop-advanced.ipynb with the above configuration, I don't get an error. But when I run 4-search-methods.ipynb and 3-change-encoder.ipynb, the error occurs.
  4. Finally, here is my GPU configuration:

Sun Mar 3 15:16:40 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 537.13                 Driver Version: 537.13       CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                     TCC/WDDM  | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3050 ...  WDDM  | 00000000:01:00.0  On |                  N/A |
| N/A   35C    P8               3W /  75W |   1152MiB /  4096MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

fedebotu commented 9 months ago

@lihaoya5 could you update rl4co to the latest version? You can do so with pip install --upgrade rl4co

Also could you report further versions by running the following script?

python -c "import rl4co, torch, lightning, torchrl, tensordict, numpy, sys; print('RL4CO:', \
 rl4co.__version__, '\nPyTorch:', torch.__version__, '\nPyTorch Lightning:', \
lightning.__version__, '\nTorchRL:',  torchrl.__version__, '\nTensorDict:',\
 tensordict.__version__, '\nNumpy:', numpy.__version__, '\nPython:', \
sys.version, '\nPlatform:', sys.platform)"
lihaoya5 commented 9 months ago

I used the conda list command in the conda environment to check the package versions:

# packages in environment at C:\Users\qian\.conda\envs\rl4co:
#
# Name                    Version       Build             Channel
aiohttp                   3.9.3         pypi_0            pypi
aiosignal                 1.3.1         pypi_0            pypi
antlr4-python3-runtime    4.9.3         pypi_0            pypi
appdirs                   1.4.4         pypi_0            pypi
asttokens                 2.4.1         pypi_0            pypi
async-timeout             4.0.3         pypi_0            pypi
attrs                     23.2.0        pypi_0            pypi
bzip2                     1.0.8         h2bbff1b_5        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
ca-certificates           2023.12.12    haa95532_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
certifi                   2024.2.2      pypi_0            pypi
cfgv                      3.4.0         pypi_0            pypi
charset-normalizer        3.3.2         pypi_0            pypi
click                     8.1.7         pypi_0            pypi
cloudpickle               3.0.0         pypi_0            pypi
colorama                  0.4.6         pypi_0            pypi
colorlog                  6.8.2         pypi_0            pypi
comm                      0.2.1         pypi_0            pypi
contourpy                 1.2.0         pypi_0            pypi
cycler                    0.12.1        pypi_0            pypi
debugpy                   1.8.1         pypi_0            pypi
decorator                 5.1.1         pypi_0            pypi
distlib                   0.3.8         pypi_0            pypi
docker-pycreds            0.4.0         pypi_0            pypi
einops                    0.7.0         pypi_0            pypi
exceptiongroup            1.2.0         pypi_0            pypi
executing                 2.0.1         pypi_0            pypi
filelock                  3.13.1        pypi_0            pypi
fonttools                 4.49.0        pypi_0            pypi
frozenlist                1.4.1         pypi_0            pypi
fsspec                    2024.2.0      pypi_0            pypi
gitdb                     4.0.11        pypi_0            pypi
gitpython                 3.1.42        pypi_0            pypi
hydra-colorlog            1.2.0         pypi_0            pypi
hydra-core                1.3.2         pypi_0            pypi
identify                  2.5.35        pypi_0            pypi
idna                      3.6           pypi_0            pypi
ipykernel                 6.29.3        pypi_0            pypi
ipython                   8.22.1        pypi_0            pypi
jedi                      0.19.1        pypi_0            pypi
jinja2                    3.1.3         pypi_0            pypi
joblib                    1.3.2         pypi_0            pypi
jupyter-client            8.6.0         pypi_0            pypi
jupyter-core              5.7.1         pypi_0            pypi
kiwisolver                1.4.5         pypi_0            pypi
libffi                    3.4.4         hd77b12b_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
lightning                 2.2.0.post0   pypi_0            pypi
lightning-utilities       0.10.1        pypi_0            pypi
markdown-it-py            3.0.0         pypi_0            pypi
markupsafe                2.1.5         pypi_0            pypi
matplotlib                3.8.3         pypi_0            pypi
matplotlib-inline         0.1.6         pypi_0            pypi
mdurl                     0.1.2         pypi_0            pypi
mpmath                    1.3.0         pypi_0            pypi
multidict                 6.0.5         pypi_0            pypi
nest-asyncio              1.6.0         pypi_0            pypi
networkx                  3.2.1         pypi_0            pypi
nodeenv                   1.8.0         pypi_0            pypi
numpy                     1.26.4        pypi_0            pypi
omegaconf                 2.3.0         pypi_0            pypi
openssl                   3.0.13        h2bbff1b_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
packaging                 23.2          pypi_0            pypi
parso                     0.8.3         pypi_0            pypi
pillow                    10.2.0        pypi_0            pypi
pip                       23.3.1        py310haa95532_0   https://mirrors.ustc.edu.cn/anaconda/pkgs/main
platformdirs              4.2.0         pypi_0            pypi
pre-commit                3.6.2         pypi_0            pypi
prompt-toolkit            3.0.43        pypi_0            pypi
protobuf                  4.25.3        pypi_0            pypi
psutil                    5.9.8         pypi_0            pypi
pure-eval                 0.2.2         pypi_0            pypi
pygments                  2.17.2        pypi_0            pypi
pyparsing                 3.1.1         pypi_0            pypi
pyrootutils               1.0.4         pypi_0            pypi
python                    3.10.13       he1021f5_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
python-dateutil           2.8.2         pypi_0            pypi
python-dotenv             1.0.1         pypi_0            pypi
pytorch-lightning         2.2.0.post0   pypi_0            pypi
pywin32                   306           pypi_0            pypi
pyyaml                    6.0.1         pypi_0            pypi
pyzmq                     25.1.2        pypi_0            pypi
requests                  2.31.0        pypi_0            pypi
rich                      13.7.1        pypi_0            pypi
rl4co                     0.3.0         pypi_0            pypi
robust-downloader         0.0.2         pypi_0            pypi
scikit-learn              1.4.1.post1   pypi_0            pypi
scipy                     1.12.0        pypi_0            pypi
sentry-sdk                1.40.6        pypi_0            pypi
setproctitle              1.3.3         pypi_0            pypi
setuptools                68.2.2        py310haa95532_0   https://mirrors.ustc.edu.cn/anaconda/pkgs/main
six                       1.16.0        pypi_0            pypi
smmap                     5.0.1         pypi_0            pypi
sqlite                    3.41.2        h2bbff1b_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
stack-data                0.6.3         pypi_0            pypi
sympy                     1.12          pypi_0            pypi
tensordict                0.3.1         pypi_0            pypi
threadpoolctl             3.3.0         pypi_0            pypi
tk                        8.6.12        h2bbff1b_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
torch                     2.2.1         pypi_0            pypi
torch-geometric           2.5.0         pypi_0            pypi
torchmetrics              1.3.1         pypi_0            pypi
torchrl                   0.3.0         pypi_0            pypi
tornado                   6.4           pypi_0            pypi
tqdm                      4.66.2        pypi_0            pypi
traitlets                 5.14.1        pypi_0            pypi
typing-extensions         4.10.0        pypi_0            pypi
tzdata                    2024a         h04d1e81_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
urllib3                   2.2.1         pypi_0            pypi
vc                        14.2          h21ff451_1        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
virtualenv                20.25.1       pypi_0            pypi
vs2015_runtime            14.27.29016   h5e58377_2        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
wandb                     0.16.3        pypi_0            pypi
wcwidth                   0.2.13        pypi_0            pypi
wheel                     0.41.2        py310haa95532_0   https://mirrors.ustc.edu.cn/anaconda/pkgs/main
xz                        5.4.6         h8cc25b3_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main
yarl                      1.9.4         pypi_0            pypi
zlib                      1.2.13        h8cc25b3_0        https://mirrors.ustc.edu.cn/anaconda/pkgs/main

The above are all the packages installed in this environment.

lihaoya5 commented 9 months ago

I just tried pip install --upgrade rl4co and got the following error. The upgrade also uninstalls my CUDA setup, and the GPU becomes unusable.

[Two screenshots of the error were attached: Snipaste_2024-03-03_16-40-04, Snipaste_2024-03-03_16-43-48]

cbhua commented 9 months ago

The error reported by pip can be resolved by also upgrading the torchaudio and torchvision packages with pip install torchaudio torchvision --upgrade.

As for the GPU being unusable, I think this is related to your CUDA version. Could you share the output of nvidia-smi? (Sorry, I see it was already shared in a previous reply.)
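
A quick way to see what PyTorch itself reports is sketched below. This is a minimal example, assuming only that torch may or may not be importable; the function name `cuda_report` is hypothetical. One possible cause, which is an assumption and not confirmed in this thread, is that the upgrade replaced a CUDA-enabled torch build with a CPU-only one, in which case `torch.version.cuda` would be `None`.

```python
import importlib.util


def cuda_report():
    """Return a small dict describing the PyTorch/CUDA setup.

    `cuda_report` is a hypothetical helper for illustration only.
    """
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version the wheel was built against; None for CPU-only builds
        "built_with_cuda": torch.version.cuda,
        # True only if a usable GPU and a compatible driver are detected
        "cuda_available": torch.cuda.is_available(),
    }


print(cuda_report())
```

If `built_with_cuda` is `None`, the installed wheel has no CUDA support regardless of the driver version shown by nvidia-smi.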

lihaoya5 commented 9 months ago

Thank you very much for your answer. After running pip install torchaudio torchvision --upgrade, the GPU is still unusable, so it is probably related to my CUDA version. I want to try your environment configuration first; if there is still an error, I will ask again. Thank you very much.

fedebotu commented 8 months ago

@lihaoya5 did you manage to fix the problem?

lihaoya5 commented 8 months ago

Yes, I solved the problem by following your environment configuration.