facebookresearch / RLCompOpt

Learning Compiler Pass Orders using Coreset and Normalized Value Prediction. (ICML 2023)
MIT License

Error during training: TypeError in GNN model #1

Open chhnb opened 6 months ago

chhnb commented 6 months ago



Issue Description:

I followed the instructions in the README to configure the environment and prepare the data. However, when I try to train the GNN (Graph Neural Network) model, I get the following error:

Error executing job with overrides: ['model.gnn_type=', 'seed=0']
Traceback (most recent call last):
  File "rlcompopt/train.py", line 1210, in main
    main_real(args)
  File "rlcompopt/train.py", line 1497, in main_real
    last_loss, eval_loss = policy_gradient(
  File "rlcompopt/train.py", line 541, in policy_gradient
    loss, kl, y_pred, logits = model(
  File "/home/chh/anaconda3/envs/rlcompopt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/chh/anaconda3/envs/rlcompopt/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1040, in forward
    output = self._run_ddp_forward(*inputs, **kwargs)
  File "/home/chh/anaconda3/envs/rlcompopt/lib/python3.8/site-packages/torch/nn/parallel/distributed.py", line 1000, in _run_ddp_forward
    return module_to_run(*inputs[0], **kwargs[0])
  File "/home/chh/anaconda3/envs/rlcompopt/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1194, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/chh/anaconda3/envs/rlcompopt/lib/python3.8/site-packages/rlcompopt-0.0.1-py3.8.egg/rlcompopt/cl/models/gnn_pyg.py", line 921, in forward
    preds, _ = super().forward(
  File "/home/chh/anaconda3/envs/rlcompopt/lib/python3.8/site-packages/rlcompopt-0.0.1-py3.8.egg/rlcompopt/cl/models/gnn_pyg.py", line 553, in forward
    res, edge_feat, res_mid, edge_feat_mid = self.layers_encode(
  File "/home/chh/anaconda3/envs/rlcompopt/lib/python3.8/site-packages/rlcompopt-0.0.1-py3.8.egg/rlcompopt/cl/models/gnn_pyg.py", line 695, in layers_encode
    n_layers = len(layers)
TypeError: object of type 'NoneType' has no len()

Steps to Reproduce:

  1. Configure the environment as per the documentation.
  2. Prepare the data according to the instructions.
  3. Execute the training script for the GNN model.

Expected Behavior:

The GNN model should train without encountering any errors.

Additional Information:

Please let me know if any further information is needed to diagnose and resolve this issue. Thank you.



youweiliang commented 5 months ago

Hi @chhnb, it seems that a default configuration value was overridden with an empty argument. Specifically, the overrides in the error message, ['model.gnn_type=', 'seed=0'], show that no value was provided for gnn_type. Which script did you run that produced this error?
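
For anyone hitting the same error, the check described above can be sketched in plain Python. This is only an illustration and is not part of RLCompOpt or of the configuration library it uses: find_empty_overrides is a hypothetical helper, and the only assumption is that overrides are key=value strings, as they appear in the error message.

```python
from typing import List


def find_empty_overrides(overrides: List[str]) -> List[str]:
    """Return the keys of key=value overrides whose value is empty.

    Hypothetical helper for illustration; it does not reproduce how the
    training script actually parses its overrides.
    """
    empty_keys = []
    for item in overrides:
        # partition splits on the first "=" only, so values containing "=" survive
        key, sep, value = item.partition("=")
        if sep and value.strip() == "":
            empty_keys.append(key)
    return empty_keys


# The override list copied from the error message in this issue
overrides = ["model.gnn_type=", "seed=0"]
print(find_empty_overrides(overrides))  # ['model.gnn_type']
```

Running a check like this on the override list before launching training would flag model.gnn_type as empty, which matches the diagnosis that the model was built without a GNN type.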

chhnb commented 5 months ago

I just ran the script using bash scripts/train_graph_gnn_type2_nvp.sh.