nyu-dl / dl4chem-mgm

BSD 3-Clause "New" or "Revised" License

Cannot run generate.py #1

Closed e-yi closed 3 years ago

e-yi commented 3 years ago

The program frequently runs into RuntimeErrors. Could you please fix them?

omarnmahmood commented 3 years ago

Could you please give an example of the script you ran and the traceback you received?

e-yi commented 3 years ago

I did not modify anything; I just ran your demo command:

python generate.py --data_path data/QM9/QM9_processed.p --graph_type QM9 --model_path dumped/QM9_experiment/best_model --smiles_dataset_path data/QM9_smiles.txt --output_dir dumped/QM9_experiment/generation/train_init/mask10/results --num_node_types 5 --num_edge_types 5 --max_nodes 9 --layer_norm --embed_hs --spatial_msg_res_conn --num_iters 400 --num_sampling_iters 400 --cp_save_dir dumped/QM9_experiment/generation/train_init/mask10/generation_checkpoints --batch_size 2500 --checkpointing_period 400 --evaluation_period 20 --save_period 20 --evaluate_finegrained --save_finegrained --mask_independently --retrieve_train_graphs --node_target_frac 0.1 --edge_target_frac 0.1

I have fixed a few bugs and managed to run the code, but I guess it's still worth posting an issue, as there may be something that I missed. I believe these bugs are due to a lack of testing, and the ones that I found were pretty obvious.

hmzhulalala commented 3 years ago

Could you please provide the training set and test set of QM9? Thank you!

omarnmahmood commented 3 years ago

The QM9 SMILES strings can be found at data/QM9/QM9_smiles.txt . The dataset for MGM can be generated from this file using the instructions given in the README file.

hmzhulalala commented 3 years ago

ok, thank you!

omarnmahmood commented 3 years ago

Fixed bugs that I found so closing this issue.

e-yi commented 3 years ago

Thanks for your work, @omarnmahmood. But are you able to run your code with only those modifications?

https://github.com/nyu-dl/dl4chem-mgm/blob/1bdaf9f83d1abcedf1bcb036ee22e7ead16d76bd/src/model/graph_generator.py lines 299-300

        for i, num_nodes in enumerate(batch_init_graph.batch_num_nodes):
            num_edges = batch_init_graph.batch_num_edges[i]

should be

        for i, num_nodes in enumerate(batch_init_graph.batch_num_nodes()):
            num_edges = batch_init_graph.batch_num_edges()[i] 

Besides, MAT parameters are not needed by EdgesFromNodesMPNN or EdgesOwnRepsMPNN. Maybe it's better if you just remove some redundant code.

omarnmahmood commented 3 years ago

Hi @e-yi, you are right: this error is caused by a change in the DGL API between DGL versions. I have updated the code as you suggested. The MAT parameters are there for potential future work, but I may remove them if they are not needed.
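
For anyone who needs the same loop to work on both sides of that API change, here is a minimal sketch rather than the fix that was committed to the repository. It assumes, as the thread implies, that older DGL releases expose batch_num_nodes and batch_num_edges as plain attributes while newer releases expose them as zero-argument methods. The helper names batch_counts and iter_graph_sizes are hypothetical and do not exist in dl4chem-mgm or DGL.

```python
def batch_counts(graph, name):
    """Return graph.<name>, calling it first if it is a method.

    Assumption: `name` is either a sequence-valued attribute (older API)
    or a zero-argument method returning a sequence (newer API).
    """
    value = getattr(graph, name)
    return value() if callable(value) else value


def iter_graph_sizes(batched_graph):
    """Yield (index, num_nodes, num_edges) for each graph in the batch."""
    node_counts = batch_counts(batched_graph, "batch_num_nodes")
    edge_counts = batch_counts(batched_graph, "batch_num_edges")
    for i, (num_nodes, num_edges) in enumerate(zip(node_counts, edge_counts)):
        # int() turns a 0-dim tensor element or a plain int into a Python int.
        yield i, int(num_nodes), int(num_edges)
```

With this in place, the loop in graph_generator.py could be written as `for i, num_nodes, num_edges in iter_graph_sizes(batch_init_graph):`. Pinning the DGL version in the project's requirements is the simpler alternative if only one version needs to be supported.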