emmaking-smith / SET_LSF_CODE

The code corresponding to Predictive Minisci Late Stage Functionalization with Transfer Learning
https://www.nature.com/articles/s41467-023-42145-1?fbclid=IwAR1RnBIirqNfBQUdKE__VOmHBt-yPwn4ilsaX4ZmxzTiBkfHDdR0p82_bX8
MIT License

Code reproduction, training and pre-training code errors #2

Open eat-sugar opened 2 weeks ago

eat-sugar commented 2 weeks ago

Hi, I've been reproducing your work recently. The prediction code works fine, but I ran into a few issues when running the training code, and I look forward to your response!

1. (1) I tried to run python LSF_Finetune.py -s {PATH TO SAVE FILE}, and the first error was:

    from NMR_Pretrain_Net import NMR_MPNN
    ModuleNotFoundError: No module named 'NMR_Pretrain_Net'

(2) I saw that other files use from neural_nets.NMR_4ll_Net import NMR_MPNN. After making the same change and running again, the error became:

    Traceback (most recent call last):
      File "LSF_Finetune.py", line 326, in <module>
        main()
      File "LSF_Finetune.py", line 222, in main
        Y_pred = model(h, g, rxn_vector)
      File "/home/datahouse1/wangqinggong/anaconda3/envs/lsf/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/datahouse1/wangqinggong/SET_LSF_CODE-master/set_lsf/neural_nets/LSF_Finetune_Net.py", line 124, in forward
        gru_atom_type_subset = self.update_func[idx](h_atom_type_subset, m_atom_type_subset)
    TypeError: 'GRUCell' object is not subscriptable

(3) I used the Dependencies you provided, and I also tried creating different anaconda environments on different servers with different Dependencies versions, but the problem persists.

(4) Is the training code provided here the final executable version, or are there any requirements for my server and environment?

2. Similarly, I also encountered problems when running python NMR_Pretrain.py -s {PATH TO SAVE FILE}.

Here, the NMR pre-training file data/13C_nmrshiftdb.pickle loads correctly; its printed contents end with [5390 rows x 4 columns], 'MAX_N': 64, 'spectra_config': [('13C', 'dataset.named/spectra.nmrshiftdb_13C.feather')], 'tgt_nucs': ['13C']}. However, running the script reports an error.

How can I modify it? The error message is as follows:

    Traceback (most recent call last):
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3361, in get_loc
        return self._engine.get_loc(casted_key)
      File "pandas/_libs/index.pyx", line 76, in pandas._libs.index.IndexEngine.get_loc
      File "pandas/_libs/index.pyx", line 108, in pandas._libs.index.IndexEngine.get_loc
      File "pandas/_libs/hashtable_class_helper.pxi", line 2131, in pandas._libs.hashtable.Int64HashTable.get_item
      File "pandas/_libs/hashtable_class_helper.pxi", line 2140, in pandas._libs.hashtable.Int64HashTable.get_item
    KeyError: 5961

    The above exception was the direct cause of the following exception:

    Traceback (most recent call last):
      File "NMR_Pretrain.py", line 223, in <module>
        main()
      File "NMR_Pretrain.py", line 140, in main
        for i, (g, h, Y) in enumerate(dataloader_train):
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 517, in __next__
        data = self._next_data()
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 557, in _next_data
        data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
        data = [self.dataset[idx] for idx in possibly_batched_index]
      File "/home/data-house-01/wangqinggong/SET_LSF_CODE-Published/set_lsf/utils/NMR_Pretrain_Setup.py", line 535, in __getitem__
        Y = self.df.loc[index, 'Y']
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/pandas/core/indexing.py", line 925, in __getitem__
        return self._getitem_tuple(key)
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/pandas/core/indexing.py", line 1100, in _getitem_tuple
        return self._getitem_lowerdim(tup)
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/pandas/core/indexing.py", line 838, in _getitem_lowerdim
        section = self._getitem_axis(key, axis=i)
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/pandas/core/indexing.py", line 1164, in _getitem_axis
        return self._get_label(key, axis=axis)
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/pandas/core/indexing.py", line 1113, in _get_label
        return self.obj.xs(label, axis=axis)
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/pandas/core/generic.py", line 3776, in xs
        loc = index.get_loc(key)
      File "/home/data-house-01/wangqinggong/anaconda3/envs/GCN_new/lib/python3.7/site-packages/pandas/core/indexes/base.py", line 3363, in get_loc
        raise KeyError(key) from err
    KeyError: 5961

The error is also reported at other index positions, for example KeyError: 1925.
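
For readers hitting the same KeyError: the traceback shows `self.df.loc[index, 'Y']` failing inside `__getitem__`, which means the integer handed over by the DataLoader is not among the DataFrame's index labels. One plausible cause (an assumption, the thread does not confirm it) is that the index labels have gaps or come from a larger table, so positions and labels no longer line up. The sketch below reproduces that situation on a tiny, made-up DataFrame and shows two common workarounds; the helper names and data are hypothetical and are not taken from the repository.

```python
import pandas as pd

# Hypothetical stand-in for the NMR table: four rows whose index labels
# have gaps, as happens after rows are filtered out of a larger DataFrame.
df = pd.DataFrame({"Y": [10.0, 20.0, 30.0, 40.0]}, index=[0, 3, 7, 5961])
n = len(df)  # a default DataLoader sampler draws positions 0 .. n-1


def lookup_by_label(frame, i):
    """Label-based lookup, the same pattern as `self.df.loc[index, 'Y']`."""
    return frame.loc[i, "Y"]


def lookup_by_position(frame, i):
    """Position-based lookup, independent of the index labels."""
    return frame["Y"].iloc[i]


# Position 1 exists, but there is no row *labelled* 1, so .loc raises.
try:
    lookup_by_label(df, 1)
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# Workaround A: look rows up by position.
by_position = [lookup_by_position(df, i) for i in range(n)]

# Workaround B: renumber the labels 0 .. n-1 once, then keep using .loc.
df_reset = df.reset_index(drop=True)
by_label_after_reset = [lookup_by_label(df_reset, i) for i in range(n)]
```

If the Dataset's `__len__` or a custom sampler draws from a larger range than the DataFrame actually holds, neither workaround helps, and the length or the sampler indices would need to be checked instead.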

emmaking-smith commented 1 week ago

Hello @eat-sugar,

Both the finetuning and pretraining should work now.
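
The reply does not say what was changed. For anyone still on an older checkout, the fine-tuning `TypeError: 'GRUCell' object is not subscriptable` means that `self.update_func[idx]` was indexing a single `torch.nn.GRUCell`, whereas the forward pass expects one update cell per atom type. A likely explanation (an assumption, not confirmed in this thread) is that the `NMR_MPNN` class imported from a different module defines `update_func` as one cell rather than as a collection. The minimal sketch below shows the difference; the sizes and variable names are hypothetical.

```python
import torch
from torch import nn

HIDDEN = 8        # hypothetical hidden size
N_ATOM_TYPES = 3  # hypothetical number of atom types

# A single GRUCell cannot be indexed: nn.Module defines no __getitem__.
single_cell = nn.GRUCell(HIDDEN, HIDDEN)
try:
    single_cell[0]
    single_failed = False
except TypeError:
    single_failed = True

# An nn.ModuleList holds one cell per atom type, supports indexing,
# and registers every cell's parameters with the parent module.
per_type_cells = nn.ModuleList(
    [nn.GRUCell(HIDDEN, HIDDEN) for _ in range(N_ATOM_TYPES)]
)

# Update the states of five atoms that share atom type 1.
messages = torch.randn(5, HIDDEN)
states = torch.randn(5, HIDDEN)
updated = per_type_cells[1](messages, states)
```

Before fine-tuning, printing `type(model.update_func)` is a quick way to check which of the two cases applies.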