Open SejeongPark8354 opened 3 years ago
I am also getting the above issue - Did you manage to find a fix @SejeongPark8354 ?
I'm getting a very similar issue when running train_generator.py:
Namespace(anneal_iter=25000, anneal_rate=0.9, atom_vocab=<hgraph.vocab.Vocab object at 0x000001C10639ED48>, batch_size=20, clip_norm=5.0, depthG=15, depthT=15, diterG=3, diterT=1, dropout=0.0, embed_size=250, epoch=20, hidden_size=125, kl_anneal_iter=2000, latent_size=32, load_model=None, lr=0.001, max_beta=1.0, print_iter=50, rnn_type='LSTM', save_dir='ckpt/cyclic_truncated_pretrained', save_iter=5000, seed=7, step_beta=0.001, train='train_processed/cyclic_truncated_processed/', vocab='data/chembl/cyclic_peptide_vocab_truncated.txt', warmup=10000)
C:\Users\Marshall\Anaconda3\envs\hgraph-rdkit\lib\site-packages\torch\nn\_reduction.py:42: UserWarning: size_average and reduce args will be deprecated, please use reduction='sum' instead.
warnings.warn(warning.format(ret))
Model #Params: 1318K
0%|▏ | 2/1000 [00:32<4:01:50, 14.54s/it]C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [40,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [41,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [42,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [43,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [44,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [45,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [46,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [47,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [48,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [49,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [50,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [51,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [52,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [53,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [54,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [55,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [56,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [57,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [58,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [146,0,0], thread: [59,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [44,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [45,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [46,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [47,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [48,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [49,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [50,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [51,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [52,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [53,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [54,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [55,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [56,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [57,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [58,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [59,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [60,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [61,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [62,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
C:\cb\pytorch_1000000000000\work\aten\src\ATen\native\cuda\Indexing.cu:975: block: [148,0,0], thread: [63,0,0] Assertion “srcIndex < srcSelectDimSize” failed.
0%|▏ | 2/1000 [00:35<4:56:07, 17.80s/it]
Traceback (most recent call last):
File "train_generator.py", line 92, in <module>
loss, kl_div, wacc, iacc, tacc, sacc = model(*batch, beta=beta)
File "C:\Users\Marshall\Anaconda3\envs\hgraph-rdkit\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Marshall\hgraph2graph-master\hgraph\hgnn.py", line 55, in forward
root_vecs, tree_vecs, _, graph_vecs = self.encoder(tree_tensors, graph_tensors)
File "C:\Users\Marshall\Anaconda3\envs\hgraph-rdkit\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
return forward_call(*input, **kwargs)
File "C:\Users\Marshall\hgraph2graph-master\hgraph\encoder.py", line 129, in forward
tensors = self.embed_graph(graph_tensors)
File "C:\Users\Marshall\hgraph2graph-master\hgraph\encoder.py", line 114, in embed_graph
fpos = self.E_apos.index_select(index=fmess[:, 3], dim=0)
RuntimeError: CUDA error: device-side assert triggered
Actually, I think I figured it out. There's a parameter defined in mol_graph.py, MAX_POS = 20, which limits the size of the E_apos and E_pos matrices. When the encoder then looks up a position from fmess that is MAX_POS or larger, the index is out of range, which is why you get the error.
I think it's an issue of molecule size and graph complexity. In the paper there's a footnoted sentence: "The number of possible attachments are limited because the number of attaching atoms between two motifs is small and the attaching points must be consecutive." Footnote 3 reads: "In our experiments, the number of possible attachments are usually less than 20 for polymers and small molecules."
I agree with the advice above. I first set "os.environ['CUDA_LAUNCH_BLOCKING'] = '1'" to locate the bug and found that the problem is in "fpos = self.E_apos.index_select(index=fmess[:, 3], dim=0)". I then sliced the tensor to find where the error is: the max value of fmess[:, 3] is 22, while self.E_apos only has 20 rows. So I increased MAX_POS in mol_graph.py, which solved the problem. I think this change should not affect the model, though it may waste some memory.
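For anyone who wants to check their own data before changing MAX_POS, here is a minimal sketch of the kind of check described above. It is not code from the repo: the helper names out_of_range_positions and required_table_size are made up for illustration, and the sample values are invented to mirror the 22-vs-20 case in this thread. It works on plain Python ints; with a real batch you would presumably pass something like fmess[:, 3].tolist().

```python
import os

# Make CUDA kernels run synchronously so the Python traceback points at the
# call that actually failed. This should be set before CUDA is initialised.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

MAX_POS = 20  # value quoted in this thread for mol_graph.py


def out_of_range_positions(indices, table_size):
    """Return (row, value) pairs that cannot index a table with table_size rows.

    Hypothetical helper, not part of hgraph2graph.
    """
    return [
        (row, idx)
        for row, idx in enumerate(indices)
        if idx < 0 or idx >= table_size
    ]


def required_table_size(indices):
    """Smallest number of rows that makes every index valid (hypothetical helper)."""
    indices = list(indices)
    return max(indices) + 1 if indices else 0


# Invented sample standing in for the position column of one batch.
positions = [0, 3, 19, 22, 7]

bad = out_of_range_positions(positions, MAX_POS)
needed = required_table_size(positions)

print("offending (row, value) pairs:", bad)
print("MAX_POS would need to be at least:", needed)
```

If the list of offending pairs is non-empty for any batch, the lookup table is too small for that data, which matches the explanation given earlier in the thread.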
First of all, thank you for your great research on molecule generation. I am currently training on my ZINC dataset with your vae_train.py (in the generation folder). When I run the code, I get an error like the one below. The error occurs only occasionally, so I think it depends on the batch. Is there any solution for this problem?