Open tarun3300 opened 2 years ago
Faced the same issue while training.
Can you try further decreasing the mini-batch size?
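For reference, the first line of the traceback in this thread, `model(*[x[a:b] for x in input_data])`, suggests that each batch is fed to the model in slices. Below is a minimal sketch of that slicing pattern in plain Python, only to illustrate what the mini-batch size controls. The function name `iter_mini_batches` and the toy data are hypothetical and are not taken from the GreaseLM code.

```python
def iter_mini_batches(input_data, mini_batch_size):
    """Yield one slice of every sequence in input_data per mini-batch.

    Hypothetical helper: it mirrors the `x[a:b] for x in input_data`
    pattern from the traceback, not the actual GreaseLM training loop.
    """
    n = len(input_data[0])
    for a in range(0, n, mini_batch_size):
        b = min(a + mini_batch_size, n)
        yield [x[a:b] for x in input_data]


# Toy stand-ins for the real input tensors (assumed data, 10 examples).
input_data = [list(range(10)), list("abcdefghij")]

chunks = list(iter_mini_batches(input_data, mini_batch_size=4))
sizes = [len(chunk[0]) for chunk in chunks]
print(sizes)  # [4, 4, 2]
```

With this pattern, each forward pass sees at most `mini_batch_size` examples, so a smaller value should lower the memory needed per pass.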
This method doesn't work for me.
When we run greaselm.py, we get this error even with a batch size as small as 8.
We tried batch sizes from 128 down to 8. Every time, after some epochs, it throws the error, each time reporting a different amount of free memory. Can you help us solve this issue and run the code?
    logits, _ = model(*[x[a:b] for x in input_data])
  File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 85, in forward
    logits, attn = self.lmgnn(lm_inputs, concept_ids,
  File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 217, in forward
    outputs, gnn_output = self.mp(input_ids, token_type_ids, attention_mask, output_mask, gnn_input, adj, node_type_ids, node_scores, special_nodes_mask, output_hidden_$
  File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 411, in forward
    encoder_outputs, _X = self.encoder(embedding_output,
  File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_greaselm.py", line 815, in forward
    _X = self.gnn_layers[gnn_layer_index](_X, edge_index, edge_type, _node_type, _node_feature_extra)
  File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/scratch/users/tgudela/greaseLM/GreaseLM-main/modeling/modeling_gnn.py", line 91, in forward
    aggr_out = self.propagate(edge_index, x=x, edge_attr=edge_embeddings) #[N, emb_dim]
  File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 261, in propagate
    coll_dict = self.__collect__(self.__user_args__, edge_index, size,
  File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 171, in __collect__
    data = self.__lift__(data, edge_index,
  File "/home/t/tgudela/.conda/envs/greaselm2/lib/python3.8/site-packages/torch_geometric/nn/conv/message_passing.py", line 141, in __lift__
    return src.index_select(self.node_dim, index)
RuntimeError: CUDA out of memory. Tried to allocate 144.00 MiB (GPU 0; 15.78 GiB total capacity; 14.28 GiB already allocated; 133.50 MiB free; 14.39 GiB reserved in tot
Oh, exactly the same problem. Have you guys solved this?