Hi, you can try changing the model_name in the config file to the specific BERT model that you need (the HuggingFace name), for example an Indonesian BERT checkpoint from the Hub, as in the sketch below.
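If it helps, one way to sanity-check a candidate name before touching the config is to load it directly with the transformers library. This is only an illustration outside of Span-ASTE; the model name below (an Indonesian BERT from the HuggingFace Hub) is an example and should be replaced with whichever checkpoint you actually need.

# Sanity check (outside Span-ASTE): does this HuggingFace model name resolve?
# "indobenchmark/indobert-base-p1" is only an example Indonesian BERT checkpoint.
from transformers import AutoModel, AutoTokenizer

model_name = "indobenchmark/indobert-base-p1"  # replace with the checkpoint you need

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# If both calls succeed, the same string should work wherever the config
# expects a pretrained transformer name (e.g. the model_name field).
print(type(model).__name__, "loaded, vocab size:", tokenizer.vocab_size)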
Okay thanks, will try that. But I have another issue when I try to change the save_dir of the model. This also happens when I use the English version and change the random seed.
Hi, this may be due to a prediction failure of the model, which leads to no prediction file. Could you check whether there is any error message further up and attach the full error log?
I think this is the main reason that the model.tar.gz file is not created.
2022-08-08 20:34:14,521 - CRITICAL - root - Uncaught exception
Traceback (most recent call last):
File "/home/randy.suchrady/anaconda3/envs/span-aste/bin/allennlp", line 8, in <module>
sys.exit(run())
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/__main__.py", line 34, in run
main(prog="allennlp")
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/commands/__init__.py", line 118, in main
args.func(args)
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/commands/train.py", line 119, in train_model_from_args
file_friendly_logging=args.file_friendly_logging,
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/commands/train.py", line 178, in train_model_from_file
file_friendly_logging=file_friendly_logging,
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/commands/train.py", line 242, in train_model
file_friendly_logging=file_friendly_logging,
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/commands/train.py", line 466, in _train_worker
metrics = train_loop.run()
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/commands/train.py", line 528, in run
return self.trainer.train()
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/training/trainer.py", line 966, in train
return self._try_train()
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/training/trainer.py", line 1001, in _try_train
train_metrics = self._train_epoch(epoch)
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/training/trainer.py", line 716, in _train_epoch
batch_outputs = self.batch_outputs(batch, for_training=True)
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/allennlp/training/trainer.py", line 604, in batch_outputs
output_dict = self._pytorch_model(**batch)
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/srv/nas_data1/text/randy/finetune-cxa/Span-ASTE/span_model/models/span_model.py", line 374, in forward
metadata,
File "/home/randy.suchrady/anaconda3/envs/span-aste/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/srv/nas_data1/text/randy/finetune-cxa/Span-ASTE/span_model/models/relation_proper.py", line 223, in forward
relation_scores = self._compute_relation_scores(pruned_o, pruned_t)
File "/srv/nas_data1/text/randy/finetune-cxa/Span-ASTE/span_model/models/relation_proper.py", line 454, in _compute_relation_scores
embeds = self._compute_span_pair_embeddings(a, b)
File "/srv/nas_data1/text/randy/finetune-cxa/Span-ASTE/span_model/models/relation_proper.py", line 422, in _compute_span_pair_embeddings
c = self._make_pair_features(a, b)
File "/srv/nas_data1/text/randy/finetune-cxa/Span-ASTE/span_model/models/relation_proper.py", line 418, in _make_pair_features
x = torch.cat(features, dim=-1)
RuntimeError: CUDA out of memory. Tried to allocate 16.00 MiB (GPU 0; 31.75 GiB total capacity; 28.63 GiB already allocated; 10.06 MiB free; 30.24 GiB reserved in total by PyTorch)
I think this is output after the shell.run call that trains the weights. The problem is, I already checked the available GPU memory, and there is enough free space for the 16 MiB allocation.
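One thing worth noting (a general PyTorch point, not specific to this repository): the free memory reported by nvidia-smi can differ from what PyTorch's caching allocator can actually use, since most of the 30.24 GiB in the message above is already reserved by PyTorch. A quick way to see the allocator's view, assuming GPU 0, is:

import torch

# Memory currently held by live tensors vs. memory reserved by PyTorch's
# caching allocator (the "reserved in total by PyTorch" figure in the error).
allocated_mib = torch.cuda.memory_allocated(0) / 1024 ** 2
reserved_mib = torch.cuda.memory_reserved(0) / 1024 ** 2
print(f"allocated: {allocated_mib:.1f} MiB, reserved: {reserved_mib:.1f} MiB")

# A more detailed per-pool breakdown:
print(torch.cuda.memory_summary(device=0))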
Hi, this may be a GPU issue if you are sharing the machine with other people or programs that are also using GPU memory, or it may be a dataset issue if long input texts cause the transformer model to use large amounts of GPU memory. In the latter case, you may need to truncate the texts or split them into multiple sentences (see the sketch below).
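A rough sketch of what truncating or splitting might look like, assuming plain whitespace tokenization and punctuation-based sentence splitting; the actual preprocessing would need to match the dataset format that the repository expects.

import re

# Generic illustration only; max_tokens and the splitting rules are placeholders,
# not the project's actual preprocessing.
def truncate_text(text, max_tokens=100):
    # Keep only the first max_tokens whitespace-separated tokens.
    tokens = text.split()
    return " ".join(tokens[:max_tokens])

def split_into_sentences(text):
    # Very rough splitting on sentence-final punctuation; a library such as
    # nltk or spaCy would be more robust.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

long_review = "The food was great. The service was slow! Still, I would come back."
print(truncate_text(long_review, max_tokens=5))
print(split_into_sentences(long_review))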
Hi, I have some questions (sorry if this is a beginner question, I am new to this field). I want to change the word embedder to a BERT model pretrained on my language (Indonesian, using IndoBERT). Can you give some tips on how to change the embedder to my language? Thanks!