cammy-mun opened this issue 2 years ago
Hi,
I've never used the Trainer before. Based on the results, I would say something must be wrong with either the ground-truth summaries or the generated summaries; at least one of them is empty. I suggest you check the outputs and run a sanity check before fine-tuning.
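(For context, a frequent cause of all-zero ROUGE with Seq2SeqTrainer is decoding the label ids without first replacing the -100 padding used by the loss, which turns every decoded reference into an empty string. Below is a minimal compute_metrics sketch with that replacement and a quick print check; it is an illustration, not the poster's actual code.)
# Sketch: compute_metrics for Seq2SeqTrainer with predict_with_generate=True.
# The key step is replacing -100 in the labels with the pad token id before decoding.
import numpy as np
from datasets import load_metric
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/PRIMERA")
rouge = load_metric("rouge")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    if isinstance(preds, tuple):
        preds = preds[0]
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    # Sanity check: make sure neither side is empty before trusting the scores.
    for p, r in zip(decoded_preds[:2], decoded_labels[:2]):
        print("PRED:", repr(p), "| REF:", repr(r))
    result = rouge.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True)
    return {k: round(v.mid.fmeasure * 100, 2) for k, v in result.items()}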
Hi, I am trying to fine-tune PRIMERA from huggingface using Trainer, with a new dataset. However, I keep getting ROUGE scores of 0. May I know which part of the code is wrong?
import nltk
import numpy as np
import torch
from datasets import load_dataset  # needed for the load_dataset calls below
from huggingface_hub import notebook_login
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers import Seq2SeqTrainer, Seq2SeqTrainingArguments

# Load the PRIMERA tokenizer and model, with gradient checkpointing to save memory.
TOKENIZER = AutoTokenizer.from_pretrained("allenai/PRIMERA")
MODEL = AutoModelForSeq2SeqLM.from_pretrained("allenai/PRIMERA")
MODEL.gradient_checkpointing_enable()

PAD_TOKEN_ID = TOKENIZER.pad_token_id
DOCSEP_TOKEN_ID = TOKENIZER.convert_tokens_to_ids("<doc-sep>")

# Log in so the private dataset on the Hub can be downloaded.
notebook_login()
Here I load my own reformatted version of the multi_news dataset from huggingface. The format is a (src, tgt) pair, where src is the related documents and tgt is the summary. It is almost the same as the original multi_news dataset, except that I added a few more words at the front along with |||||.
train = load_dataset('cammy/multi_news_formatted_small', split='train[:100]', use_auth_token=True, cache_dir="D:")
valid = load_dataset('cammy/multi_news_formatted_small', split='valid[:10]', use_auth_token=True, cache_dir="D:")
test = load_dataset('cammy/multi_news_formatted_small', split='test[:10]', use_auth_token=True, cache_dir="D:")
Then I do the preprocessing of the data.
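(For illustration only, a preprocessing step along these lines; the src/tgt column names match the description above, but the length limits are assumptions and this is a sketch rather than the code actually used.)
# Sketch: join the documents in "src" (separated by "|||||") with the <doc-sep> token,
# then tokenize source and target. PRIMERA uses the same tokenizer for both.
MAX_SOURCE_LEN = 4096   # illustrative
MAX_TARGET_LEN = 1024   # illustrative

def preprocess(examples):
    sources = []
    for src in examples["src"]:
        docs = [d.strip() for d in src.split("|||||") if d.strip()]
        sources.append(" <doc-sep> ".join(docs))
    model_inputs = TOKENIZER(sources, max_length=MAX_SOURCE_LEN, truncation=True)
    labels = TOKENIZER(examples["tgt"], max_length=MAX_TARGET_LEN, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train = train.map(preprocess, batched=True, remove_columns=train.column_names)
valid = valid.map(preprocess, batched=True, remove_columns=valid.column_names)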
Then lastly:
trainer.train()
But these are the results:
Hi friend, I'm also trying to fine-tune the model with my own dataset. Has the Trainer problem been solved yet?
Can you please provide scripts for fine-tuning PRIMER on a new dataset? Details on that are scarce. By this I mean, could you add a bash script that would fine-tune on any dataset?
Follow-up question: why does your script always require a fine-tuned model as one of the arguments? From what I gather, the model path argument expects either a model fine-tuned on (multi_news, arxiv, etc.) or the default, longformer_summ_multinews. If we are fine-tuning, shouldn't the PRIMER pre-trained model suffice?
Hi, I am also attempting to explore the pretrained model and see if I can fine-tune it on another dataset. I ran into an error like this while trying to fine-tune PRIMERA on the sample WCEP dataset.
File "__/software/Miniconda/lib/python3.6/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "__/software/Miniconda/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "__/software/code-server/lib/vscode/extensions_omni/ms-python.python-2020.11.371526539/pythonFiles/lib/python/debugpy/__main__.py", line 45, in <module>
cli.main()
File "__/software/code-server/lib/vscode/extensions_omni/ms-python.python-2020.11.371526539/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 430, in main
run()
File "__/software/code-server/lib/vscode/extensions_omni/ms-python.python-2020.11.371526539/pythonFiles/lib/python/debugpy/../debugpy/server/cli.py", line 267, in run_file
runpy.run_path(options.target, run_name=compat.force_str("__main__"))
File "__/software/Miniconda/lib/python3.6/runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "__/software/Miniconda/lib/python3.6/runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "__/software/Miniconda/lib/python3.6/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "__/work/instance1/jupyter/PRIMER_train/script/primer_main.py", line 792, in <module>
train(args)
File "__/work/instance1/jupyter/PRIMER_train/script/primer_main.py", line 528, in train
trainer.fit(model, train_dataloader, valid_dataloader)
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 460, in fit
self._run(model)
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 758, in _run
self.dispatch()
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 799, in dispatch
self.accelerator.start_training(self)
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 96, in start_training
self.training_type_plugin.start_training(trainer)
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 144, in start_training
self._results = trainer.run_stage()
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 809, in run_stage
return self.run_train()
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 844, in run_train
self.run_sanity_check(self.lightning_module)
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 1112, in run_sanity_check
self.run_evaluation()
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/trainer/trainer.py", line 967, in run_evaluation
output = self.evaluation_loop.evaluation_step(batch, batch_idx, dataloader_idx)
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/trainer/evaluation_loop.py", line 174, in evaluation_step
output = self.trainer.accelerator.validation_step(args)
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/accelerators/accelerator.py", line 226, in validation_step
return self.training_type_plugin.validation_step(*args)
File "__/software/Miniconda/lib/python3.6/site-packages/pytorch_lightning/plugins/training_type/training_type_plugin.py", line 161, in validation_step
return self.lightning_module.validation_step(*args, **kwargs)
File "__/work/instance1/jupyter/PRIMER_train/script/primer_main.py", line 261, in validation_step
loss = self.shared_step(input_ids, output_ids)
File "__/work/instance1/jupyter/PRIMER_train/script/primer_main.py", line 145, in shared_step
lm_logits = self.forward(input_ids, output_ids)
File "__/work/instance1/jupyter/PRIMER_train/script/primer_main.py", line 114, in forward
use_cache=False,
File "__/software/Miniconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "__/software/Miniconda/lib/python3.6/site-packages/transformers/models/bart/modeling_bart.py", line 1295, in forward
return_dict=return_dict,
File "__/software/Miniconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "__/software/Miniconda/lib/python3.6/site-packages/transformers/models/bart/modeling_bart.py", line 1157, in forward
return_dict=return_dict,
File "__/software/Miniconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "__/software/Miniconda/lib/python3.6/site-packages/transformers/models/bart/modeling_bart.py", line 796, in forward
output_attentions=output_attentions,
File "__/software/Miniconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "__/software/Miniconda/lib/python3.6/site-packages/transformers/models/bart/modeling_bart.py", line 309, in forward
output_attentions=output_attentions,
File "__/software/Miniconda/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
TypeError: forward() got an unexpected keyword argument 'hidden_states'
I am able to run the notebook (Evaluation_Example.ipynb) given in the repo to produce a few extracted sentences for a few samples.
Library Versions used:
pytorch_lightning==1.3.8
torchmetrics==0.6.2
datasets==1.6.0
spacy==2.3.5
nltk==3.6.1
tqdm==4.49.0
rouge-score
torch==1.10.2
transformers==4.3.0
Did anyone else get this error? If so, how did you solve it? Or am I incorrect about the library versions here?
Thanks!
I may have figured out a way to solve my problem and train PRIMERA on a dataset.
The issue arises because the code uses the longformer library, which is slightly out of sync with the transformers implementation of the same model.
So instead of the import
from longformer import LongformerEncoderDecoderForConditionalGeneration, LongformerEncoderDecoderConfig
switch to
from transformers import LEDForConditionalGeneration, LEDConfig
and change the corresponding usages in the code.
This helped me train the model.
Hi Jainesh, would you be okay with sharing the repository for the changed code?
Hi Jainesh, I've also been working on the training process recently but still haven't found a working approach. Would you please release your modified code?
Hi all,
If you want to fine-tune PRIMERA on new datasets, I would suggest you use the huggingface version of PRIMERA and check out the file 'script/primera_hf_main.py' (as the original Longformer package is out of sync). To use PRIMERA-hf, you can install the latest version of huggingface transformers and import the model as:
from transformers import (
    AutoTokenizer,
    LEDConfig,
    LEDForConditionalGeneration,
)

tokenizer = AutoTokenizer.from_pretrained('allenai/PRIMERA')
config = LEDConfig.from_pretrained('allenai/PRIMERA')
model = LEDForConditionalGeneration.from_pretrained('allenai/PRIMERA')
I have not used Trainer in huggingface before, so I'm not sure what the problem would be. You may also consider using pytorch_lightning, which is used in 'script/primera_hf_main.py'. If you want to use Trainer, you can still refer to the training part of that script to see how the model is trained. As for evaluation, you can check the notebook Evaluation_Example.ipynb.
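(For a rough idea of what generation with the huggingface model looks like, here is a sketch along the lines of that notebook; the example documents and generation lengths below are placeholders, not prescribed values.)
import torch
from transformers import AutoTokenizer, LEDForConditionalGeneration

tokenizer = AutoTokenizer.from_pretrained("allenai/PRIMERA")
model = LEDForConditionalGeneration.from_pretrained("allenai/PRIMERA")

docs = ["First source document ...", "Second source document ..."]  # placeholder input
inputs = tokenizer(" <doc-sep> ".join(docs), max_length=4096, truncation=True, return_tensors="pt")

# Global attention on the first token and on every <doc-sep> token, local attention elsewhere.
docsep_id = tokenizer.convert_tokens_to_ids("<doc-sep>")
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1
global_attention_mask[inputs["input_ids"] == docsep_id] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=256,
    num_beams=5,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))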
Is the max input and output length based on the target dataset? I.e., if training on a new dataset, do we assign these values ourselves from summary statistics?
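(A quick way to get such statistics from a dataset, as a sketch; the multi_news "summary" column name is used here purely for illustration.)
# Sketch: look at the token-length distribution of the reference summaries
# to pick a reasonable max target length for a new dataset.
import numpy as np
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("allenai/PRIMERA")
ds = load_dataset("multi_news", split="train[:1000]")

lengths = [len(tokenizer(s)["input_ids"]) for s in ds["summary"]]
print("mean:", np.mean(lengths), "p95:", np.percentile(lengths, 95), "max:", max(lengths))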
Hi! I modified the official run_summarization.py script from HuggingFace and was able to fine-tune PRIMERA models with it. Figured I would share that script if it's useful to anyone else: https://gist.github.com/JohnGiorgi/8c7dcabd3ee8a362b9174c5d145029ab.
The main differences are:
- Each source document is truncated to max_length // num_docs.
- A global_attention_mask is added to the model_inputs, which is 1 for the bos_token and the special "<doc-sep>" token, but 0 elsewhere.
You use the script the same way as the original run_summarization.py script, except you provide "allenai/PRIMERA-*" as the model_name_or_path.
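(To make those two differences concrete, here is a sketch of the preprocessing they imply; this is not the gist itself, and the "document" column name and "|||||" separator are assumptions for multi_news-style data.)
# Sketch of the two changes described above.
DOCSEP = "<doc-sep>"

def preprocess_function(examples, tokenizer, max_source_length=4096):
    inputs = []
    for text in examples["document"]:
        docs = [d.strip() for d in text.split("|||||") if d.strip()]
        # 1) Give each document an equal share of the input budget.
        per_doc = max_source_length // max(len(docs), 1)
        truncated = []
        for d in docs:
            ids = tokenizer(d, max_length=per_doc, truncation=True, add_special_tokens=False)["input_ids"]
            truncated.append(tokenizer.decode(ids))
        inputs.append(f" {DOCSEP} ".join(truncated))

    model_inputs = tokenizer(inputs, max_length=max_source_length, truncation=True)

    # 2) Global attention on the bos token and every <doc-sep> token, 0 elsewhere.
    docsep_id = tokenizer.convert_tokens_to_ids(DOCSEP)
    model_inputs["global_attention_mask"] = [
        [1 if tok in (tokenizer.bos_token_id, docsep_id) else 0 for tok in ids]
        for ids in model_inputs["input_ids"]
    ]
    return model_inputs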
Hi @JohnGiorgi, did you encounter a problem where all the predictions become "" after a few hundred steps of fine-tuning? I met this problem with both allenai/led-large-16384 and allenai/PRIMERA. (My issue is exactly the same as this one: https://github.com/huggingface/transformers/issues/18190)
I did have the issue quite a while ago, and it has since disappeared for me. There was a bug a while back where Seq2SeqTrainer was not taking the global_attention_mask into account, which may have been the problem. Might be worth updating transformers to the latest version (if you haven't already) and trying again.
Hi @JohnGiorgi, thanks for your reply! However, I am still having this problem when running your provided script (https://gist.github.com/JohnGiorgi/8c7dcabd3ee8a362b9174c5d145029ab) with the newest version of transformers==4.21.0.dev0. I used the following command to run (on an 8x32GB V100 EC2 instance):
python run_summarization.py \
--model_name_or_path allenai/PRIMERA \
--do_train \
--do_eval \
--dataset_name multi_news \
--dataset_config "3.0.0" \
--source_prefix "summarize: " \
--output_dir ./outputs \
--per_device_train_batch_size=4 \
--per_device_eval_batch_size=4 \
--overwrite_output_dir \
--predict_with_generate
The evaluation results are:
***** eval metrics *****
epoch = 3.0
eval_gen_len = 128.0
eval_loss = 2.0331
eval_rouge1 = 0.0
eval_rouge2 = 0.0
eval_rougeL = 0.0
eval_rougeLsum = 0.0
eval_runtime = 0:11:05.88
eval_samples = 5621
eval_samples_per_second = 8.441
eval_steps_per_second = 0.264
Not sure what causes this problem, but there must still be something wrong with the generation method in the huggingface implementation. Anyway, thanks very much for your script; it is really helpful.