Closed: lcl6679292 closed this issue 4 years ago.
If you want to look at the source code used for training the model, you can check the source GitHub repository; in particular, see the `src/train.py`, `src/train_abstractive.py`, or `src/train_extractive.py` Python scripts.
@TheEdoardo93 Thank you for your reply. I know; do you plan to integrate the source training code into `transformers`? It would be more convenient to use the `transformers` code for training.
At the moment, I think that it is not on the roadmap. Do you have a particular reason for asking to integrate the training algorithm into this library?
@TheEdoardo93 I think this is a good encoder-decoder framework based on BERT. In addition to the summarization task, it can also handle many other generation tasks. If the training code were integrated into this library, it could be used to fine-tune more downstream generation tasks. I think this library currently lacks downstream fine-tuning for NLG tasks, such as query generation, generative reading comprehension, and other summarization tasks.
Thanks for the help. How do I load a checkpoint `model_step_20000.pt` that was trained with `src/train.py`, instead of using `model = BertAbs.from_pretrained("bertabs-finetuned-cnndm")`?
Hello! As far as I know, you can't load a raw PyTorch checkpoint directly with `BertAbs.from_pretrained`; you'll indeed get an error. A PyTorch checkpoint typically contains the model's state dict, so you can try the following code for your task:
```python
import torch
from transformers import BertTokenizer
from modeling_bertabs import BertAbs  # local module defining BertAbs

# Load the tokenizer and the pre-trained abstractive summarization model
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased", do_lower_case=True)
model = BertAbs.from_pretrained("bertabs-finetuned-cnndm")

# Overwrite the pre-trained weights with your own checkpoint's state dict
model.load_state_dict(torch.load(PATH_TO_PT_CHECKPOINT))
```
where `PATH_TO_PT_CHECKPOINT` could be e.g. `./input_checkpoints/model_step_20000.pt`.
N.B.: this code will only work if the architecture of the `bertabs-finetuned-cnndm` model is identical to the one whose checkpoint you're loading; otherwise an error occurs!
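If you're not sure whether the two architectures match, a quick sanity check before calling `load_state_dict` can help. This is a minimal sketch, assuming the checkpoint file holds a plain state dict:

```python
import torch

# Compare the checkpoint's tensors with the ones the model expects, so a
# mismatch produces a readable report instead of a load_state_dict error.
checkpoint_state = torch.load(PATH_TO_PT_CHECKPOINT, map_location="cpu")
model_state = model.state_dict()

missing = [k for k in model_state if k not in checkpoint_state]
unexpected = [k for k in checkpoint_state if k not in model_state]
mismatched = [k for k in checkpoint_state
              if k in model_state and checkpoint_state[k].shape != model_state[k].shape]

print("missing keys:", missing)
print("unexpected keys:", unexpected)
print("shape mismatches:", mismatched)
```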
If this code doesn't work as expected, we can work together in order to solve your problem :)
It's important!! Please add it.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@TheEdoardo93 Is there any way to load a pretrained model with a different architecture? I used the source library to train a model with a source embedding size of 1024 instead of the 512 used in the pretrained one, because 512 was too small for my data.
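Not an official answer, but one common workaround in plain PyTorch is to copy over only the tensors whose shapes match and skip the rest. A minimal sketch, assuming the training script saved the weights either as a plain state dict or nested under a `"model"` key (check how your checkpoint was actually saved):

```python
import torch
from modeling_bertabs import BertAbs

# Workaround sketch: start from the released model and copy over only the
# checkpoint tensors whose shapes match; anything with a different size
# (e.g. 1024-dim source embeddings) is skipped and reported.
model = BertAbs.from_pretrained("bertabs-finetuned-cnndm")
checkpoint = torch.load("./input_checkpoints/model_step_20000.pt", map_location="cpu")

# Some training scripts nest the weights under a "model" key (assumption:
# adjust this to your checkpoint's structure).
state_dict = checkpoint.get("model", checkpoint) if isinstance(checkpoint, dict) else checkpoint

model_state = model.state_dict()
compatible = {k: v for k, v in state_dict.items()
              if k in model_state and v.shape == model_state[k].shape}
skipped = sorted(set(state_dict) - set(compatible))

model_state.update(compatible)
model.load_state_dict(model_state)
print(f"loaded {len(compatible)} tensors, skipped {len(skipped)}: {skipped}")
```

Note that this does not actually give you 1024-dimensional embeddings: for that, the model would have to be constructed with a matching configuration before loading, which goes beyond what `BertAbs.from_pretrained("bertabs-finetuned-cnndm")` provides.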
❓ Questions & Help
Thank you very much for your wonderful work. I found that some new code for summarization from the "pretrained encoder" paper has been added. However, I see only the evaluation part of the code. Will you add the code for the training part? Thank you very much!