huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

BART for Pre-Training #6743

Closed · swashiro closed this issue 2 years ago

swashiro commented 4 years ago

❓ Questions & Help

How can I run BART pre-training? I have data for pre-training (masked LM).

patrickvonplaten commented 4 years ago

This should help: https://github.com/huggingface/transformers/issues/5096#issuecomment-645860271

patrickvonplaten commented 4 years ago

@sshleifer - think this is the 3rd issue about Bart pre-training -> maybe it would be a good idea to release a small notebook at some point.

sshleifer commented 4 years ago

@patil-suraj you took a stab at this at some point? this may have been optimistic :(

patil-suraj commented 4 years ago

Yes, I was trying to port the fairseq dataset here, same for T5. I'll try to focus more on it when I'm done with my current PRs. We should start with a notebook, as Patrick said, then try to include it in examples/.

swashiro commented 4 years ago

@patrickvonplaten Does that mean I can train with a masked input, the original input (as labels), and the decoder input?

patrickvonplaten commented 4 years ago

yes, this should be possible
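
(A minimal sketch of what that looks like, not an official pre-training script; the model name and the single `<mask>`-infilled sentence are only illustrative. The corrupted text goes in as `input_ids`, the original text as `labels`, and `BartForConditionalGeneration` derives the decoder inputs by shifting the labels.)

```python
import torch
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

original = "My friends are cool but they eat too many carbs."
corrupted = "My friends are <mask> but they eat too many carbs."  # text-infilling noise

inputs = tokenizer(corrupted, return_tensors="pt")
labels = tokenizer(original, return_tensors="pt").input_ids

# The loss is the cross-entropy of reconstructing the original text from the corrupted input.
outputs = model(input_ids=inputs.input_ids,
                attention_mask=inputs.attention_mask,
                labels=labels)
outputs.loss.backward()
```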

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

dhruvramani commented 3 years ago

@patil-suraj any news on the pretraining script for Bart?

prajdabre commented 3 years ago

If anyone wants to train their MBART model then feel free to use this. https://github.com/prajdabre/yanmtt

Contributions are welcome!

thomas-li-sjtu commented 2 years ago

@patil-suraj excuse me, is there any news on the pretraining script for Bart? Thanks.

prajdabre commented 2 years ago

@thomas-li-sjtu you can try my toolkit if you like. It's based on transformers and allows for Bart/mbart pretraining. https://github.com/prajdabre/yanmtt

thomas-li-sjtu commented 2 years ago

> @thomas-li-sjtu you can try my toolkit if you like. It's based on transformers and allows for Bart/mbart pretraining. https://github.com/prajdabre/yanmtt

Hi there, here is my problem. I hope to pretrain a BART model on my own dataset and fine-tune it for another task (not NMT). I noticed that your toolkit is designed for NMT, so maybe it is not the one I need. Anyway, thanks for your reply!

prajdabre commented 2 years ago

@thomas-li-sjtu ok I understand. It's not just designed for NMT (despite its name). I've used it for summarisation and general NLG without problems. Good luck with your search.

thomas-li-sjtu commented 2 years ago

> @thomas-li-sjtu ok I understand. It's not just designed for NMT (despite its name). I've used it for summarisation and general NLG without problems. Good luck with your search.

Wow that is awesome. I will try it for my task!

prajdabre commented 2 years ago

@thomas-li-sjtu cool. Feel free to raise issues as it helps me add new functionality that may be of use to people. If you want to know how to use it for summarisation (or generic nlg) then look here: https://github.com/AI4Bharat/indic-bart

patil-suraj commented 2 years ago

Sorry to only come back to this issue now. If anyone is interested in adding this example script in Transformers, I would be more than happy to help :)

For BART pre-training we need the text-infilling + sentence-permutation data collator which you could find here https://github.com/morganmcg1/rotobart/blob/main/data_collator.py#L223

With this collator you could then modify and use run_summarization.py script here https://github.com/huggingface/transformers/tree/master/examples/pytorch/summarization.

Let me know if anyone is interested. :) cc @patrickvonplaten
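
(As a rough, untested sketch of that wiring, not part of the official examples: `DataCollatorForDenoisingTasks` is assumed to be copied locally from the rotobart file linked above into a `data_collator.py` module, and any constructor arguments beyond the tokenizer are assumptions. The collator simply replaces the default seq2seq collator in a `Trainer` run.)

```python
from datasets import load_dataset
from transformers import (BartForConditionalGeneration, BartTokenizerFast,
                          Trainer, TrainingArguments)
from data_collator import DataCollatorForDenoisingTasks  # hypothetical local copy of the rotobart collator

tokenizer = BartTokenizerFast.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

# Any plain-text corpus works; the "text" column name and truncation length are assumptions.
raw = load_dataset("text", data_files={"train": "train.txt"})
tokenized = raw.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                    batched=True, remove_columns=["text"])

# The collator applies text infilling + sentence permutation on the fly.
data_collator = DataCollatorForDenoisingTasks(tokenizer=tokenizer)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bart-pretraining", per_device_train_batch_size=8),
    train_dataset=tokenized["train"],
    data_collator=data_collator,
)
trainer.train()
```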

Eurus-W commented 2 years ago

> Sorry to only come back to this issue now. If anyone is interested in adding this example script in Transformers, I would be more than happy to help :)
>
> For BART pre-training we need the text-infilling + sentence-permutation data collator which you could find here https://github.com/morganmcg1/rotobart/blob/main/data_collator.py#L223
>
> With this collator you could then modify and use run_summarization.py script here https://github.com/huggingface/transformers/tree/master/examples/pytorch/summarization.
>
> Let me know if anyone is interested. :) cc @patrickvonplaten

I think the BART pre-training script would be very useful for my work and for many others. It is generous of you to add this example script to Transformers!

Eurus-W commented 2 years ago

> Sorry to only come back to this issue now. If anyone is interested in adding this example script in Transformers, I would be more than happy to help :)
>
> For BART pre-training we need the text-infilling + sentence-permutation data collator which you could find here https://github.com/morganmcg1/rotobart/blob/main/data_collator.py#L223
>
> With this collator you could then modify and use run_summarization.py script here https://github.com/huggingface/transformers/tree/master/examples/pytorch/summarization.
>
> Let me know if anyone is interested. :) cc @patrickvonplaten

Thanks for your reply, and I think your method is absolutely feasible. But when I tried it, I ran into some errors that I can't fix. Could you please give me some help? Here are my changes to run_summarization.py (tag 4.11.0):

  1. Import the necessary packages used in https://github.com/morganmcg1/rotobart/blob/main/data_collator.py#L223
  2. Add the full code of DataCollatorForDenoisingTasks and make it inherit from DataCollatorForSeq2Seq, i.e. class DataCollatorForDenoisingTasks(DataCollatorForSeq2Seq):
  3. Use the new collator: data_collator = DataCollatorForSeq2Seq(...) -> data_collator = DataCollatorForDenoisingTasks(...)

Running the changed script, I get the errors below.

Traceback (most recent call last):
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/IPython/core/interactiveshell.py", line 3457, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "", line 1, in
    runfile('/data/whq/tmp/SBartTry/fineBartPretrain.py', args=['--model_name_or_path', 'facebook/bart-base', '--do_train', '--do_eval', '--train_file', '/data/whq/tmp/SBartTry/tryData/clickbait_train.csv', '--validation_file', '/data/whq/tmp/SBartTry/tryData/clickbait_valid.csv', '--source_prefix', '', '--num_train_epochs=3', '--output_dir', '/data/whq/tmp/SBartTry/fineBartPretrain/clickbait', '--overwrite_output_dir', '--per_device_train_batch_size=16', '--per_device_eval_batch_size=16', '--predict_with_generate'], wdir='/data/whq/tmp/SBartTry')
  File "/home/whq/.pycharm_helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home/whq/.pycharm_helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/data/whq/tmp/SBartTry/fineBartPretrain.py", line 823, in
    main()
  File "/data/whq/tmp/SBartTry/fineBartPretrain.py", line 745, in main
    train_result = trainer.train(resume_from_checkpoint=checkpoint)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/transformers/trainer.py", line 1325, in train
    tr_loss_step = self.training_step(model, inputs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/transformers/trainer.py", line 1884, in training_step
    loss = self.compute_loss(model, inputs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/transformers/trainer.py", line 1916, in compute_loss
    outputs = model(**inputs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 168, in forward
    outputs = self.parallel_apply(replicas, inputs, kwargs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 178, in parallel_apply
    return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
    output.reraise()
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/_utils.py", line 434, in reraise
    raise exception
TypeError: Caught TypeError in replica 0 on device 0.

Original Traceback (most recent call last):
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
    output = module(*input, **kwargs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/transformers/models/bart/modeling_bart.py", line 1336, in forward
    return_dict=return_dict,
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/transformers/models/bart/modeling_bart.py", line 1200, in forward
    return_dict=return_dict,
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/home/whq/anaconda3/envs/pytorchenv/lib/python3.7/site-packages/transformers/models/bart/modeling_bart.py", line 769, in forward
    input_shape = input_ids.size()
TypeError: 'int' object is not callable

Waiting for your generous reply! @patil-suraj

OllieBroadhurst commented 2 years ago

@Eurus-W make sure you convert the numpy arrays in the batch returned by data_collator() into tensors. batch["input_ids"] = torch.LongTensor(batch["input_ids"]), for example.
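
(One hedged way to fold that into step 3 above, untested: the rotobart collator returns numpy arrays, whose .size is an int attribute rather than a method, which is what triggers the 'int' object is not callable error in modeling_bart.py. The sketch below assumes every array in the batch is integer-valued and that the collator file was copied locally as data_collator.py.)

```python
import torch
from data_collator import DataCollatorForDenoisingTasks  # hypothetical local copy of the rotobart collator


class TensorDenoisingCollator(DataCollatorForDenoisingTasks):
    """Wraps the numpy batch from the denoising collator in torch tensors."""

    def __call__(self, examples):
        batch = super().__call__(examples)  # dict of numpy int arrays
        return {k: torch.LongTensor(v) for k, v in batch.items()}
```

Passing data_collator=TensorDenoisingCollator(...) to the Trainer instead of the plain collator should then hand the model tensors rather than raw numpy arrays.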