huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

Improve EncoderDecoderModel docs #16135

Open · patrickvonplaten opened this issue 2 years ago

patrickvonplaten commented 2 years ago

Good first issue

There have been quite a few issues/questions about how to use the Encoder-Decoder model, e.g.: https://github.com/huggingface/transformers/issues/4483 and https://github.com/huggingface/transformers/issues/15479. The main reason for this is that the model docs are quite outdated, and we could use a nice how-to guide.

So I think we have two action items here:

  1. Improve https://huggingface.co/docs/transformers/v4.17.0/en/model_doc/encoder-decoder#encoder-decoder-models a.k.a.: https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/encoder-decoder.mdx

We should mention here:

a) How to create a model: show how to use from_encoder_decoder_pretrained(...) and then how to save the model.
b) How to fine-tune this model: mention that it can then be fine-tuned just like any other encoder-decoder model (BART, T5, ...).
c) Put a big warning that the config values have to be correctly set, and explain how to set them, e.g. read: https://github.com/huggingface/transformers/issues/15479

This should be an EncoderDecoderModel specific text and be very concise and short.
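For illustration, a minimal sketch of what a) and c) could cover (the checkpoint names and the local directory "bert2bert" are only placeholders):

```python
from transformers import BertTokenizer, EncoderDecoderModel

# a) Create a BERT2BERT model from two pretrained checkpoints.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# c) The generation-relevant config values are not set automatically
# (see https://github.com/huggingface/transformers/issues/15479),
# so they have to be set by hand before training/generation.
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# Save and reload the combined model like any other transformers model.
model.save_pretrained("bert2bert")
model = EncoderDecoderModel.from_pretrained("bert2bert")
```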

In a second step, we should then write a how-to guide that goes into much more detail.
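And as a rough idea of what b) could show, here is a minimal fine-tuning sketch (the example sentences are made up; the combined model is trained exactly like any other seq2seq model):

```python
from transformers import BertTokenizer, EncoderDecoderModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)
model.config.decoder_start_token_id = tokenizer.cls_token_id
model.config.pad_token_id = tokenizer.pad_token_id

# A toy (input, target) pair just to show the call signature.
inputs = tokenizer(
    "The tower is 324 metres tall, about the same height as an 81-storey building.",
    return_tensors="pt",
)
labels = tokenizer("The tower is 324 metres tall.", return_tensors="pt").input_ids

# Passing labels makes the model compute the cross-entropy loss,
# exactly as for BART, T5 or any other encoder-decoder model.
outputs = model(
    input_ids=inputs.input_ids,
    attention_mask=inputs.attention_mask,
    labels=labels,
)
outputs.loss.backward()
```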

More than happy to help someone tackle this good first issue!

silvererudite commented 2 years ago

Hi...I would love to contribute to this.

patrickvonplaten commented 2 years ago

Awesome! Would you like to open a PR and give it a try? :-) I think it would be great if we could put some example code on how to create an EncoderDecoderModel on this model doc: https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/encoder-decoder.mdx which will then be displayed here: https://huggingface.co/docs/transformers/v4.17.0/en/model_doc/encoder-decoder#encoder-decoder-models :-)

Let me know if you have any questions! Happy to help :-)

silvererudite commented 2 years ago

Yes, definitely! I'll open a PR shortly and ask for help when I'm stuck. Thanks a lot.

Threepointone4 commented 2 years ago

Hi @patrickvonplaten , I would love to contribute to this.

Threepointone4 commented 2 years ago

@patrickvonplaten , I have created the fork and added some docs.

> So I think we have two action items here:
>
>   1. Improve https://huggingface.co/docs/transformers/v4.17.0/en/model_doc/encoder-decoder#encoder-decoder-models a.k.a.: https://github.com/huggingface/transformers/blob/master/docs/source/model_doc/encoder-decoder.mdx
>
> We should mention here: a) How to create a model: show how to use from_encoder_decoder_pretrained(...) and then how to save the model. b) How to fine-tune this model: mention that it can then be fine-tuned just like any other encoder-decoder model (BART, T5, ...).

I have added some documentation; let me know what you think about it.

> c) Put a big warning that the config values have to be correctly set and how to set them, e.g. read: #15479

I didn't get a chance to go through this yet; I will try to cover it this week.

> In a second step, we should then write a how-to guide that goes into much more detail.

I have added a Colab notebook with a detailed explanation of the encoder-decoder model and how to train it. Does that help for this?

patrickvonplaten commented 2 years ago

Hey @Threepointone4, that's great!

Could you maybe open a PR for:

> We should mention here:
> a) How to create a model: show how to use from_encoder_decoder_pretrained(...) and then how to save the model.
> b) How to fine-tune this model: mention that it can then be fine-tuned just like any other encoder-decoder model (BART, T5, ...).

? :-)

Threepointone4 commented 2 years ago

@patrickvonplaten I have created the PR and made the changes based on my understanding. Please let me know if any further changes are required.

Winterflower commented 2 years ago

Hello all, I'm very much a beginner in this space, so please excuse the potentially naive question. I have been experimenting with rolling out my own encoder-decoder combinations for use with the VisionEncoderDecoderModel class, as described in the docs here:

> The VisionEncoderDecoderModel can be used to initialize an image-to-text model with any pretrained Transformer-based vision model as the encoder ...

but I keep getting this message:

The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results. Setting pad_token_id to eos_token_id:2 for open-end generation.

Based on reading the docs, I am not entirely sure whether I need to specifically fine-tune an encoder-decoder combination on the image-to-text downstream task (and the message above is due to that), or whether I can just use pre-trained configurations without fine-tuning. Perhaps I could open a PR with some docs suggestions?
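(Not a definitive answer, but for context: that message usually comes from generate() when no pad_token_id is configured on the model, and it generally goes away once the decoder-related config values are set explicitly. A minimal sketch, with the ViT and GPT-2 checkpoints used purely as placeholders:)

```python
from transformers import VisionEncoderDecoderModel, GPT2TokenizerFast

# Purely illustrative: combine a ViT encoder with a GPT-2 decoder.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k", "gpt2"
)
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

# GPT-2 has no pad token, so generate() warns unless these are set explicitly.
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.eos_token_id

# generated_ids = model.generate(pixel_values)  # pixel_values from an image processor
```

Note also that the cross-attention weights of a freshly combined encoder/decoder are randomly initialized, so such a model generally needs fine-tuning on an image-to-text dataset before it produces meaningful output.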

patrickvonplaten commented 2 years ago

Hey @Winterflower,

Could you please try to use the forum instead for such questions: https://discuss.huggingface.co/ ? :-) Thank you!

anishlukk123 commented 1 year ago

Is this issue still open? I would like to take it up and solve this problem.

ghost commented 1 year ago

I would like to contribute

SHUBHAPRIYA95 commented 1 year ago

Hi, if this issue is still open I would love to contribute.

rajveer43 commented 1 year ago

Hi @patrickvonplaten, is this still open? I want to work on it!

patrickvonplaten commented 1 year ago

Sure, maybe you can browse https://huggingface.co/docs/transformers/v4.31.0/en/model_doc/encoder-decoder#overview and check if there is anything we can improve.

rajveer43 commented 1 year ago

Thanks, will check it out.

riiyaa24 commented 1 year ago

Hello, I would like to contribute to this issue. Can you assign this PR to me?

mhdirnjbr commented 1 year ago

Hello @patrickvonplaten !

This is my very first time deciding to contribute to open source projects inspired by my participation in the Hugging Face event in Paris and the insightful conversations I had with the project maintainers.

As a final-year graduate student in Math and AI, I am eager to explore opportunities to collaborate on this issue. I would greatly appreciate it if you could provide more information on how I can get involved.

Thank you in advance.

lappemic commented 6 months ago

It seems like this issue was addressed by PR #17815 and could be closed?

Ryukijano commented 1 month ago

I would love to contribute to this!