[Community Event] Doc Tests Sprint

patrickvonplaten commented 2 years ago

This issue is part of our Doc Test Sprint. If you're interested in helping out come join us on Discord and talk with other contributors!

Docstring examples are often the first point of contact when trying out a new library! So far we haven't done a very good job at ensuring that all docstring examples work correctly in 🤗 Transformers - but we're now very dedicated to ensure that all documentation examples work correctly by testing each documentation example via Python's doctest (https://docs.python.org/3/library/doctest.html) on a daily basis.

In short we should do the following for all models for both PyTorch and Tensorflow:

- Check the current doc examples will run without failure
- Check whether the current doc example of the forward method is a sensible example to better understand the model or whether it can be improved. E.g. is the example of https://huggingface.co/docs/transformers/v4.17.0/en/model_doc/bert#transformers.BertForQuestionAnswering.forward a good example of the model? Could it be improved?
- Add an expected output to the doc example and test it via Python's doc test (see Guide to contributing below)

Adding a documentation test for a model is a great way to better understand how the model works, a simple (possibly first) contribution to Transformers and most importantly a very important contribution to the Transformers community 🔥

If you're interested in adding a documentation test, please read through the Guide to contributing below.

This issue is a call for contributors, to make sure docstring exmaples of existing model architectures work correctly. If you wish to contribute, reply in this thread which architectures you'd like to take :)

Guide to contributing:

Ensure you've read our contributing guidelines 📜
Claim your architecture(s) in this thread (confirm no one is working on it) 🎯
Implement the changes as in https://github.com/huggingface/transformers/pull/15987 (see the diff on the model architectures for a few examples) 💪
- The file you want to look at is in src/transformers/models/[model_name]/modeling_[model_name].py, src/transformers/models/[model_name]/modeling_tf_[model_name].py or src/transformers/doc_utils.py or src/transformes/file_utils.py
- Make sure to run the doc example doc test locally as described in https://github.com/huggingface/transformers/tree/master/docs#for-python-files
- Optionally, change the example docstring to a more sensible example that gives a better suited result
- Make the test pass
- Add the file name to https://github.com/huggingface/transformers/blob/master/utils/documentation_tests.txt (making sure the file stays in alphabetical order)
- Run the doc example test again locally
In addition, there are a few things we can also improve, for example :
- Fix some style issues: for example, change ``decoder_input_ids``` to `decoder_input_ids`.
- Using a small model checkpoint instead of a large one: for example, change "facebook/bart-large" to "facebook/bart-base" (and adjust the expected outputs if any)
Open the PR and tag me @patrickvonplaten @ydshieh or @patil-suraj (don't forget to run make fixup before your final commit) 🎊
- Note that some code is copied across our codebase. If you see a line like # Copied from transformers.models.bert..., this means that the code is copied from that source, and our scripts will automatically keep that in sync. If you see that, you should not edit the copied method! Instead, edit the original method it's copied from, and run make fixup to synchronize that across all the copies. Be sure you installed the development dependencies with pip install -e ".[dev]", as described in the contributor guidelines above, to ensure that the code quality tools in make fixup can run.

PyTorch Model Examples added to tests:

[ ] ALBERT (@vumichien)
[x] BART (@abdouaziz)
[x] BEiT
[ ] BERT (@vumichien)
[ ] Bert
[ ] BigBird (@vumichien)
[x] BigBirdPegasus
[x] Blenderbot
[x] BlenderbotSmall
[ ] CamemBERT (@abdouaziz)
[ ] Canine (@NielsRogge)
[ ] CLIP (@Aanisha)
[ ] ConvBERT (@simonzli)
[x] ConvNext
[ ] CTRL (@jeremyadamsfisher)
[x] Data2VecAudio
[ ] Data2VecText
[ ] DeBERTa (@Tegzes)
[ ] DeBERTa-v2 (@Tegzes)
[x] DeiT
[ ] DETR
[ ] DistilBERT (@jmwoloso)
[ ] DPR
[ ] ELECTRA (@bhadreshpsavani)
[ ] Encoder
[ ] FairSeq
[ ] FlauBERT (@abdouaziz)
[ ] FNet
[ ] Funnel
[ ] GPT2 (@ArEnSc)
[ ] GPT-J (@ArEnSc)
[x] Hubert
[ ] I-BERT (@abdouaziz)
[ ] ImageGPT
[ ] LayoutLM (chiefchiefling @ discord)
[ ] LayoutLMv2
[ ] LED
[x] Longformer (@KMFODA)
[ ] LUKE (@Tegzes)
[ ] LXMERT
[ ] M2M100
[x] Marian
[x] MaskFormer (@reichenbch)
[x] mBART
[ ] MegatronBert
[ ] MobileBERT (@vumichien)
[ ] MPNet
[ ] mT5
[ ] Nystromformer
[ ] OpenAI
[ ] OpenAI
[x] Pegasus
[ ] Perceiver
[x] PLBart
[x] PoolFormer
[ ] ProphetNet
[ ] QDQBert
[ ] RAG
[ ] Realm
[ ] Reformer
[x] ResNet
[ ] RemBERT
[ ] RetriBERT
[ ] RoBERTa (@patrickvonplaten )
[ ] RoFormer
[x] SegFormer
[x] SEW
[x] SEW-D
[x] SpeechEncoderDecoder
[x] Speech2Text
[x] Speech2Text2
[ ] Splinter
[ ] SqueezeBERT
[x] Swin
[ ] T5 (@MarkusSagen)
[ ] TAPAS (@NielsRogge)
[ ] Transformer-XL (@simonzli)
[ ] TrOCR (@arnaudstiegler)
[x] UniSpeech
[x] UniSpeechSat
[x] Van
[x] ViLT
[x] VisionEncoderDecoder
[ ] VisionTextDualEncoder
[ ] VisualBert
[x] ViT
[x] ViTMAE
[x] Wav2Vec2
[x] WavLM
[ ] XGLM
[ ] XLM
[ ] XLM-RoBERTa (@AbinayaM02)
[ ] XLM-RoBERTa-XL
[ ] XLMProphetNet
[ ] XLNet
[ ] YOSO

Tensorflow Model Examples added to tests:

[ ] ALBERT (@vumichien)
[ ] BART
[ ] BEiT
[ ] BERT (@vumichien)
[ ] Bert
[ ] BigBird (@vumichien)
[ ] BigBirdPegasus
[ ] Blenderbot
[ ] BlenderbotSmall
[ ] CamemBERT
[ ] Canine
[ ] CLIP (@Aanisha)
[ ] ConvBERT (@simonzli)
[ ] ConvNext
[ ] CTRL
[ ] Data2VecAudio
[ ] Data2VecText
[ ] DeBERTa
[ ] DeBERTa-v2
[ ] DeiT
[ ] DETR
[ ] DistilBERT (@jmwoloso)
[ ] DPR
[ ] ELECTRA (@bhadreshpsavani)
[ ] Encoder
[ ] FairSeq
[ ] FlauBERT
[ ] FNet
[ ] Funnel
[ ] GPT2 (@cakiki)
[ ] GPT-J (@cakiki)
[ ] Hubert
[ ] I-BERT
[ ] ImageGPT
[ ] LayoutLM
[ ] LayoutLMv2
[ ] LED
[x] Longformer (@KMFODA)
[ ] LUKE
[ ] LXMERT
[ ] M2M100
[ ] Marian
[x] MaskFormer (@reichenbch)
[ ] mBART
[ ] MegatronBert
[ ] MobileBERT (@vumichien)
[ ] MPNet
[ ] mT5
[ ] Nystromformer
[ ] OpenAI
[ ] OpenAI
[ ] Pegasus
[ ] Perceiver
[ ] PLBart
[ ] PoolFormer
[ ] ProphetNet
[ ] QDQBert
[ ] RAG
[ ] Realm
[ ] Reformer
[ ] ResNet
[ ] RemBERT
[ ] RetriBERT
[ ] RoBERTa (@patrickvonplaten)
[ ] RoFormer
[ ] SegFormer
[ ] SEW
[ ] SEW-D
[ ] SpeechEncoderDecoder
[ ] Speech2Text
[ ] Speech2Text2
[ ] Splinter
[ ] SqueezeBERT
[ ] Swin (@johko)
[ ] T5 (@MarkusSagen)
[ ] TAPAS
[ ] Transformer-XL (@simonzli)
[ ] TrOCR (@arnaudstiegler)
[ ] UniSpeech
[ ] UniSpeechSat
[ ] Van
[ ] ViLT
[ ] VisionEncoderDecoder
[ ] VisionTextDualEncoder
[ ] VisualBert
[ ] ViT (@johko)
[ ] ViTMAE
[ ] Wav2Vec2
[ ] WavLM
[ ] XGLM
[ ] XLM
[ ] XLM-RoBERTa (@AbinayaM02)
[ ] XLM-RoBERTa-XL
[ ] XLMProphetNet
[ ] XLNet
[ ] YOSO

reichenbch commented 2 years ago

@patrickvonplaten I would like to start with Maskformer for Tensorflow/Pytorch. Catch up with how the event goes.

patrickvonplaten commented 2 years ago

Awesome! Let me know if you have any questions :-)

KMFODA commented 2 years ago

Hello! I'd like to take on Longformer for Tensorflow/Pytorch please.

MarkusSagen commented 2 years ago

@patrickvonplaten I would like to start with T5 for pytorch and tensorflow

patrickvonplaten commented 2 years ago

Sounds great!

patrickvonplaten commented 2 years ago

LayoutLM is also taken as mentioned by a contributor on Discord!

cakiki commented 2 years ago

@patrickvonplaten I would take GPT and GPT-J (TensorFlow editions) if those are still available.

I'm guessing GPT is GPT2?

vumichien commented 2 years ago

I will take Bert, Albert, and Bigbird for both Tensorflow/Pytorch

johko commented 2 years ago

I'll take Swin and ViT for Tensorflow

jmwoloso commented 2 years ago

I'd like DistilBERT for both TF and PT please

ydshieh commented 2 years ago

@patrickvonplaten I would take GPT and GPT-J (TensorFlow editions) if those are still available.

I'm guessing GPT is GPT2?

@cakiki You can go for GPT2 (I updated the name in the test)

ArEnSc commented 2 years ago

Can I try GPT2 and GPTJ for Pytorch? if @ydshieh you are not doing so?

Aanisha commented 2 years ago

I would like to try CLIP for Tensorflow and PyTorch.

NielsRogge commented 2 years ago

I'll take CANINE and TAPAS.

ydshieh commented 2 years ago

Can I try GPT2 and GPTJ for Pytorch? if @ydshieh you are not doing so?

@ArEnSc No, you can work on these 2 models :-) Thank you!

vumichien commented 2 years ago

@ydshieh Since the MobileBertForSequenceClassification is the copy of BertForSequenceClassification, so I think I will do check doc-test of MobileBert as well to overcome the error from make fixup

abdouaziz commented 2 years ago

I'll take FlauBERT and CamemBERT.

ydshieh commented 2 years ago

@abdouaziz Awesome! Do you plan to work on both PyTorch and TensorFlow versions, or only one of them?

Tegzes commented 2 years ago

I would like to work on LUKE model for both TF and PT

NielsRogge commented 2 years ago

@Tegzes you're lucky because there's no LUKE in TF ;) the list above actually just duplicates all models, but many models aren't available yet in TF.

Tegzes commented 2 years ago

In this case, I will also take DeBERTa and DeBERTa-v2 for PyTorch

abdouaziz commented 2 years ago

@ydshieh

I plan to work only with PyTorch

patrickvonplaten commented 2 years ago

@Tegzes you're lucky because there's no LUKE in TF ;) the list above actually just duplicates all models, but many models aren't available yet in TF.

True - sorry I've been lazy at creating this list!

arnaudstiegler commented 2 years ago

Happy to work on TrOCR (pytorch and TF)

patrickvonplaten commented 2 years ago

I take RoBERTa in PT and TF

AbinayaM02 commented 2 years ago

I would like to pick up XLM-RoBERTa in PT and TF.

bhadreshpsavani commented 2 years ago

I can work on ELECTRA for PT and TF

patrickvonplaten commented 2 years ago

Hey guys,

We've just merged the first template for Roberta-like model doc tests: https://github.com/huggingface/transformers/pull/16363 :-) Lots of models like ELETRA, XLM-RoBERTa, DeBERTa, BERT are very similar in spirit, it would be great if you could try to rebase your PR to the change done in https://github.com/huggingface/transformers/pull/16363 . Usually all you need to do is to add the correct {expected_outputs}, {expected_loss} and {checkpoint} to the docstring of each model (ideally giving sensible results :-)) until it passes locally and then the file can be added to the tester :-)

patrickvonplaten commented 2 years ago

Also if you have open PRs and need help, feel free to ping me or @ydshieh and link the PR here so that we can nicely gather everything :-)

patrickvonplaten commented 2 years ago

One of the most difficult tasks here might be to actually find a well-working model. As a tip what you can do:

Find all models of your architecture as it's always stated in the modeling files here: https://github.com/huggingface/transformers/blob/77c5a805366af9f6e8b7a9d4006a3d97b6d139a2/src/transformers/models/roberta/modeling_roberta.py#L67 e.g. for ELECTRA: https://huggingface.co/models?filter=electra
Now click on the task (in left sidebar) your working on, e.g. say you work on ForSequenceClassification of a text model go under this task metric: https://huggingface.co/models?other=electra&pipeline_tag=text-classification&sort=downloads
Finally, click on the framework metric (in left sidebar) you're working with: e.g. for TF: https://huggingface.co/models?library=tf&other=electra&pipeline_tag=text-classification&sort=downloads . If you see too few or too many not well performing models in TF you might also want to think about converting a good PT model to TF under your Hub name and to use this one instead :-)

jeremyadamsfisher commented 2 years ago

I'll take a shot with the PyTorch implementation of CTRL

patrickvonplaten commented 2 years ago

Here the mirror of RoBERTa for Tensorflow: https://github.com/huggingface/transformers/pull/16370

ydshieh commented 2 years ago

Hi, contributors, thank you very much for participating this sprint ❤️.

Here is one tip that might reduce some issues:

Considering the following 2 facts:

A previous file file_utils.py contains some code regarding documentation. It was recently refactorized to different files. It might be a good idea (necessary in some case) to update your working branch in your local clone.
- The file transformers/utils/documentation_tests.txt will be updated frequently by different contributors during this event.

Some testing issues could be resolved as:

git checkout main  # or `master`, depends on your local clone
git fetch upstream
git pull upstream main  # Hugging Face `transformers` renamed the default branch to `main` recently
git checkout your_working_branch_for_this_sprint
git rebase main  # or `master`

Don't hesitate if you encounter any problem. Enjoy~

abdouaziz commented 2 years ago

I take BART and IBERT for PT

zehua99 commented 2 years ago

I'd like to take a crack on Transformer-XL and ConvBert

ydshieh commented 2 years ago

Hi, contributors!

For the model(s) you work with for this sprint, if you could not find any checkpoint for a downstream task, say XXXModelForTokenClassification model, but there is a checkpoint for the base model, what you could do is:

model = XXXModelForTokenClassification.from_pretrained(base_model_checkpoint_name)
model.save_pretrained(local_path)

Then you can upload this new saved checkpoint to Hugging Face Hub, and you can use this uploaded model for the docstring example.

The head part of the model will have randomly initialized weights, and the result is likely to be imperfect, but it is fine for this sprint :-)

ydshieh commented 2 years ago

I'd like to take a crack on Transformer-XL and ConvBert

@simonzli, great :-). Do you plan to work with the PyTorch or TensorFlow version, or both?

zehua99 commented 2 years ago

I'd like to take a crack on Transformer-XL and ConvBert

@simonzli, great :-). Do you plan to work with the PyTorch or TensorFlow version, or both?

I'll work on both PyTorch and TensorFlow😊

AbinayaM02 commented 2 years ago

@patrickvonplaten: I chose XLM-RoBERTa and it's a sub-class of RoBERTa. The comments in the file for both PyTorch and TF suggests that the superclass should be referred for the appropriate documentation alongside usage examples (XLM-RoBERTa documentations shows RoBERTa examples). Should I still be adding examples for XLM-RoBERTa or should I pick some other model?

ydshieh commented 2 years ago

@AbinayaM02 :

Could you show me which line you see suggests that the superclass should be referred for the appropriate documentation in the XLM-RoBERTa model file, please? Thank you :-)

AbinayaM02 commented 2 years ago

@AbinayaM02 :

Could you show me which line you see suggests that the superclass should be referred for the appropriate documentation in the XLM-RoBERTa model file, please? Thank you :-)

Hi @ydshieh: Here are the files https://github.com/huggingface/transformers/blob/main/src/transformers/models/xlm_roberta/modeling_xlm_roberta.py https://github.com/huggingface/transformers/blob/main/src/transformers/models/xlm_roberta/modeling_tf_xlm_roberta.py

Snippet for some classes:

@add_start_docstrings(
    "The bare XLM-RoBERTa Model transformer outputting raw hidden-states without any specific head on top.",
    XLM_ROBERTA_START_DOCSTRING,
)
class XLMRobertaModel(RobertaModel):
    """
    This class overrides [`RobertaModel`]. Please check the superclass for the appropriate documentation alongside
    usage examples.
    """

    config_class = XLMRobertaConfig

@add_start_docstrings(
    "XLM-RoBERTa Model with a `language modeling` head on top for CLM fine-tuning.",
    XLM_ROBERTA_START_DOCSTRING,
)
class XLMRobertaForCausalLM(RobertaForCausalLM):
    """
    This class overrides [`RobertaForCausalLM`]. Please check the superclass for the appropriate documentation
    alongside usage examples.
    """

    config_class = XLMRobertaConfig

ydshieh commented 2 years ago

@AbinayaM02

Thank you. You can leave XLM-RoBERTa as it is.

We should prepare the model list for this sprint in a better way, sorry for this inconvenience.

Would you like to look into another architecture? You can try to check the models whose names are in bold font first, but other models are also welcomed (they might have fewer checkpoints available though).

AbinayaM02 commented 2 years ago

Sure @ydshieh. I'll pick up XLM for both PyTorch and TF then!

ydshieh commented 2 years ago

Hi again, contributors:

In a previous comment, I mentioned uploading a checkpoint with random head - if no checkpoint for a specific model + downstream task could be found on the Hub. After some internal discussion, we think there should be a better approach.

Actually, it would be a good idea to check some checkpoints in hf-internal-testing. In this page, you don't need to check the task type, just check the model architecture.

If you could not find any checkpoint for the model you work with at that page, we encourage you to work with other models for which you could find checkpoints**.

I update the model list to use bold font to indicate those models that are likely to have checkpoints.

Hi, contributors!

For the model(s) you work with for this sprint, if you could not find any checkpoint for a downstream task, say XXXModelForTokenClassification model, but there is a checkpoint for the base model, what you could do is:
model = XXXModelForTokenClassification.from_pretrained(base_model_checkpoint_name)
model.save_pretrained(local_path)
Then you can upload this new saved checkpoint to Hugging Face Hub, and you can use this uploaded model for the docstring example.

The head part of the model will have randomly initialized weights, and the result is likely to be imperfect, but it is fine for this sprint :-)

vumichien commented 2 years ago

@ydshieh It throws this error when I tried to load one of hf-internal-testing.

TypeError                                 Traceback (most recent call last)
[<ipython-input-239-5c839d6ee084>](https://localhost:8080/#) in <module>()
      3 from transformers import AlbertTokenizer, BertTokenizer, BigBirdTokenizer, MobileBertTokenizer
      4 checkpoint = "hf-internal-testing/tiny-random-big_bird"
----> 5 tokenizer = BigBirdTokenizer.from_pretrained(f"{checkpoint}")
      6 model = BigBirdForSequenceClassification.from_pretrained(f"{checkpoint}" )
      7 inputs = tokenizer("Hello, my dog is cute", return_tensors="pt")

4 frames
[/usr/local/lib/python3.7/dist-packages/sentencepiece/__init__.py](https://localhost:8080/#) in LoadFromFile(self, arg)
    169 
    170     def LoadFromFile(self, arg):
--> 171         return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
    172 
    173     def DecodeIdsWithCheck(self, ids):

TypeError: not a string

ydshieh commented 2 years ago

Thank you. I think it is because we didn't upload the necessary tokenizer file. I will talk to the team members. Thank you for spotting this!

changhyeonnam commented 2 years ago

I would like to take XLNet for PT and TF.

jessecambon commented 2 years ago

I'd like to work on DistilBERT for PT and TF with my coworker @jmwoloso

johko commented 2 years ago

I just wanted to start on the doc tests for TF Swin, turns out it doesn't exist, only the PyTorch version. So I suppose that can be seen as done ;)

ydshieh commented 2 years ago

I just wanted to start on the doc tests for TF Swin, turns out it doesn't exist, only the PyTorch version. So I suppose that can be seen as done ;)

Sure, thank you for the feedback :-). We should have better prepared the model list. And thank you for the work on TFViT - I will keep you updated after some discussion with our team!

huggingface / transformers