Open patrickvonplaten opened 2 years ago
@patrickvonplaten If I can work on this and contribute, do let me know. Meanwhile, I will proceed to read and understand the paper.
Hey @reichenbch feel free to work on this if you are interested. Patrick is on vacation this week so I would be happy to help with this :)
@patil-suraj I was thinking of first reading the paper once and then looking into the available implementation (they are using the fairseq library) and checkpoints. Is that the correct approach? Secondly, what would I need for this? Is there a model creation guide available? I know model templates are available in the repo.
For implementing the model I would suggest a code-first approach: start from an existing similar model's code rather than re-deriving everything from the paper.
Here are some docs that might help when adding a new model 🙂
Most seq2seq models in fairseq are similar to bart/mbart, so I would suggest referring to those models and using the `transformers-cli add-new-model-like` command, which can create a bart/mbart-like template.
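For reference, a sketch of the invocation (the exact prompts may differ across versions, and this assumes a source install of transformers):

```shell
# Assuming transformers is installed from source, the scaffolding command is:
#
#   transformers-cli add-new-model-like
#
# It prompts interactively for the existing model to copy (e.g. bart or
# mbart), the new model's name, and a checkpoint identifier on the Hub,
# then generates the modeling, configuration, and test files from that
# template.
```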
Hope this helps!
Hey @reichenbch, how's it going? Let us know if you need any help :)
Hey @patil-suraj, work is in progress; I unfortunately had some health issues a while back. I will update the files and try to get back on track.
Hope you are feeling okay now! And no rush, just wanted to check in. Take your time 🤗
Hey @patrickvonplaten @patil-suraj @reichenbch, I am interested in working on this issue; let me know if I can contribute.
@inderpreetsingh01, feel free to open a PR if you want :-)
@inderpreetsingh01 @patrickvonplaten is anyone actively working on this issue? I was wondering if either I could take it up or shadow someone working on it. I'd like to start learning how to contribute models to Hugging Face.
@pramodith, I started working on it but got occupied with some personal things. I had gone through the resources shared; here is what I understood from the paper:
EdgeFormer uses:
I am not clear on the layer adaptation part and couldn't find any related parameter in fairseq or in the edge_architecture function used to define the model.
Let me know if you want to discuss and work on it.
@inderpreetsingh01 I believe that for the layer adaptation technique new parameters are only required for the LoRA method. The parameters for this are defined in this file in the fairseq repository. The file also contains the code for the Interleaved decoder.
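To make the LoRA part concrete, here is an illustrative numpy sketch of a low-rank-adapted linear layer (the function and shape names are my own for illustration, not the fairseq parameters):

```python
import numpy as np

def lora_linear(x, W, A, B, alpha=16.0):
    """Linear layer with a LoRA update: y = x W^T + (alpha/r) (x A^T) B^T.

    W is the frozen (out, in) weight; A (r, in) and B (out, r) are the
    small trainable low-rank factors, so only r * (in + out) new
    parameters are added per adapted layer.
    """
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

# With B initialized to zeros (the usual LoRA init), the adapted layer
# starts out identical to the frozen linear layer.
x = np.ones((2, 4))
W = np.eye(4)
A = np.random.randn(2, 4)   # rank r = 2
B = np.zeros((4, 2))
y = lora_linear(x, W, A, B)
```

This is why the technique is parameter-efficient: the shared/frozen weights stay untouched and each adapted layer only adds the tiny A and B factors.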
If you're busy with other things I can definitely have a go at adding this model to the huggingface repo.
@pramodith thanks for clearing that up. I had actually looked at the original fairseq repository, which doesn't include the adaptation part. I can contribute on this; we can connect here.
Let me know if you need any help :-)
Hey @patrickvonplaten, I wanted to start porting the EdgeFormer model into the transformers library, so I used the `transformers-cli add-new-model-like` command. However, one of the questions that follows is "Please give a checkpoint identifier (on the model Hub) for this new model." Does this mean that I need to upload the pretrained weights file to the Hugging Face Hub?
Hey @pramodith,
It means that you should specify the checkpoint name that you intend to use when uploading the weights to the Hub.
Hi @patrickvonplaten, @pramodith Is this issue still open? I would like to contribute but I don't see a related PR.
🌟 New model addition
Model description
EdgeFormer: A Parameter-Efficient Transformer for On-Device Seq2seq Generation. Tao Ge and Furu Wei
March 2022: code and pretrained checkpoints released.
Open source status
Happy to help with a model contribution here!