wasiahmad / PLBART

Official code of our work, Unified Pre-training for Program Understanding and Generation [NAACL 2021].
https://arxiv.org/abs/2103.06333
MIT License

Question about the architecture in HuggingFace #51

Closed: Runingtime closed this issue 1 year ago

Runingtime commented 1 year ago

I noticed that the HuggingFace model architecture has a shared embedding layer, (shared): Embedding(50005, 768, padding_idx=1), whereas the Fairseq model (used in this repo) does not appear to. Will the shared embedding affect performance on code-to-text tasks?
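For reference, here is a minimal sketch of how to inspect this in HuggingFace, assuming the `uclanlp/plbart-base` checkpoint on the Hub and the `transformers` PLBart implementation:

```python
from transformers import PLBartModel

model = PLBartModel.from_pretrained("uclanlp/plbart-base")

# `model.shared` is the embedding table in question:
# Embedding(50005, 768, padding_idx=1)
print(model.shared)

# The encoder and decoder token embeddings reuse the same weight tensor,
# so the layer is tied rather than duplicated; this should print True.
print(model.encoder.embed_tokens.weight is model.decoder.embed_tokens.weight)
```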

Thanks!