bigscience-workshop / multilingual-modeling

BLOOM+1: Adapting BLOOM model to support a new unseen language
https://arxiv.org/abs/2212.09535

Adapter Training - Frozen transformer.wpe.weight? #9

Closed: yongzx closed this issue 2 years ago

yongzx commented 2 years ago

https://github.com/bigscience-workshop/multilingual-modeling/blob/1a383288d89dafa16e462558a209e33a020ebe85/scripts/madx_exp/madx_lngembft_clm.py#L457

@vnikouliNLE It seems we are not fine-tuning the positional embedding layer transformer.wpe.weight. I believe we should train it as well?
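
For reference, a minimal sketch of what unfreezing the positional embeddings could look like, assuming a GPT-2-style checkpoint loaded with transformers (the "gpt2" checkpoint name is a placeholder, and the exact freezing logic in madx_lngembft_clm.py may differ):

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")

# Freeze the full model first, then re-enable gradients for the
# embedding layers we want to fine-tune for the new language.
for param in model.parameters():
    param.requires_grad = False

# Token embeddings AND positional embeddings should both be trainable;
# the bug here is that transformer.wpe.weight was left frozen.
trainable = {"transformer.wte.weight", "transformer.wpe.weight"}
for name, param in model.named_parameters():
    if name in trainable:
        param.requires_grad = True

# Sanity check: confirm only the two embedding matrices are trainable.
for name, param in model.named_parameters():
    if param.requires_grad:
        print(name, tuple(param.shape))
```

Note that in GPT-2 the lm_head weight is tied to transformer.wte.weight, so unfreezing the token embeddings also makes the output projection trainable through the shared tensor.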

yongzx commented 2 years ago

Yes, we should train it.