microsoft / MASS

MASS: Masked Sequence to Sequence Pre-training for Language Generation
https://arxiv.org/pdf/1905.02450.pdf

unable to set a proper batch_size in MASS-supNMT pretraining #142

Closed vikrant97 closed 4 years ago

vikrant97 commented 4 years ago

@StillKeepTry I am trying to pretrain a model using the instructions given in the MASS-supNMT directory, but I am getting the following error. I have tried increasing the batch size up to 4096, but then it exceeds the GPU memory limit. Is there any workaround here to skip the sentences whose size is larger than the max-tokens size?

[Screenshot: mass_error]
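For context, fairseq-style batching groups sentences by token count rather than by sentence count: `--max-tokens` caps the total tokens per batch, so a single sentence longer than that cap can never fit into any batch and the data loader errors out. Below is a minimal sketch of that constraint (a hypothetical helper, not fairseq's actual implementation):

```python
from typing import List


def batch_by_tokens(lengths: List[int], max_tokens: int) -> List[List[int]]:
    """Group example indices into batches of at most max_tokens tokens.

    Sketch only: sums raw lengths and ignores padding, whereas real
    batchers count padded tokens (batch size * longest sentence).
    """
    batches, current, current_tokens = [], [], 0
    for idx, length in enumerate(lengths):
        if length > max_tokens:
            # This is the failure mode from the screenshot: one sentence
            # alone already exceeds the per-batch token budget.
            raise ValueError(
                f"sentence at index {idx} of size {length} exceeds "
                f"max_tokens limit of {max_tokens}"
            )
        if current_tokens + length > max_tokens:
            batches.append(current)
            current, current_tokens = [], 0
        current.append(idx)
        current_tokens += length
    if current:
        batches.append(current)
    return batches
```

This is why raising `--max-tokens` only helps until GPU memory runs out; the alternative is to remove over-long sentences from the data.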

vikrant97 commented 4 years ago

I manually removed the sentences longer than 50 tokens, then set the max-tokens size to 1024, and it worked. Still, a fix in the code itself that skips such sentences would be better. I guess that is included in fairseq's latest version.
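As a sketch of that manual workaround, the following standalone script (hypothetical, not part of the MASS repo; saved e.g. as `filter_corpus.py`) drops parallel sentence pairs where either side exceeds a whitespace-token threshold:

```python
import argparse


def filter_parallel_corpus(src_path, tgt_path, out_prefix, max_len=50):
    """Drop sentence pairs where either side has more than max_len tokens.

    Length is measured in whitespace-separated tokens; if the corpus is
    already BPE-encoded, this matches the subword counts fairseq sees.
    """
    kept = dropped = 0
    with open(src_path, encoding="utf-8") as fs, \
         open(tgt_path, encoding="utf-8") as ft, \
         open(f"{out_prefix}.src", "w", encoding="utf-8") as out_src, \
         open(f"{out_prefix}.tgt", "w", encoding="utf-8") as out_tgt:
        for src, tgt in zip(fs, ft):
            if len(src.split()) <= max_len and len(tgt.split()) <= max_len:
                out_src.write(src)
                out_tgt.write(tgt)
                kept += 1
            else:
                dropped += 1
    print(f"kept {kept} pairs, dropped {dropped}")


if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("src")
    p.add_argument("tgt")
    p.add_argument("out_prefix")
    p.add_argument("--max-len", type=int, default=50)
    args = p.parse_args()
    filter_parallel_corpus(args.src, args.tgt, args.out_prefix, args.max_len)
```

Note that upstream fairseq also exposes a `--skip-invalid-size-inputs-valid-test` flag that drops examples exceeding the model's maximum positions instead of raising; whether the fairseq version MASS is pinned to honors it for training data may vary.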

StillKeepTry commented 4 years ago

We generally suggest filtering out sentences whose length is greater than 250, since longer sentences also consume more GPU memory.
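Under the assumptions of the filtering sketch above, this recommendation corresponds to running it with a threshold of 250, e.g. `python filter_corpus.py train.src train.tgt train.clean --max-len 250` (file names hypothetical).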