Closed by vr25 4 years ago
"Do you think it would be because of different PyTorch or TensorFlow versions?"
=>> I don't think so.
It's strange that you have only 17 training samples and are using a max-sentences hyper-parameter of 2.
I am not sure I am the right person to respond to your issue. You might want to post your issue to fairseq.
Hi,
I followed the instructions at https://github.com/pytorch/fairseq/blob/master/examples/roberta/README.pretraining.md
But I am running into this error:
I am using the wikitext-103-raw dataset. Do you think it would be because of different PyTorch or TensorFlow versions?
I am using this configuration: 4x NVIDIA Tesla V100 GPUs with 16 GiB of memory.
Thanks, again!
Originally posted by @vr25 in https://github.com/VinAIResearch/PhoBERT/issues/3#issuecomment-605117835