redthing1 opened this issue 3 years ago
That's weird. Are you changing any other training settings?
Everything else is defaults. I tried again using a fresh copy of your notebook and 125M now works, but 350M still OOMs.
Having this issue too. If it matters, I'm using a pretty large text file (~20 MB) as the dataset, and I'm also getting this warning a short while after training starts:
Token indices sequence length is longer than the specified maximum sequence length for this model (2385 > 2048). Running this sequence through the model will result in indexing errors
This also happened in my attempts to train GPT-Neo locally, so it doesn't seem like it's endemic to Colab.
That warning just means one of your training samples has a token count above the model's maximum context length; it's not the same as a GPU OOM. I recommend running the tokenizer over your samples to find whichever sequence is causing it.
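A minimal sketch of that check, assuming a plain-text dataset with one training sample per line (the file name `dataset.txt` is a placeholder):

```python
from transformers import GPT2Tokenizer

# GPT-Neo uses the GPT-2 tokenizer; its model_max_length is 2048
tokenizer = GPT2Tokenizer.from_pretrained("EleutherAI/gpt-neo-125M")
max_len = tokenizer.model_max_length

with open("dataset.txt", encoding="utf-8") as f:
    for i, line in enumerate(f):
        n_tokens = len(tokenizer.encode(line))
        if n_tokens > max_len:
            # this is the kind of sample that triggers the
            # "Token indices sequence length..." warning
            print(f"line {i}: {n_tokens} tokens (> {max_len})")
```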
Alright, I'll check that out, but I am also definitely OOMing
Using Colab, I get OOM finetuning GPT-Neo (both 125M and 350M) on both T4 and P100. Even when I enable `fp16`, the problem persists. GPT-2, on the other hand, works fine.
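For what it's worth, a rough sketch of the memory-saving settings that usually let GPT-Neo fit on a 16 GB T4/P100, assuming a standard Hugging Face Trainer setup (the checkpoint name, output directory, and accumulation steps here are placeholders, not from this thread):

```python
from transformers import GPTNeoForCausalLM, TrainingArguments

model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")
model.gradient_checkpointing_enable()  # trade recompute for activation memory

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=1,   # smallest possible per-step batch
    gradient_accumulation_steps=8,   # keep the effective batch size up
    fp16=True,                       # half-precision activations/gradients
)
```

`fp16` alone only halves activation/gradient memory, so gradient checkpointing and a batch size of 1 are typically also needed before the larger checkpoints stop OOMing.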