Closed IvanAntipov closed 3 years ago
Same here.
I'm seeing exactly the same issue with RuGPT-3 Medium. @IvanAntipov, what configuration are you using (CUDA, torch, triton)?
I added a new parameter:

`--make-vocab-size-divisible-by 1`

And it works.
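For context, here is a minimal sketch of why this flag can matter, assuming the usual Megatron-style vocab padding (the function name and the vocab size 50257 are illustrative, not taken from this repo):

```python
def padded_vocab_size(orig_vocab_size, divisible_by, tp_size=1):
    # Round the vocabulary up to a multiple of divisible_by * tp_size,
    # as Megatron-style training scripts typically do before building
    # the embedding matrix.
    multiple = divisible_by * tp_size
    return ((orig_vocab_size + multiple - 1) // multiple) * multiple

# With a common default of 128, a GPT-2-style vocab gets padded, so the
# embedding shape can stop matching the pretrained checkpoint:
print(padded_vocab_size(50257, 128))  # 50304
# With --make-vocab-size-divisible-by 1 the size is left unchanged:
print(padded_vocab_size(50257, 1))    # 50257
```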
> I'm seeing exactly the same issue with RuGPT-3 Medium. @IvanAntipov, what configuration are you using (CUDA, torch, triton)?
I think this is no longer relevant, but nevertheless:
CUDA 11.2
torch==1.5.0
no triton
@MolchanovArt Thank you kindly. I managed to get past it, finally.
But then I stumbled upon that pesky RuntimeError: CUDA: Error- invalid ptx (https://github.com/sberbank-ai/ru-gpts/issues/62) in Colab, even with the medium model. I'm trying to use cpu_offload to fit the large model in GPU RAM.
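In case it helps others, this is roughly the shape of the DeepSpeed config I mean, with ZeRO stage 2 and CPU offload enabled (a sketch only; the exact keys depend on your DeepSpeed version, and the batch/fp16 values here are placeholders):

```json
{
  "train_micro_batch_size_per_gpu": 4,
  "gradient_accumulation_steps": 1,
  "fp16": {
    "enabled": true
  },
  "zero_optimization": {
    "stage": 2,
    "cpu_offload": true
  }
}
```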
Just for the record, I have:

```python
print(torch.__version__)
!/usr/local/cuda/bin/nvcc --version | grep cuda
```

torch: 1.7.0+cu110, CUDA: Build cuda_11.0_bu.TC445_37.28845127_0, triton: 0.2.3
I am trying to reproduce the finetuning process for `rugpt3large` with deepspeed and apex. I managed to finetune `rugpt3small`, but when I run the same script with the `large` configuration I get the following error.

My configuration:
I tried different transformers versions (transformers==3.5.0 and transformers==4.3.0), but the result is the same.
P.S. My apex installation differs slightly from the one in the Finetune_and_generate_RuGPTs_deepspeed_megatron.ipynb example, because I had to install it inside an Nvidia container; otherwise it didn't work.