mgrankin / ru_transformers

Apache License 2.0

774M model? #35

Closed fen0s closed 3 years ago

fen0s commented 4 years ago

The README mentions it briefly when setting up configs and environment variables, but it doesn't seem to be present on the server. Was a 774M model ever trained on the Russian corpus? Quoting from the README:

# GPT-2 774M, final perplexity 21.09?

export CUDA_VISIBLE_DEVICES=3
export MODEL_SIZE=gpt2-large
export OUTPUT=output_yt/l
export BS=1
export LR=1e-5

mgrankin commented 4 years ago

Hi, I was able to start training the 774M model on GPUs, but didn't complete the training. On TPUs I got an OOM error and decided to give it a break.
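
For a rough sense of why a model this size runs into memory limits, here is a minimal back-of-the-envelope sketch. It assumes fp32 values and an Adam-style optimizer that keeps two extra buffers per parameter; neither is confirmed in this thread, and the function name `training_state_bytes` is hypothetical. Activations are left out entirely, so real usage is higher and grows with batch size and sequence length.

```python
def training_state_bytes(n_params: int, bytes_per_value: int = 4) -> int:
    """Bytes held by weights, gradients, and two Adam moment buffers.

    Assumption: every buffer uses the same precision (fp32 by default).
    Activations and framework overhead are NOT included.
    """
    copies = 4  # weights + gradients + Adam first moment + Adam second moment
    return n_params * bytes_per_value * copies


# Nominal parameter count taken from the issue title, not an exact count.
N_PARAMS = 774_000_000

total_bytes = training_state_bytes(N_PARAMS)
gib = total_bytes / 2**30
print(f"approx. {gib:.1f} GiB for weights, gradients, and optimizer state")
```

Under these assumptions the persistent training state alone comes to roughly 11.5 GiB, which may help explain why `BS=1` is used in the quoted config and why a device with limited per-core memory could run out.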

fen0s commented 4 years ago

Are you planning to continue training it? Sorry for the late response.

mgrankin commented 4 years ago

Yes, someday. Not sure when.

stale[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.