allenai / bilm-tf

Tensorflow implementation of contextualized word representations from bi-directional language models
Apache License 2.0

RAM required to train original model #173

Closed DomHudson closed 5 years ago

DomHudson commented 5 years ago

Hi, thanks for the repository!

In the README, it's mentioned that the original "ELMo model was trained on 3 GPUs". I was wondering what the RAM usage was during initial training? I'm interested in this because I want to get an idea of how large a server I should provision to train a similarly sized model.

Many thanks! Dom

EDIT: I'd actually be interested both in the memory used on the host machine and in the total GPU memory used.

DomHudson commented 5 years ago

Although not an explicit answer, I can continue training the released model on a GPU with 8 GB of memory. TensorFlow warns that a GPU with more memory may be faster, but the model still trains. I don't get this warning on a GPU with 12 GB of memory.