marian-nmt / marian

Fast Neural Machine Translation in C++
https://marian-nmt.github.io

Suggestions for GPUs #337

Open hobodrifterdavid opened 4 years ago

hobodrifterdavid commented 4 years ago

Hello. A friend and I are experimenting with using Marian to translate movie subtitles for language learners. Initially we are trying to filter the OpenSubtitles corpus better and train a single es->en model on an RTX 2060 with 6GB. If it goes well, we'd like to get a faster setup. How would you recommend spending a budget of $1500-$2000 USD on GPUs for training with Marian? (We'd probably look for deals on eBay.)

- 1x TITAN RTX 24GB
- 2x RTX 2080 Ti 11GB
- 3x RTX 2080 8GB
- 4x RTX 2070 8GB
- (or perhaps older GTX 1080 Ti 11GB?)

I'd guess one of the top two, since I've read that GPU RAM is critical, and the RTX cards can benefit from FP16? Thank you for the excellent software.

David

kpu commented 4 years ago

If you intend to parallelize over GPUs, aim for a power of 2. 1 is a power of 2.

emjotde commented 4 years ago

I would say two GPUs are preferable to one. With synchronous SGD, the RAM of the two cards basically adds up in terms of batch size (though not model size) while doubling the speed. 8 GB is a bit small, and you will likely not get the full benefit of four GPUs without a good interconnect. I don't know those chips very well, so someone else should comment on the relative merits of the specific cards.
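For concreteness, a minimal sketch of the flags involved, using `--sync-sgd` (Marian's synchronous multi-GPU mode) and placeholder paths mirroring the training command further down the thread; not a complete invocation:

```bash
# Sketch of multi-GPU synchronous training: with --sync-sgd, gradients from
# both devices are averaged each update, so the effective batch size is
# roughly the sum of what fits on each card, while the model itself must
# still fit on every card individually.
$MARIAN/build/marian \
    --devices 0 1 \
    --sync-sgd \
    --mini-batch-fit -w 4000 \
    --model model/model.npz \
    --train-sets data/corpus.es data/corpus.en \
    --vocabs model/vocab.esen.spm model/vocab.esen.spm
```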

hobodrifterdavid commented 4 years ago

Further info: we've got an old Fujitsu TX300 S7 Xeon E5 (~Sandy Bridge) server that cost ~$200. We had to cut a hole in the side of the case to get the GPU in, but that was no problem. It has two x16 slots and two more x8 slots that can also take GPUs. It supports 4x (inexpensive) power supplies, and RAM is about $1/GB. Very pleased with the machine, but it needs to be in a room without people; servers are noisy. Happy to give more info on this if it's of interest.
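For anyone sizing a similar build, the interconnect mentioned above can be checked from software with standard nvidia-smi queries (nothing Marian-specific):

```bash
# Print the GPU-to-GPU connection matrix (PIX/PXB/PHB/SYS entries);
# paths that cross the CPU (PHB/SYS) are slower for inter-GPU transfers.
nvidia-smi topo -m

# Show the negotiated PCIe link width per card (e.g. x16 vs x8 slots).
nvidia-smi --query-gpu=index,name,pcie.link.width.current --format=csv
```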

hobodrifterdavid commented 4 years ago

I'm training with this, from the SentencePiece example:

```bash
$MARIAN/build/marian \
    --devices $GPUS \
    --type s2s \
    --model model/model.npz \
    --train-sets data/corpus.es data/corpus.en \
    --vocabs model/vocab.esen.spm model/vocab.esen.spm \
    --dim-vocabs 32000 32000 \
    --mini-batch-fit -w 4000 \
    --layer-normalization --tied-embeddings-all \
    --dropout-rnn 0.2 --dropout-src 0.1 --dropout-trg 0.1 \
    --early-stopping 5 --max-length 100 \
    --valid-freq 10000 --save-freq 10000 --disp-freq 1000 \
    --cost-type ce-mean-words --valid-metrics ce-mean-words bleu-detok \
    --valid-sets data/subs-dev.es data/subs-dev.en \
    --log model/train.log --valid-log model/valid.log --tempdir model \
    --overwrite --keep-best \
    --seed 1111 --exponential-smoothing \
    --normalize=0.6 --beam-size=6 --quiet-translation
```
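Here `$GPUS` is just the space-separated device list passed to `--devices`; on this single-card box it would be set as, for example:

```bash
# Device list for --devices above: one index for the single RTX 2060,
# or e.g. "0 1" once a second card is added.
GPUS=0
```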

I see the marian process is using one core, stuck at 100% most of the time. It looks a bit like Marian is limited by single-thread speed, unless the CPU is just polling something or collecting stats in a loop. I'll try the dev branch next, I guess.
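One generic Linux check to see whether a single host thread is really the limit (not Marian-specific; assumes the binary is named `marian`):

```bash
# Show per-thread CPU usage of the running marian process; a single thread
# pinned near 100% while nvidia-smi shows the GPU busy is usually just the
# thread feeding/polling the device rather than a real compute bottleneck.
top -H -p "$(pgrep -x marian)"
```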

hobodrifterdavid commented 4 years ago

Initial test was very promising. I hope to become more familiar with the code and perhaps even contribute something to the project at some point.

adjouama commented 4 years ago

Did you check your GPU usage? `watch -n1 nvidia-smi`

I think you have only one GPU running; that's why you see one CPU core at 100% (from my own experience).
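For a continuous view without the full dashboard, nvidia-smi's query mode can log one line per GPU per second:

```bash
# Log utilization and memory for every GPU once per second; if only
# device 0 shows activity, training is indeed running on a single GPU.
nvidia-smi --query-gpu=index,utilization.gpu,memory.used --format=csv -l 1
```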