orenmel / context2vec

Apache License 2.0
216 stars 60 forks

CPU training #2

Open matanox opened 7 years ago

matanox commented 7 years ago

Not really an issue, but any advice for CPU-only training? https://groups.google.com/forum/#!topic/chainer/vbkOdKaesPI

orenmel commented 7 years ago

Training should use only the CPU if you use '-g -1'.
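For context, the `-g -1` convention follows the usual Chainer-era pattern: a non-negative value selects a GPU device id, and `-1` means run on the CPU. A minimal sketch of how a training script typically interprets that flag (the argparse setup here is illustrative, not the repo's exact code):

```python
import argparse

# Hypothetical parser mirroring the common '-g/--gpu' convention,
# where -1 means CPU-only and 0, 1, ... select a GPU device.
parser = argparse.ArgumentParser()
parser.add_argument('-g', '--gpu', type=int, default=-1,
                    help='GPU device id; -1 means CPU-only')

args = parser.parse_args(['-g', '-1'])

# A script would branch on this value, e.g. skip model.to_gpu() on CPU.
device = 'cpu' if args.gpu < 0 else 'gpu:%d' % args.gpu
print(device)  # -> cpu
```

So passing `-g -1` on the command line keeps the model and arrays on the CPU throughout training.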

matanox commented 7 years ago

Thanks a lot Oren, I should have run with -h to begin with... I'm training over a Hebrew Wikipedia dump and, not surprisingly, CPU training with the default number of dimensions and epochs looks like it will take forever, even on a strong chipset. Just curious whether you have any timings for a GPU run, and which GPU model was used.

orenmel commented 7 years ago

Here's my experience, as described in the CoNLL paper: "With mini-batches of 1,000 sentences at a time, we started by training our models with a single iteration over the 2-billion-word ukWaC corpus. This took 30 hours, using a single Tesla K80 GPU. For the smaller 50-million-word MSCC learning corpus, a full iteration with a batch size of 100 took only about 3 hours." If you don't have such a GPU, you could rent a GPU machine by the hour from Amazon, Google, etc. For example: http://sla.hatenablog.com/entry/en_chainer_on_ec2
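Those quoted timings imply a rough throughput you can use to estimate your own run time. A back-of-envelope calculation (assuming the corpus sizes and wall-clock times are exactly as quoted, and ignoring batch-size effects):

```python
# Throughput implied by the reported timings (figures taken from the
# quote above; everything else here is simple arithmetic).
ukwac_words = 2_000_000_000   # ukWaC corpus, 1 iteration in 30 h on a K80
ukwac_hours = 30
mscc_words = 50_000_000       # MSCC learning corpus, ~3 h per iteration
mscc_hours = 3

ukwac_rate = ukwac_words / (ukwac_hours * 3600)  # words per second
mscc_rate = mscc_words / (mscc_hours * 3600)

print(round(ukwac_rate))  # -> 18519 words/s
print(round(mscc_rate))   # -> 4630 words/s
```

Scaling your own corpus size by a rate in this ballpark gives a rough GPU-hours estimate; CPU-only training would be slower by a large, hardware-dependent factor.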