benathi / word2gm

Word to Gaussian Mixture Model
BSD 3-Clause "New" or "Revised" License

Increasing Throughput #5

Closed LongSeanSilvr closed 7 years ago

LongSeanSilvr commented 7 years ago

When running word2gm_trainer.py on the text8 data (via train_text8.sh) I'm getting a throughput of about 6,500 words/sec. This seems a bit low; is that about what you would expect? Is there something I'm missing that would help increase the throughput?

benathi commented 7 years ago

No, I don't think you're missing anything; the words/sec should be around that. The training time for text8 should be about an hour for 10 epochs or so.

LongSeanSilvr commented 7 years ago

Ok, good to know. Thanks!

unvaiso commented 7 years ago

I find it strange that the GPU memory is heavily allocated while GPU utilization is low (the CPU is at 100%, though). It looks like the GPU memory is allocated but not actually used.
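
In case it helps with debugging: TensorFlow 1.x (which this repo targets) reserves most of the GPU memory as soon as a session starts, even if the ops themselves end up running on the CPU, so "memory allocated but not used" is consistent with CPU-only execution. A minimal sketch of how to check device placement and switch to on-demand allocation (these are standard TF 1.x session options, not anything specific to this repo):

```python
import tensorflow as tf

# Minimal diagnostic sketch (assumes TF 1.x):
# - log_device_placement prints which device (CPU/GPU) each op actually runs on
# - allow_growth makes TF allocate GPU memory on demand instead of grabbing it all
config = tf.ConfigProto(log_device_placement=True)
config.gpu_options.allow_growth = True

a = tf.constant([1.0, 2.0], name="a")
b = tf.constant([3.0, 4.0], name="b")
with tf.Session(config=config) as sess:
    print(sess.run(a + b))  # the placement log shows where each op landed
```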

sungjinoh commented 6 years ago

@unvaiso You should check this line: https://github.com/benathi/word2gm/blob/master/word2gm_trainer.py#L20. If your copy of the code contains it, you have to remove it.
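
I can't confirm what that exact line contains in every checkout, but if it is a device or environment override along the lines of the sketch below (an assumption on my part, please check the actual file), removing it would let TensorFlow place ops on the GPU again. As a generic illustration of the kind of line that causes CPU-bound training:

```python
import os

# Hypothetical illustration, not necessarily the exact line in word2gm_trainer.py:
# an empty CUDA_VISIBLE_DEVICES hides every GPU from TensorFlow, so all ops
# silently fall back to the CPU.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

import tensorflow as tf  # must be imported after the override for it to take effect
print(tf.test.is_gpu_available())  # prints False while the override is in place
```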

aswinsuresh commented 5 years ago

> No, I don't think you're missing anything; the words/sec should be around that. The training time for text8 should be about an hour for 10 epochs or so.

Shouldn't it be around 10 hours? When I train, the number of words per epoch is shown as ~17M.
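
For reference, a quick back-of-the-envelope check using the numbers in this thread (~17M words per epoch, and the ~6,500 words/sec reported in the opening comment):

```python
# Rough estimate of total training time from the figures reported in this thread.
words_per_epoch = 17_000_000   # ~17M words per epoch, as shown during training
words_per_sec = 6_500          # throughput from the opening comment
epochs = 10

total_hours = words_per_epoch * epochs / words_per_sec / 3600
print(f"~{total_hours:.1f} hours for {epochs} epochs")  # roughly 7 hours
```

So at that throughput, the full 10-epoch run is closer to several hours than to one.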