zhiguowang / BiMPM

BiMPM: Bilateral Multi-Perspective Matching for Natural Language Sentences
Apache License 2.0

The GPU is not used efficiently #34

Open xljhtq opened 6 years ago

xljhtq commented 6 years ago

When I train the model on a GPU with a very large training set, GPU utilization is low, about 11%, while CPU utilization is about 110%. How can I increase GPU utilization? I cannot increase batch_size because of limited memory.

I would also like to know what training speed you see, since the RNN layers used in the model slow training down.

zhiguowang commented 6 years ago

Training is not very slow for me. On the SNLI dataset, one pass over the entire training set takes 515 seconds, and decoding the dev set takes 3 seconds. I'm using a K80 GPU.
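
A possible reading of the numbers in the question (about 11% GPU, about 110% CPU) is that one CPU core is busy preparing batches while the GPU waits. That is an assumption, not something confirmed in this thread. If it holds, one generic mitigation is to build the next batches in a background thread while the current one is training. The sketch below uses only the Python standard library. `prefetch` and `make_batches` are hypothetical names and not part of BiMPM. The overlap only helps if the training step releases Python's global interpreter lock while it runs, as TensorFlow's session calls are generally understood to do.

```python
import queue
import threading


def prefetch(batches, buffer_size=4):
    """Yield items from `batches`, producing them in a background thread.

    At most `buffer_size` prepared batches are held in memory at once,
    so this does not require a larger batch size.
    """
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # marks the end of the stream
    errors = []

    def worker():
        try:
            for batch in batches:
                q.put(batch)  # blocks while the buffer is full
        except Exception as exc:  # hand the failure to the consumer
            errors.append(exc)
        finally:
            q.put(sentinel)

    threading.Thread(target=worker, daemon=True).start()

    while True:
        item = q.get()
        if item is sentinel:
            break
        yield item
    if errors:
        raise errors[0]


def make_batches(n):
    """Hypothetical stand-in for CPU-side work such as padding and id lookup."""
    for i in range(n):
        yield [i] * 3


if __name__ == "__main__":
    for batch in prefetch(make_batches(5)):
        pass  # the training step on `batch` would go here
```

Batches come out in their original order, and an exception raised while building a batch is re-raised in the training loop rather than lost in the worker thread. Checking utilization with `nvidia-smi` before and after would show whether input preparation was really the bottleneck.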