shubhank008 opened 5 years ago
On the most basic AWS instance (CPU-only) it takes anywhere from 8 to 10 seconds to generate a reply. On a GPU node (K10) it's around 2-3 seconds, which honestly isn't worth it considering the price difference between the two instances.

Was wondering, is there a way, or options to tweak, to speed up inference? I tried playing with the beam settings, but the speedup wasn't worth the hit to response quality.

Still a beginner in ML, but would using a TPU make a difference for inference, compared to using a GPU?

Yes, a TPU would probably be much faster. If you really know what you're doing, you could try replacing my beam search implementation with one that happens entirely on the GPU, which could make it run much faster with no degradation in quality -- but that's beyond my know-how. Otherwise I'd just recommend playing around with the inference options I've included -- for example, a beam width of 1 and a topn of 5 might be worth trying. That will degrade quality somewhat, but probably any deviation from the default options will degrade quality, because I picked the default options to maximize quality :)
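For anyone wondering why beam width and topn affect speed so much, here's a minimal sketch of a generic beam search decoder (not this repo's actual implementation; `step_fn` and the toy model below are hypothetical stand-ins for the real network). Each decoding step runs the model once per live beam and expands only the `topn` most likely next tokens, so shrinking either parameter cuts per-step work roughly proportionally, and `beam_width=1` collapses to plain greedy decoding:

```python
import numpy as np

def decode(step_fn, start_token, max_len=20, beam_width=2, topn=5):
    """Minimal beam search sketch. step_fn(seq) returns a log-prob vector
    over the vocabulary for the next token. With beam_width=1 this reduces
    to greedy decoding: one model call per step instead of beam_width calls."""
    beams = [([start_token], 0.0)]  # (token sequence, cumulative log-prob)
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            logp = step_fn(seq)  # one model call per live beam
            # Only expand the topn most likely next tokens for this beam.
            for tok in np.argsort(logp)[-topn:]:
                candidates.append((seq + [int(tok)], score + logp[tok]))
        # Keep the beam_width best candidates overall; the rest are pruned.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]
    return beams[0][0]

# Toy "model": a random distribution per step, just to make this runnable.
rng = np.random.default_rng(0)
vocab_size = 50
toy_step = lambda seq: np.log(rng.dirichlet(np.ones(vocab_size)))

print(decode(toy_step, start_token=0, beam_width=1, topn=5))
```

The quality/speed trade-off the maintainer describes falls out of the pruning step: a wider beam keeps more partial hypotheses alive and is more likely to find a higher-scoring full sequence, at the cost of proportionally more model calls per token.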