Closed hipoglucido closed 7 years ago
The generation code is extremely unoptimized: it re-runs the entire RNN for every step. I hope you can optimize it and then contribute the code back to this project on GitHub.
k is the number of beams; the more beams you have, the better the chance of finding the best sequence.
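For context, here is a minimal beam-search sketch showing why a larger k widens the search. The `next_token_logprobs` scoring function is a hypothetical stand-in; the repo's actual `beamsearch` differs in its details:

```python
def beamsearch(next_token_logprobs, start, eos, k=3, maxlen=10):
    """Keep the k highest-scoring partial sequences at each step.
    next_token_logprobs(seq) -> {token: log-probability} (hypothetical)."""
    beams = [(0.0, [start])]          # (cumulative log-prob, sequence)
    complete = []
    for _ in range(maxlen):
        candidates = []
        for score, seq in beams:
            if seq[-1] == eos:        # finished sequences are set aside
                complete.append((score, seq))
                continue
            for tok, logp in next_token_logprobs(seq).items():
                candidates.append((score + logp, seq + [tok]))
        if not candidates:
            break
        # prune to the k best: a larger k explores more alternatives and
        # raises the chance of keeping the globally best-scoring headline
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return sorted(complete + beams, key=lambda c: c[0], reverse=True)
```

With k=1 this degenerates to greedy decoding, which can miss sequences whose first token is not the single most likely one.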
The article for which the headline is generated can be longer than my maximal length (maxlend), so I try to guess the best place to start by trying different start points, spaced skip words apart.
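The start-point search described above could look roughly like this (`candidate_starts` is a hypothetical helper for illustration; the notebook's real code differs):

```python
def candidate_starts(article_len, maxlend, skip):
    """Offsets spaced `skip` words apart at which a window of up to
    maxlend words can still be taken from the article."""
    last = max(article_len - maxlend, 0)   # latest offset with a full window
    return list(range(0, last + 1, skip))

# e.g. a 25-word article, a 10-word window, start points every 5 words:
# candidate_starts(25, 10, 5) -> [0, 5, 10, 15]
```

Each candidate window would then be fed to the generator, and the highest-scoring headline across all start points kept.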
Ok. For now I am looking for `for` loops that I can parallelize with the multiprocess package. I have found that keras_rnn_predict takes the longest to run, but I think there's not much I can parallelize there. On the other hand, maybe it would be a good idea to parallelize `for s in skips:`.
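Parallelizing that loop over start points with the standard-library multiprocessing module might look like this sketch, where `score_from_start` is a hypothetical stand-in for the per-start-point beam search:

```python
from multiprocessing import Pool

def score_from_start(s):
    # hypothetical stand-in for running beam search from offset s;
    # returns (start_offset, score)
    return s, -float(s % 3)

if __name__ == '__main__':
    skips = range(0, 20, 5)
    with Pool(processes=4) as pool:
        results = pool.map(score_from_start, skips)   # one task per offset
    best = max(results, key=lambda r: r[1])           # keep best start point
```

One caveat: each worker process needs its own copy of the model, so with a large Keras model the memory cost and per-process setup may eat the speedup.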
What do you mean by "It re-runs the entire RNN for every step"? Do you mean inside beamsearch?
I would like to contribute, but I am still working through the code. I think this project is great, and it would help me a lot if I could make the predictions faster (maybe using multiple CPUs).
Thanks
I mean the timestep inside the RNN process. This is the second index of the 2D data going into the RNN: (batch_size, max_num_steps) and in the 3D output: (batch_size, max_num_steps, vocabulary_size)
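That inefficiency can be made concrete: if every generation step feeds the whole prefix back through the RNN, step t reprocesses all earlier timesteps. A toy illustration, with a work counter standing in for the real forward pass (`rnn_forward` is a hypothetical stand-in, not keras_rnn_predict itself):

```python
def rnn_forward(prefix, counter):
    """Stand-in for an RNN forward pass: 'processes' every timestep of
    the prefix and returns a dummy next token."""
    counter[0] += len(prefix)          # work grows with prefix length
    return prefix[-1] + 1

def generate_naive(start, steps):
    work = [0]
    seq = [start]
    for _ in range(steps):
        seq.append(rnn_forward(seq, work))   # re-runs the whole prefix
    return seq, work[0]

# Generating 10 tokens touches 1 + 2 + ... + 10 = 55 timesteps instead
# of 10: O(n^2) total work. Caching the hidden state and feeding only
# the newest token each step brings this back to O(n).
```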
@udibr Hi, I'm confused about the function gensamples. What role does this function play? Thank you.
gensamples is used to generate headline samples. In train.ipynb it is used to generate a few samples after each epoch to see whether the model is indeed learning to generate new headlines, which is what we are really interested in. The cross-entropy loss is just a simple proxy we can use in training; what we are really after is high-quality headline generation.
predict.ipynb uses gensamples to generate new headlines using a pre-trained model.
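Conceptually, the per-epoch sampling amounts to something like the sketch below (hypothetical names and signature; the notebook's real gensamples does more, e.g. beam search and flag decoration):

```python
def gensamples_sketch(model_predict, idx2word, articles, nsamples=2):
    """Generate a headline for a few source articles so a human can judge
    quality directly; the cross-entropy number alone doesn't show this."""
    headlines = []
    for desc in articles[:nsamples]:
        headline_ids = model_predict(desc)            # e.g. via beam search
        headlines.append(' '.join(idx2word[w] for w in headline_ids))
    return headlines
```

Calling this after every training epoch gives a qualitative progress check alongside the loss curve.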
@udibr Thank you. I found a very small bug in gensamples(): I think this line `print 'HEAD:',' '.join(idx2word[w] for w in Y_test[i])[:maxlenh]` should be `print 'HEAD:',' '.join(idx2word[w] for w in Y_test[i][:maxlenh])`. Do you think so? I don't have permission to push my branch to this repository, so I'm just telling you here.
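The difference matters because the original slices the joined string to maxlenh *characters*, while the fix slices the word list to maxlenh *words* before joining (toy vocabulary below, Python 3 syntax for the snippet):

```python
idx2word = {0: 'stocks', 1: 'rally', 2: 'on', 3: 'fed', 4: 'news'}
Y = [0, 1, 2, 3, 4]
maxlenh = 3

chars = ' '.join(idx2word[w] for w in Y)[:maxlenh]   # buggy: first 3 chars
words = ' '.join(idx2word[w] for w in Y[:maxlenh])   # fixed: first 3 words

# chars == 'sto'
# words == 'stocks rally on'
```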
Looks like a bug :-) Usually on GitHub you fix a bug by forking the project, pushing your fix to your own fork, and then opening a pull request (PR).
Hello @udibr, I was wondering how I could optimize the speed of predictions, because I don't have a GPU to run them. I am trying to understand gensamples and beamsearch: does a higher k tend to provide better prediction results? And what is the skips parameter for? Thanks!