Open MichiOnGithub opened 5 years ago
Hi Michael,
Thanks very much for your interest in our project :D
As far as I can see from your example code, you are using the code correctly, but the absolute values are not meaningful on their own. You should use the values to derive a ranking of the summaries. In your example, the model ranks the four summaries as summ3 > summ4 > summ2 > summ1.
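For example, to turn raw rewarder scores into a ranking (the score values below are made up purely for illustration):

```python
# Illustrative scores only; the real values depend on the model and inputs.
scores = {'summ1': -1.3, 'summ2': -0.7, 'summ3': 0.9, 'summ4': 0.4}

# Sort summary names by score, highest (best) first.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)  # ['summ3', 'summ4', 'summ2', 'summ1']
```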
A bit more explanation: when we train the model, we push it to give the correct ranking over each pair. For example, suppose you have two summaries s1 and s2 for the same document, and you know from the human ratings that s1 is better than s2; during training, we push the model to give a higher score to s1 than to s2. We also tried pushing the model to reproduce the human ratings directly (i.e. a regression loss), but that yields worse performance (see the paper for details).
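In PyTorch terms, the pairwise objective looks roughly like the sketch below (a generic margin ranking loss; the exact formulation in the paper may differ):

```python
import torch
import torch.nn as nn

# Stand-in scores for one pair, where s1 is the human-preferred summary.
score_s1 = torch.tensor([0.2], requires_grad=True)  # model score for s1
score_s2 = torch.tensor([0.9], requires_grad=True)  # model score for s2

# target = 1 means "the first input should rank higher than the second".
# MarginRankingLoss computes max(0, -target * (score_s1 - score_s2) + margin).
loss_fn = nn.MarginRankingLoss(margin=0.1)
loss = loss_fn(score_s1, score_s2, torch.tensor([1.0]))

loss.backward()  # gradients push score_s1 up and score_s2 down
print(loss.item())
```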
As for using the CPU for inference, you may refer to the answer here: https://discuss.pytorch.org/t/loading-weights-for-cpu-model-while-trained-on-gpu/1032/2. Sorry about the inconvenience; we will add code to cover the CPU use case.
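In short, it comes down to passing map_location to torch.load; a minimal sketch (the model and checkpoint path below are placeholders):

```python
import torch
import torch.nn as nn

# Placeholder model standing in for the rewarder network.
model = nn.Linear(10, 1)

# map_location remaps CUDA tensors in the checkpoint onto the CPU;
# without it, torch.load tries to restore them onto a (missing) GPU.
state_dict = torch.load('path/to/gpu_trained.pt', map_location=torch.device('cpu'))
model.load_state_dict(state_dict)
model.eval()
```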
Best, Yang
Hi Yang,
Thanks for the details. I just wanted to ask if there are any updates on the regression task, i.e. reproducing the human ratings? I have a use case where I'd need to rate a summary with a normalised value r ∈ [0, 1].
Best!
Thank you for this great contribution; I'm sure it will help in developing RL summarization systems.
One thing I don't understand is how to interpret the values returned by the rewarder. I'd assume that higher scores indicate higher-quality summaries, but running a few tests, the values are not what I expected:
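A minimal sketch of the kind of test I ran, assuming the Rewarder class from rewarder.py (the model path, document, and summaries here are illustrative, not the actual inputs):

```python
from rewarder import Rewarder  # assumed entry point in this repo

# Path to the pretrained reward model; adjust to the actual checkpoint.
rewarder = Rewarder('trained_models/sample.model')

doc = "Text of the source document ..."
summaries = {
    'summ1': "First candidate summary ...",
    'summ2': "Second candidate summary ...",
    'summ3': "Third candidate summary ...",
    'summ4': "Fourth candidate summary ...",
}

# Score each candidate summary against the same document.
for name, summ in summaries.items():
    print(name, rewarder(doc, summ))
```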
Am I using it incorrectly or do I need to apply any kind of preprocessing beforehand? If this is the correct usage, is this just an unfortunate example / out of domain?
Also, when using a CPU for inference, the `torch.load` call in `rewarder.py` needs an additional argument (map_location), as it otherwise tries to load the weights onto CUDA.

Kind regards,
Michael