JuneFeng / RelationClassification-RL

Reinforcement Learning for Relation Classification from Noisy Data (AAAI 2018)

A doubt about the idea #11

Open mikezhang95 opened 5 years ago

mikezhang95 commented 5 years ago

With the special reward setting in this work, a better policy will select the sentences in the bag that have higher log P(r|x_i). The best result is to find the maximum one, which means selecting the single highest-scoring sentence in each bag and feeding it to train the classifier. Is that correct?
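
To make the question concrete, here is a minimal sketch. It assumes the bag reward is the mean of log P(r|x_i) over the selected sentences, which is my reading of the reward setting and is not taken from the repository code. The names `bag_reward` and `best_selection` and the probabilities are hypothetical. Under that assumption, a brute-force search over all non-empty selections picks only the single highest-scoring sentence:

```python
from itertools import combinations
import math


def bag_reward(log_probs, selected):
    """Assumed reward: mean log P(r | x_i) over the selected sentence indices."""
    return sum(log_probs[i] for i in selected) / len(selected)


def best_selection(log_probs):
    """Brute-force the non-empty subset of a bag that maximizes the assumed reward."""
    indices = range(len(log_probs))
    subsets = (
        subset
        for size in range(1, len(log_probs) + 1)
        for subset in combinations(indices, size)
    )
    return max(subsets, key=lambda s: bag_reward(log_probs, s))


# Hypothetical classifier probabilities P(r | x_i) for a bag of four sentences.
probs = [0.30, 0.85, 0.60, 0.10]
log_probs = [math.log(p) for p in probs]

best = best_selection(log_probs)
print(best)  # indices of the selected sentences
```

A mean can never exceed its largest term, so with this reward any extra sentence with a lower score only pulls the reward down. If the actual reward differs (for example a sum, or a term that depends on the retrained classifier), the conclusion may not hold.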

shanry commented 4 years ago

I have the same doubt. These days you cannot trust most published papers.