baidu / Dialogue


Difference between test.txt and score #9

Closed YFwang1992 closed 5 years ago

YFwang1992 commented 5 years ago

Hello Mr. Zhou: I found that test.txt and the score file have different lengths for Douban. data/douban/test.txt has 10000 lines, but after downloading and unzipping the models, I found that output/douban/DAM/score has only 6656 lines. Did I miss a step?

xyzhou-puck commented 5 years ago

@luluxing3

luluxing3 commented 5 years ago

In the Douban Conversation Corpus, test.txt has 10k lines, but only 6670 remain after removing the contexts whose candidate responses are all negative or all positive. For more detail, see the paper that released the Douban Conversation Corpus, "Sequential Matching Network: A New Architecture for Multi-turn Response Selection in Retrieval-Based Chatbots". The output/douban/DAM/score file is still 14 lines short of that (6670 - 6656); to get the whole score file, set batch_size to a divisor of 6670.
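To make the batch-size advice concrete, here is a minimal sketch of the arithmetic. It assumes that evaluation only writes scores for full batches and silently drops the final incomplete one; that behaviour and the batch size of 256 are assumptions chosen because they reproduce the 6656 lines reported above, not something confirmed in this thread. The function names are hypothetical helpers, not part of the repository.

```python
from typing import List

NUM_TEST_EXAMPLES = 6670  # test lines left after filtering, per the comment above


def scored_lines(num_examples: int, batch_size: int) -> int:
    """Lines written if only full batches are scored (assumed behaviour)."""
    return (num_examples // batch_size) * batch_size


def divisors(n: int) -> List[int]:
    """All batch sizes that split n examples into full batches only."""
    return [d for d in range(1, n + 1) if n % d == 0]


# Assumed batch size of 256: 26 full batches, the remainder is dropped.
print(scored_lines(NUM_TEST_EXAMPLES, 256))  # 6656

# Any of these batch sizes would cover every example.
print(divisors(NUM_TEST_EXAMPLES))
```

Since 6670 = 2 × 5 × 23 × 29, practical choices under this assumption include 115, 145, 230, or 290, depending on available memory.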

YFwang1992 commented 5 years ago

@luluxing3 thanks a lot....