lizekang / ITDD

The source code of our ACL 2019 paper "Incremental Transformer with Deliberation Decoder for Document Grounded Conversations"
MIT License
86 stars 17 forks

Could you publish your code for the processed data? #3

Closed: ChuanMeng closed this issue 5 years ago

ChuanMeng commented 5 years ago

I would appreciate it if you could share the code for the processed data, thanks!

Or could you explain how you did that? For example, which NLP tool did you use, spaCy or NLTK? What do "&quot;" and "&lt; SEP &gt;" mean in the knowledge and context? I hope you can give a detailed explanation.

lizekang commented 5 years ago

We use the NLP tools in OpenNMT-py. You can see the instructions at http://opennmt.net/OpenNMT-py/extended.html. We use tools/tokenizer.perl as the tokenizer. The double quote character (") becomes "&quot;" after tokenizing, and "<SEP>" becomes "&lt; SEP &gt;". We use "<SEP>" to separate the turns of a multi-turn context.
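For readers who want to see the escaping in isolation, here is a minimal Python sketch of the special-character replacement that the Moses-style tokenizer.perl applies by default. It is not the tokenizer itself (it does no word splitting), and the names `MOSES_ESCAPES` and `escape_special_chars` are hypothetical helpers introduced only for illustration.

```python
# Hypothetical helper: mirrors only the special-character escaping step of a
# Moses-style tokenizer.perl. It does NOT split words or punctuation.
MOSES_ESCAPES = [
    ("&", "&amp;"),   # must come first so later entities are not re-escaped
    ("|", "&#124;"),
    ("<", "&lt;"),
    (">", "&gt;"),
    ("'", "&apos;"),
    ('"', "&quot;"),
    ("[", "&#91;"),
    ("]", "&#93;"),
]


def escape_special_chars(text: str) -> str:
    """Replace each special character with its entity, in a fixed order."""
    for raw, entity in MOSES_ESCAPES:
        text = text.replace(raw, entity)
    return text


if __name__ == "__main__":
    # Input is assumed to be already whitespace-tokenized.
    print(escape_special_chars('he said " hi " < SEP > ok'))
    # prints: he said &quot; hi &quot; &lt; SEP &gt; ok
```

This is why the processed files contain "&quot;" and "&lt; SEP &gt;" rather than the raw characters.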

ChuanMeng commented 5 years ago

Thank you. I understand the function of "&lt; SEP &gt;". But why do you convert the double quote (") to "&quot;"? Is it because of tokenizer.perl?

lizekang commented 5 years ago

Yes, tokenizer.perl converts the double quote (") to "&quot;".
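If you need the original characters back when reading the processed files, a rough sketch along these lines may help. It assumes each context is stored on one line with turns joined by the escaped separator "&lt; SEP &gt;"; that layout, and the names `SEP` and `split_turns`, are assumptions for illustration and not taken from the repository.

```python
import html
from typing import List

# Assumed form of the separator as it appears in the tokenized files.
SEP = "&lt; SEP &gt;"


def split_turns(context_line: str) -> List[str]:
    """Split one escaped context line into turns and undo the entity escaping."""
    turns = [turn.strip() for turn in context_line.split(SEP)]
    # html.unescape maps &quot;, &apos;, &lt;, &#124; etc. back to raw characters.
    return [html.unescape(turn) for turn in turns if turn]


if __name__ == "__main__":
    line = "hello &quot; world &quot; &lt; SEP &gt; how are you ?"
    print(split_turns(line))
    # prints: ['hello " world "', 'how are you ?']
```

Splitting on the escaped separator before unescaping avoids confusing a literal "<" in a turn with the separator.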