npow / ubottu

Next Utterance Classification
http://arxiv.org/abs/1506.08909

How to preprocess the dataset #3

pl8787 closed 9 years ago

pl8787 commented 9 years ago

I found that the dataset was preprocessed by replacing URLs with url and some entities with path, person, ... Could you share the data preprocessing code? @ryan-lowe and @npow

npow commented 9 years ago

The code uses NLTK for NER, and can be found at ryan-lowe/Ubuntu-Dialogue-Generationv2.
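
For readers who only want a feel for the kind of substitution described above, here is a minimal sketch of the URL and path replacement step using Python's standard `re` module. It is not the code from the linked repository: the function name `replace_placeholders`, the regular expressions, and the placeholder tokens `__url__` and `__path__` are all assumptions made for illustration. The NER-based replacement of entities such as person names is left out, since it depends on NLTK and its downloaded models.

```python
import re

# Hypothetical patterns, not taken from the original preprocessing code.
# URLs: anything starting with http://, https:// or www. up to the next space.
URL_RE = re.compile(r"(?:https?://|www\.)\S+")
# Paths: a whitespace-delimited token starting with / or ~/ (Unix-style only).
PATH_RE = re.compile(r"(?<!\S)~?/[\w.\-/]+")


def replace_placeholders(text, url_token="__url__", path_token="__path__"):
    """Replace URLs and Unix-style paths with placeholder tokens.

    The token spellings are assumptions; adjust them to match the dataset.
    """
    # Replace URLs first so the path part of a URL is not matched separately.
    text = URL_RE.sub(url_token, text)
    text = PATH_RE.sub(path_token, text)
    return text


if __name__ == "__main__":
    print(replace_placeholders(
        "see http://ubuntu.com/help and edit /etc/apt/sources.list"
    ))
```

Note that this sketch treats any trailing punctuation as part of the URL or path, and it does not attempt Windows-style paths; the NLTK-based code in the repository mentioned above remains the reference for how the dataset was actually prepared.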