NJUNLP / TOWE

Code and data for "Target-oriented Opinion Words Extraction with Target-fused Neural Sequence Labeling" (NAACL2019)
MIT License

Dataset Preprocess #7

Open XINGXIAOYU opened 4 years ago

XINGXIAOYU commented 4 years ago

Dear authors,

Could you please share the data preprocessing code with us? The data format is different from the original .xml files.

If you could share the preprocessing script, we could use your trained model for some weakly supervised work.

Thanks a lot.

XINGXIAOYU commented 4 years ago

Also, there is no CRF layer in the open-source code.

yilifzf commented 4 years ago

Hi, I don't have access to the original data-preprocessing source code or the re-implemented CRF layer right now because of the pandemic. The source code would also need some cleanup first. The preprocessing is not complicated: it consists of tokenization with NLTK and tagging of the opinion targets. The sentence ids are also provided so that each sentence can easily be matched to its counterpart in the original .xml files. I don't know whether those ids are enough for your work. Feel free to contact me by email (Chinese is fine) if you run into any specific difficulties I can help with right now.
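For readers who want to reproduce the two steps described above (tokenization with NLTK, then tagging the opinion target), here is a minimal sketch. It is not the authors' original script. It assumes SemEval-2014 style XML (`<sentence id>`, `<text>`, `<aspectTerm from to>`), a B/I/O tag scheme, and one output row per (sentence, target) pair; the names `token_spans`, `bio_tags`, `preprocess`, and `SAMPLE_XML` are hypothetical. It uses NLTK's `TreebankWordTokenizer` because that tokenizer needs no model download, and falls back to a simple regex if NLTK is not installed, so the tokenization may not match the released data exactly. The opinion-word tags are not covered here.

```python
import re
import xml.etree.ElementTree as ET

try:
    # Rule-based NLTK tokenizer; needs no extra model download.
    from nltk.tokenize import TreebankWordTokenizer

    def token_spans(text):
        """Return (start, end) character offsets of each token."""
        return list(TreebankWordTokenizer().span_tokenize(text))
except ImportError:
    # Assumption: a crude regex fallback so the sketch runs without NLTK.
    def token_spans(text):
        return [m.span() for m in re.finditer(r"\w+|[^\w\s]", text)]


def bio_tags(spans, target_start, target_end):
    """Tag tokens lying inside [target_start, target_end) with B/I, others with O."""
    tags, inside = [], False
    for start, end in spans:
        if start >= target_start and end <= target_end:
            tags.append("I" if inside else "B")
            inside = True
        else:
            tags.append("O")
            inside = False
    return tags


def preprocess(xml_string):
    """Yield one (sentence_id, tokens, target_tags) row per annotated target."""
    rows = []
    root = ET.fromstring(xml_string)
    for sentence in root.iter("sentence"):
        text = sentence.find("text").text
        spans = token_spans(text)
        tokens = [text[s:e] for s, e in spans]
        for term in sentence.iter("aspectTerm"):
            tags = bio_tags(spans, int(term.get("from")), int(term.get("to")))
            rows.append((sentence.get("id"), tokens, tags))
    return rows


# Hypothetical sample in the assumed SemEval-2014 layout.
SAMPLE_XML = """<sentences>
  <sentence id="demo-1">
    <text>The battery life is great.</text>
    <aspectTerms>
      <aspectTerm term="battery life" polarity="positive" from="4" to="16"/>
    </aspectTerms>
  </sentence>
</sentences>"""

rows = preprocess(SAMPLE_XML)
for sentence_id, tokens, tags in rows:
    print(sentence_id, list(zip(tokens, tags)))
```

Because the sentence id is carried through unchanged, each row can be matched back to the original .xml file, as suggested in the reply above. The SemEval-2015/2016 files use different element names, so the lookups in `preprocess` would need adjusting for those.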