ntunlp / daga

Data Augmentation with a Generation Approach for Low-resource Tagging Tasks
MIT License

How to use the linearized file generated by the trained lstm-lm? #17

Open TeTeTang opened 2 years ago

TeTeTang commented 2 years ago

Hi, I have a question about how to use the generated linearized file. I have successfully generated the linearized file using the lstm-lm model trained on CoNLL 2003. I plan to follow the post-processing steps to filter out unqualified entries and then use the line2cols.py script to convert the file back to the original two-column format. I am not clear on how to proceed after that. My intuition is to merge the generated file with the CoNLL training set and use the BiLSTM-CRF model to evaluate the data augmentation performance. Is that correct? Please advise, I really appreciate it!
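To make the merge step concrete, here is a minimal sketch of what I have in mind. It assumes both files are in the two-column "token tag" format with blank lines between sentences. The helper names (`read_two_column`, `merge_unique`, `write_two_column`) are hypothetical and not part of this repository, and dropping generated sentences that duplicate a gold sentence is my own assumption, not necessarily what the authors do.

```python
from typing import List, Tuple

Sentence = List[Tuple[str, str]]  # one sentence = list of (token, tag) pairs


def read_two_column(text: str) -> List[Sentence]:
    """Parse 'token tag' lines; a blank line ends the current sentence."""
    sentences: List[Sentence] = []
    current: Sentence = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            if current:
                sentences.append(current)
                current = []
            continue
        parts = line.split()
        # First column is the token, last column is the tag.
        current.append((parts[0], parts[-1]))
    if current:
        sentences.append(current)
    return sentences


def merge_unique(gold: List[Sentence], generated: List[Sentence]) -> List[Sentence]:
    """Append generated sentences to gold, skipping exact duplicates (assumption)."""
    seen = {tuple(s) for s in gold}
    merged = list(gold)
    for sent in generated:
        key = tuple(sent)
        if key not in seen:
            seen.add(key)
            merged.append(sent)
    return merged


def write_two_column(sentences: List[Sentence]) -> str:
    """Serialize sentences back to two-column text, blank line between sentences."""
    blocks = ["\n".join(f"{tok} {tag}" for tok, tag in s) for s in sentences]
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Tiny inline stand-ins for the gold training set and the line2cols.py output.
    gold_text = "EU B-ORG\nrejects O\n\nPeter B-PER\n"
    generated_text = "EU B-ORG\nrejects O\n\nParis B-LOC\n"
    merged = merge_unique(read_two_column(gold_text), read_two_column(generated_text))
    print(write_two_column(merged), end="")
```

The merged output would then serve as the training file for the tagger, while the dev and test sets stay untouched so the comparison against the non-augmented baseline is fair.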