princeton-nlp / PURE

[NAACL 2021] A Frustratingly Easy Approach for Entity and Relation Extraction https://arxiv.org/abs/2010.12812
MIT License

tool to create training data #22

Closed AlanQuille closed 3 years ago

AlanQuille commented 3 years ago

Hello, I was wondering, how do you create the custom training data in JSONL format? Do you do it manually or do you have access to a tool? Thank you very much.

a3616001 commented 3 years ago

Hi Alan,

For ACE04/ACE05, we use DYGIE's scripts to convert the raw data into JSONL format. For SciERC, the processed JSONL file can be downloaded from their website.

For other datasets, it should be easy to write a Python script that dumps the dataset into a JSONL file, one JSON document per line (e.g., using json.dumps on each document).

Best, Zexuan
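
The suggestion above to write a small Python script could look roughly like the sketch below. It is only an illustration: the field names (`doc_key`, `sentences`, `ner`, `relations`) and the span layout are assumed to follow the DYGIE-style format, and the toy document, labels, and helper names `write_jsonl` / `read_jsonl` are hypothetical. Check the repository README for the exact schema before relying on it.

```python
import json
import os
import tempfile

# Hypothetical toy document. Assumed layout (verify against the README):
#   sentences: list of tokenized sentences
#   ner:       per sentence, a list of [start_token, end_token, label]
#   relations: per sentence, a list of [start1, end1, start2, end2, label]
docs = [
    {
        "doc_key": "example_doc_1",
        "sentences": [["Alice", "works", "at", "Acme", "."]],
        "ner": [[[0, 0, "PER"], [3, 3, "ORG"]]],
        "relations": [[[0, 0, 3, 3, "ORG-AFF"]]],
    }
]


def write_jsonl(documents, path):
    """Write one JSON document per line (the JSONL convention)."""
    with open(path, "w", encoding="utf-8") as f:
        for doc in documents:
            # json.dumps without indent never emits a newline,
            # so each document stays on a single line.
            f.write(json.dumps(doc, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Read a JSONL file back into a list of dicts, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


if __name__ == "__main__":
    # Round-trip check in a temporary directory.
    with tempfile.TemporaryDirectory() as tmp_dir:
        out_path = os.path.join(tmp_dir, "train.json")
        write_jsonl(docs, out_path)
        print(read_jsonl(out_path) == docs)
```

Writing each document with `json.dumps` plus a newline, rather than calling `json.dump` once on the whole list, is what makes the file JSONL instead of a single JSON array.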

AlanQuille commented 3 years ago

Thank you very much for your prompt reply.