Closed by sm354 2 years ago
Hi, sorry for the delayed response. I can update the instructions (along with uploading any additional scripts, if there were any) tomorrow (Friday).
(I'll also try to respond to #7 tomorrow.)
I was missing a couple of files. I've included them in this branch, along with some additional instructions in domain/README.md. It turns out the SemEval file format is very similar to OntoNotes (maybe a couple of column indices were a bit different?), so the preprocessing files are largely the same.
However, I wouldn't be surprised if there are several minutes of hacking needed with the minimize scripts to properly run on all the data files. Still, this should be a starting point and once I confirm this works for semeval, I'll merge the PR into main.
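To illustrate why the two formats can share most of the preprocessing, here is a minimal sketch (not the actual minimize script from this repo) of turning a CoNLL-style column file into jsonlines. It assumes documents are delimited by `#begin document` / `#end document` lines and that sentences are separated by blank lines. The function name `conll_to_jsonlines`, the `word_col` parameter, and the output keys `doc_key` and `sentences` are hypothetical. The token column index is the kind of thing that would differ between OntoNotes and SemEval, so check it against your files.

```python
import json


def conll_to_jsonlines(lines, word_col=3):
    """Group CoNLL-style rows into one JSON string per document.

    word_col is a hypothetical parameter: the zero-based index of the
    token column, which may differ between OntoNotes and SemEval files.
    """
    docs = []
    doc = None
    sentence = []
    for raw in lines:
        line = raw.strip()
        if line.startswith("#begin document"):
            # Everything after the marker is used as the document key.
            key = line[len("#begin document"):].strip()
            doc = {"doc_key": key, "sentences": []}
            sentence = []
        elif line.startswith("#end document"):
            # Flush a trailing sentence that had no blank line after it.
            if doc is not None:
                if sentence:
                    doc["sentences"].append(sentence)
                docs.append(json.dumps(doc))
            doc = None
            sentence = []
        elif not line:
            # A blank line closes the current sentence.
            if doc is not None and sentence:
                doc["sentences"].append(sentence)
                sentence = []
        elif doc is not None:
            sentence.append(line.split()[word_col])
    return docs
```

Under these assumptions, switching datasets would mostly mean passing a different `word_col` (and adjusting any other column indices the real scripts read), which matches the "largely the same" observation above.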
Thanks for adding the required steps and files for SemEval. I didn't face any issues in converting to jsonlines for ca, it, es, and nl languages.
The README.md has detailed steps on how to convert OntoNotes to jsonlines. Could you please provide the steps for SemEval as well?