dair-iitd / openie6

OpenIE6 system
GNU General Public License v3.0

Using a different dataset to train, validate, and test #14

Closed: Aatlantise closed this issue 2 years ago

Aatlantise commented 2 years ago

Hello,

I'd like to train, validate, and test openie6 using a different dataset.

I've already replaced the training set, data/openie4_labels, but I'm not sure how to replace the validation and test sets.

My first thought was to replace carb/data/test.txt and carb/data/dev.txt, but I noticed that they contain only NONE tags and no ARG/REL tags. I assume they are just used to trigger inference, and that validation and evaluation happen somewhere else.

Thanks in advance!

Aatlantise commented 2 years ago

It seems that carb/data/gold/test.tsv and carb/data/gold/dev.tsv need to be replaced as well. My guess is that the .txt files I mentioned above are used as input for prediction, and that the output is then compared with the .tsv files to compute the performance scores.
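
If that guess is right, a replacement .txt file and its gold .tsv file have to cover the same sentences, or the scores will be computed against sentences that were never predicted. Below is a minimal sanity-check sketch, not part of openie6 itself. It rests on assumptions that should be checked against the actual files: that each gold .tsv row is tab-separated with the sentence in the first column, that label lines in the .txt file consist only of NONE tokens, and that sentences appear verbatim in both files. All function names are hypothetical.

```python
def gold_sentences(tsv_text):
    """Distinct sentences from a gold TSV, in order of first appearance.

    Assumption: tab-separated rows with the sentence in column 0.
    """
    seen = []
    for line in tsv_text.splitlines():
        sentence = line.split("\t")[0].strip()
        if sentence and sentence not in seen:
            seen.append(sentence)
    return seen


def input_sentences(txt_text, label_tokens=("NONE",)):
    """Sentence lines from a prediction-input file.

    Assumption: any non-empty line made up only of label tokens
    (here just NONE) is a label line and not a sentence.
    """
    labels = set(label_tokens)
    sentences = []
    for line in txt_text.splitlines():
        tokens = line.split()
        if tokens and not all(tok in labels for tok in tokens):
            sentences.append(line.strip())
    return sentences


def missing_gold(txt_text, tsv_text):
    """Gold sentences that have no matching input sentence."""
    inputs = set(input_sentences(txt_text))
    return [s for s in gold_sentences(tsv_text) if s not in inputs]
```

To use it, read both files into strings (for example the replaced carb/data/dev.txt and carb/data/gold/dev.tsv) and pass them to missing_gold. An empty list means every gold sentence has an input to predict on. If the input lines carry extra tokens that the gold sentences lack, normalize them before comparing.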