Hello, and thank you so much for curating the data and making this dataset open to the public!
I checked how many devel and test sentences also appear in the training splits and wanted to share the results:
devel has 0/2033 sentences (0.00%) already included in train.
test has 0/2974 sentences (0.00%) already included in train.
devel has 1317/2033 sentences (64.78%) already included in train_synthetic.
test has 1889/2974 sentences (63.52%) already included in train_synthetic.
I think that training on train_synthetic and then evaluating on devel and test may not be an entirely fair setup, since roughly 64% of the evaluation sentences also appear in train_synthetic.
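
In case it helps anyone reproduce this kind of check, here is a minimal sketch of an overlap count. It assumes exact string matching after stripping surrounding whitespace, which may differ from the matching used for the numbers above. The function name `overlap_stats` and the toy sentences are hypothetical and only for illustration; in practice the lists would be loaded from the actual splits.

```python
def overlap_stats(eval_sentences, train_sentences):
    """Count how many evaluation sentences also occur in a training split.

    Assumption: two sentences match only if they are identical strings
    after stripping leading/trailing whitespace.
    """
    train_set = {s.strip() for s in train_sentences}  # set gives O(1) lookups
    hits = sum(1 for s in eval_sentences if s.strip() in train_set)
    total = len(eval_sentences)
    pct = 100.0 * hits / total if total else 0.0
    return hits, total, pct


# Toy data standing in for the real splits (hypothetical sentences).
train_synthetic = ["a b c", "d e f", "g h i"]
devel = ["a b c", "x y z", "g h i ", "d e f"]

hits, total, pct = overlap_stats(devel, train_synthetic)
print(f"devel has {hits}/{total} sentences ({pct:.2f}%) already included in train_synthetic.")
```

Stricter or looser matching (for example lowercasing or normalizing punctuation) would change the counts, so the matching rule is worth stating alongside any reported overlap.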