Closed narayanacharya6 closed 1 year ago
Hi there.
Yes, when we made SciRepEval, we used the FoS training data in the train split and the gold manual eval in the eval split. Due to some internal chaos at the time of training the original FoS model (and a lack of a publication for it), I am 95% (not 100%) sure the training data is the same.
We are working on a better transformer-based FoS model and the training set there will be silver data labeled by various GPTs. Hopefully that will be released later this year.
Thanks for clarifying! Looking forward to the new models and dataset :)
I noticed that the model published with the repo outputs the same labels as the labels in the `fos` subset of the SciRepEval dataset published here. Can someone comment on whether the model was trained using some version (same/subset/superset?) of this dataset? There is another issue that asked about the training data for the model that was closed. This question is only out of curiosity, so feel free to close this issue too if the training data or details on how it was curated cannot be made public yet.
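For anyone wanting to check this themselves, the same/subset/superset question can be answered by comparing the model's output label inventory against the labels present in the `fos` subset. A minimal sketch; the label sets below are illustrative placeholders, not the actual FoS taxonomy:

```python
# Classify the relationship between two label inventories (e.g. a model's
# output labels vs. the labels found in a dataset subset).
# The example sets are placeholders, not the real FoS labels.

def compare_label_sets(a: set, b: set) -> str:
    """Return 'same', 'subset', 'superset', 'overlapping', or 'disjoint'."""
    if a == b:
        return "same"
    if a < b:
        return "subset"
    if a > b:
        return "superset"
    return "overlapping" if a & b else "disjoint"

model_labels = {"Biology", "Computer Science", "Economics", "Physics"}
dataset_labels = {"Biology", "Computer Science", "Economics"}

print(compare_label_sets(model_labels, dataset_labels))  # prints "superset"
```

In practice you would populate the two sets from the model's `id2label` mapping and from the distinct label values in the dataset split (both names are assumptions about how the artifacts expose their labels).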