Open SEONHOK opened 6 months ago
Hi @SEONHOK, in fact I don't see any test labels on Hugging Face either. Were you able to figure this out?
@SEONHOK @sborse3 Thank you for your interest in our work!
The results in the paper are from the test set. However, this test set differs from the test split of the original dataset on Hugging Face. We partition the dataset as follows:
For small datasets (n_samples < 10K), we split the validation set in half, using one half as the test set and the other half as the validation set. For larger datasets (n_samples > 10K), we take 1K examples from the training set as the validation set and keep the rest as the training set, using the original validation set as the test set. You can find the specific implementation in the get function within the SoRA/src/processor.py file (Lines 87-106).
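For readers who want to reproduce the split without opening the repository, the partitioning described above can be sketched roughly as follows. This is a minimal illustration, not the code from SoRA/src/processor.py: the function name `partition_splits`, its parameters, the seeded shuffle, the use of the training set size as n_samples, and the handling of exactly 10K samples (treated here as "large") are all assumptions.

```python
import random


def partition_splits(train, validation, threshold=10_000, val_size=1_000, seed=0):
    """Return (train, validation, test) lists following the scheme above.

    Hypothetical helper: names, defaults, and the shuffle are assumptions,
    not taken from the SoRA repository.
    """
    rng = random.Random(seed)

    if len(train) < threshold:
        # Small dataset: split the original validation set in half.
        # One half becomes the new validation set, the other the test set.
        shuffled = list(validation)
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        return list(train), shuffled[:half], shuffled[half:]

    # Larger dataset: hold out val_size training examples for validation
    # and reuse the original (labeled) validation set as the test set.
    shuffled = list(train)
    rng.shuffle(shuffled)
    return shuffled[val_size:], shuffled[:val_size], list(validation)


# Small case: 8,000 training and 1,000 validation examples.
small = partition_splits(list(range(8_000)), list(range(1_000)))
print([len(split) for split in small])

# Large case: 50,000 training and 2,000 validation examples.
large = partition_splits(list(range(50_000)), list(range(2_000)))
print([len(split) for split in large])
```

Under this scheme the unlabeled official test split is never used, which is why missing test labels (as with CoLA) do not block evaluation.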
Hi! I have a question about evaluating on CoLA with the test set. The CoLA test data does not have labels. How, then, can the trained model be evaluated on the CoLA dataset?
Thank you!