Open lingxitong opened 1 week ago
Hi @lingxitong - We simply fit a logistic regression model via L-BFGS on the entire train set, using an adaptable cost (w.r.t. the embedding dimension size). We do not divide the train set into train / validation sets (no hyper-parameter tuning or model selection).
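For readers who want to reproduce this protocol on their own data, here is a minimal sketch using scikit-learn. It is not the authors' code. The helper names (`linear_probe`, `make_toy_embeddings`), the choice of balanced accuracy as the metric, and in particular the cost rule `dim * n_classes / 100` are assumptions made for illustration. The reply only says the cost adapts to the embedding dimension and does not give a formula.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import balanced_accuracy_score


def linear_probe(train_x, train_y, test_x, test_y, max_iter=1000):
    """Fit logistic regression with L-BFGS on the full train set, score once on test."""
    dim = train_x.shape[1]
    n_classes = len(np.unique(train_y))
    # ASSUMPTION: hypothetical cost rule that grows with the embedding dimension.
    # scikit-learn's C is the inverse of the L2 regularization strength.
    cost = dim * n_classes / 100.0
    clf = LogisticRegression(solver="lbfgs", C=cost, max_iter=max_iter)
    clf.fit(train_x, train_y)  # no validation split, no model selection
    score = balanced_accuracy_score(test_y, clf.predict(test_x))
    return clf, score


def make_toy_embeddings(n, centers, rng):
    """Hypothetical stand-in for pre-extracted embeddings: one Gaussian blob per class."""
    labels = rng.integers(0, centers.shape[0], size=n)
    feats = centers[labels] + rng.normal(size=(n, centers.shape[1]))
    return feats, labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    centers = rng.normal(size=(3, 16)) * 5.0
    train_x, train_y = make_toy_embeddings(300, centers, rng)
    test_x, test_y = make_toy_embeddings(100, centers, rng)
    _, score = linear_probe(train_x, train_y, test_x, test_y)
    print(f"balanced accuracy on test: {score:.3f}")
```

With a real dataset such as CRC-100K, `train_x` and `test_x` would be the embeddings extracted from the official train and test splits, and the test set is evaluated exactly once.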
Nice work! I have some questions about the dataset split for the downstream tasks. For a dataset with a train-test split (e.g., CRC-100K), my understanding is that you train for one epoch on the training set, then test once on the test set, and finally report the best result. Alternatively, you further divide the train split into training and validation sets, then use the validation set to select the best model before testing on the test set. Could you tell me which of these you use? I will process my dataset the same way. Thanks!