clinicalml / TabLLM

MIT License

The scores after every 30 epochs don't change at all during fine-tuning #17

Open tanayshah23 opened 10 months ago

tanayshah23 commented 10 months ago

Hi,

I am trying to fine-tune the t03b model with a 4-shot approach on the heart dataset. Checking the logs after every 30 epochs, I don't see any difference in the printed scores: the same score appears after the first 30 epochs and at the end of 5600 epochs.

{"AUC": 0.5823586744639375, "PR": 0.6555849741948994, "micro_f1": 0.5706521739130435, "macro_f1": 0.4795001253267447, "accuracy": 0.5706521739130435, "num": 184, "num_steps": -1, "score_gt": 0.3382242697736491, "score_cand": 0.37344987593267276}
....
....
....
{"AUC": 0.5823586744639375, "PR": 0.6555849741948994, "micro_f1": 0.5706521739130435, "macro_f1": 0.4795001253267447, "accuracy": 0.5706521739130435, "num": 184, "num_steps": 30, "score_gt": 0.3382242697736491, "score_cand": 0.37344987593267276}

I also tried the same thing on a different dataset and got the same results. Here's the screenshot:

[screenshot]

Can you please tell me what I am doing wrong?

stefanhgm commented 10 months ago

Hello @tanayshah23 ,

As explained in issue #16, we treat the test data as the validation set in the code, since we don't do parameter tuning. Your output shows the validation performance after every 30th epoch, and hence the test performance after every 30th epoch. I suspect that, with only 4 training examples, the model stops learning after the 30th epoch, so the test performance does not change.
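As a quick sanity check (this is not part of the TabLLM code; `metrics_changed` is a hypothetical helper), you can parse the per-checkpoint JSON score lines from the logs and confirm that the tracked metrics really are identical across evaluations, i.e. that the model has plateaued:

```python
import json

def metrics_changed(log_lines, keys=("AUC", "PR", "accuracy")):
    """Return True if any tracked metric differs between the first
    evaluation checkpoint and any later one."""
    snapshots = [json.loads(line) for line in log_lines]
    first = snapshots[0]
    return any(
        snap[k] != first[k]
        for snap in snapshots[1:]
        for k in keys
    )

# The two (identical) score dicts from the logs above, abridged:
early = '{"AUC": 0.5823586744639375, "PR": 0.6555849741948994, "accuracy": 0.5706521739130435, "num_steps": -1}'
late = '{"AUC": 0.5823586744639375, "PR": 0.6555849741948994, "accuracy": 0.5706521739130435, "num_steps": 30}'
print(metrics_changed([early, late]))  # False: scores never moved
```

If this prints `False` for every pair of checkpoints, the scores are bitwise identical and the plateau is real, not a logging artifact.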

I hope that helps!