Closed by zhan8855 2 months ago
I fixed this by replacing the loop over the unsorted base_path.iterdir() with a loop over its sorted output:
for i, x in enumerate(sorted(base_path.iterdir())):
To be honest, I am not sure I follow which issue your fix resolves.
In any case, you can also try playing with the LR and batch size. I found these tasks to be quite finicky and sometimes sensitive to those parameters, with some runs leading to NaN values, which could explain the low test loss.
The issue is that the genomics benchmark datasets have sub-directories "positive" and "negative" for both the "train" and "test" splits, and base_path.iterdir() does not guarantee any particular order. It can return ["negative", "positive"] when loading the "train" split but ["positive", "negative"] when loading the "test" split. In that case, the test labels are not aligned with the train and eval labels.
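To illustrate, here is a minimal sketch of the idea, not the repository's actual loading code. The helper name label_map and the temporary directory layout are assumptions made up for this example. It only shows that sorting the sub-directories before enumerating them gives the same name-to-label mapping for every split, regardless of the order in which the directories were created or are listed.

```python
import tempfile
from pathlib import Path


def label_map(base_path: Path) -> dict[str, int]:
    """Hypothetical helper: map each class sub-directory name to an integer label.

    Sorting makes the mapping independent of the order in which
    iterdir() happens to yield the entries.
    """
    class_dirs = sorted(p for p in base_path.iterdir() if p.is_dir())
    return {p.name: i for i, p in enumerate(class_dirs)}


with tempfile.TemporaryDirectory() as root:
    # Assumed layout: <root>/<split>/<class>, created in a different order per split.
    layout = {
        "train": ["negative", "positive"],
        "test": ["positive", "negative"],
    }
    for split, class_names in layout.items():
        for name in class_names:
            (Path(root) / split / name).mkdir(parents=True)

    train_labels = label_map(Path(root) / "train")
    test_labels = label_map(Path(root) / "test")

# Both splits now agree on which class gets which label.
print(train_labels, test_labels)
```

Without the sorted() call, the mapping depends on whatever order the filesystem reports, which is why it can differ between splits or between machines.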
I see. Nice find. I wonder why I didn't hit this issue.
Hi, thank you for your awesome work!
I would greatly appreciate it if you could help me check the following command and logs from my attempt to reproduce the experiments on dummy_mouse_enhancers_ensembl. The validation results are fine. However, I am getting an unexpectedly low test accuracy.
Thank you very much in advance!
Command
Log