[Open] wujinting opened this issue 11 months ago
Hello, I just did a fresh download of the data and ran the Track 1 code. Because the results can vary a lot from run to run, I did this 3 times and got the following results:
1st run: Total AUROC: 0.6351, Total AUPRC: 0.4503, Random AVG: 0.5427 (this should be named "total" - will fix)
2nd run: Total AUROC: 0.6213, Total AUPRC: 0.4427, Random AVG: 0.5320
3rd run: Total AUROC: 0.6287, Total AUPRC: 0.4740, Random AVG: 0.5513
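As a rough illustration of why these numbers drift between runs, here is a minimal sketch (with synthetic labels and scores standing in for one run's test-set predictions; `one_run` and its 32% positive rate are assumptions, not the challenge code) showing how AUROC/AUPRC computed with scikit-learn change with the random seed:

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score

def one_run(seed):
    # Hypothetical stand-in for one training run: ~32% positives,
    # scores weakly correlated with the labels plus seed-dependent noise.
    rng = np.random.default_rng(seed)
    y_true = (rng.random(1000) < 0.32).astype(int)
    y_score = 0.4 * y_true + rng.random(1000)
    return roc_auc_score(y_true, y_score), average_precision_score(y_true, y_score)

for seed in (1, 2, 3):
    auroc, auprc = one_run(seed)
    print(f"run {seed}: AUROC={auroc:.4f}, AUPRC={auprc:.4f}")
```

Each "run" gives a different AUROC/AUPRC pair, which is why averaging over several runs (as above) is a fairer comparison than a single run.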
The random PR-AUC is always 0.3258. Note that we keep the best checkpoint (i.e. the default) in each run. Is there any chance you changed anything in the baseline code? Or are you missing any data? I attach the folder tree structure you should have after downloading:
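One useful sanity check here: the random-chance PR-AUC is approximately the positive-class prevalence of the test set, so if your random PR-AUC differs from the value above, your test data likely differs. A minimal sketch (the 0.3258 positive rate is taken from the reported random PR-AUC; the synthetic labels are an assumption for illustration):

```python
import numpy as np
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(42)
# Synthetic test labels with an assumed positive rate of 0.3258.
y_true = (rng.random(100_000) < 0.3258).astype(int)
# Uninformative (uniformly random) scores = random-chance classifier.
y_random = rng.random(100_000)

ap = average_precision_score(y_true, y_random)
print(f"random-chance AUPRC = {ap:.4f}, positive prevalence = {y_true.mean():.4f}")
```

The two printed numbers should be close, which is why a mismatched random PR-AUC points at a data (label distribution) difference rather than a code difference.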
Note that to get the final results of each run, you need to run "test.py" (and not look at the results printed at the end of "train.py", which correspond to the final model; we keep the best model). Let me know if you figure it out.
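The best-checkpoint logic described above can be sketched generically as follows (a toy loop with hypothetical `train_one_epoch`/`evaluate` callables standing in for the baseline's training code, not the actual train.py):

```python
import math

def train_and_keep_best(train_one_epoch, evaluate, epochs):
    """Keep the weights with the best validation score, not the
    weights from the last epoch (which a final print in train.py
    would reflect)."""
    best_score, best_state = -math.inf, None
    state = 0.0                      # toy stand-in for model weights
    for _ in range(epochs):
        state = train_one_epoch(state)
        score = evaluate(state)      # e.g. validation AUPRC
        if score > best_score:
            best_score, best_state = score, state
    return best_state, best_score

# Toy example: the validation score peaks mid-training, then degrades.
scores = iter([0.30, 0.45, 0.42, 0.38])
best_state, best_score = train_and_keep_best(
    train_one_epoch=lambda s: s + 1,
    evaluate=lambda s: next(scores),
    epochs=4,
)
print(best_state, best_score)  # 2.0 0.45 -- the epoch-2 weights, not the last epoch's
```

This is why the numbers printed at the end of training can be worse than what a separate test script reports after reloading the saved best checkpoint.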
We ran the code with the given hyper-parameters, but the scores we got were much lower than the reported baseline scores. For example, in our experiment the Track 1 PR-AUC is 0.40 and the Track 1 ROC-AUC is 0.59. The random-chance PR-AUC is also different. Is the data used for your baseline scores different from the data given in the challenge? Did you use hyper-parameters different from those in the current code?