training & validation details

Hi, validation is done for the development models (SupCon-Hard CLEAN models trained with access to the 5-fold train/validation data). You can generate the standard cross-validation fold splits yourself, or email us for the original data splits. The validation step runs the two EC-calling methods and compares the predicted EC numbers with the ground truth.
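If you want to build the splits yourself, here is a minimal sketch of a 5-fold split plus a simple comparison of predicted and true EC numbers. It is an illustration under assumptions, not the code from the repository: the function names `five_fold_splits` and `precision_recall_f1`, the fixed seed, and the micro-averaged scoring are all hypothetical choices, and the averaging used for the metrics in the paper may differ.

```python
import random


def five_fold_splits(ids, k=5, seed=0):
    """Shuffle the ids once, then yield (train_ids, valid_ids) for each of k folds."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    # Fold i takes every k-th id starting at offset i, so fold sizes differ by at most 1.
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        valid = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, valid


def precision_recall_f1(pred, truth):
    """Micro-averaged scores; pred and truth map protein id -> set of EC numbers."""
    tp = sum(len(pred.get(p, set()) & ecs) for p, ecs in truth.items())
    n_pred = sum(len(pred.get(p, set())) for p in truth)
    n_true = sum(len(ecs) for ecs in truth.values())
    precision = tp / n_pred if n_pred else 0.0
    recall = tp / n_true if n_true else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

For each fold you would train on `train_ids`, run an EC-calling method on `valid_ids`, and pass the predictions and labels to `precision_recall_f1`. These random splits will not reproduce our original ones, so email us if you need an exact match.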

We chose the epoch numbers based on how the model performs after k epochs. Because our computing resources are limited, those numbers should be taken as a reference only; it is very likely that running more epochs will give a slight improvement over the metrics reported in our paper. Please note that the SupCon-Hard loss receives over a dozen examples per forward pass, whereas the Triplet loss receives just three, so SupCon-Hard is much slower per epoch. In fact, we only trained SupCon-Hard CLEAN models with a 70% split, but a 100% split version should be doable if you have an A100 GPU for several days.
