Closed by nrafaili 7 months ago
Hi!
We set all fitness values to "1.0" because the fitness variable is not used for the ClinVar dataset. We adopt Spearman's ρ as the evaluation metric for ProteinGym and global AUC for ClinVar, so for ClinVar we only have to record the evolutionary index of each mutation for the AUC calculation. The fitness column is therefore just filled with an arbitrary default value, i.e. 1.0.
You can calculate the AUC value with the commands below:
# Evaluate the zero-shot performance of SaProt on the ClinVar benchmark
python scripts/mutation_zeroshot.py -c config/ClinVar/saprot.yaml
python scripts/compute_clinvar_auc.py -c config/ClinVar/saprot.yaml
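To illustrate why the fitness column can hold a placeholder, here is a minimal sketch of a rank-based AUC in plain Python. It is not the code in scripts/compute_clinvar_auc.py; the function name `auc`, the toy scores and labels, and the convention that a more negative score means a more damaging mutation are all assumptions made for the example.

```python
def auc(scores, labels):
    """Rank-based AUC: the fraction of (positive, negative) pairs in which
    the positive example has the higher score; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: one model score per mutation.
scores = [-8.2, -0.5, -1.0, -0.3, -7.1, -6.0]
# Assumed binary labels: 1 = pathogenic, 0 = benign.
labels = [1, 1, 0, 0, 1, 0]
# Placeholder fitness column, as in the ClinVar files; it is never read below.
fitness = [1.0] * len(scores)

# Assumption: lower score = more damaging, so negate before ranking.
auc_value = auc([-s for s in scores], labels)
print(f"AUC = {auc_value:.3f}")
```

Only the scores and the binary labels enter the computation, which is why a constant fitness value does no harm here. Spearman's ρ on ProteinGym, by contrast, is a rank correlation against the measured fitness values, so those have to be real.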
Awesome, thank you for the prompt response!
Hello, I downloaded the ClinVar .tar.gz file from your directory. I noticed that all of the fitness values are '1.0', whereas the ProteinGym dataset reports a range of fitness values. Is there a reason you have only kept the entries with a fitness score of '1.0'?