Initially, the "top-10" accuracy rises during the first epoch from initialization, but then declines, even as the loss decreases smoothly.
The top-10 accuracy I have implemented is very generous: if any of the 10 keyword vectors closest to a protein is within its annotated set, that protein counts as "correct". The metric is then total_correct / total_proteins.
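A minimal sketch of this metric (the function and variable names are my own, and I'm assuming cosine similarity over dense protein/keyword embeddings):

```python
import numpy as np

def top10_accuracy(protein_embs, keyword_embs, annotations):
    """Generous top-10 accuracy: a protein counts as correct if ANY of
    the 10 keyword vectors closest to it is in its annotated set.

    protein_embs: (n_proteins, d) array
    keyword_embs: (n_keywords, d) array
    annotations:  list of sets of keyword indices, one per protein
    """
    # Cosine similarity between every protein and every keyword.
    p = protein_embs / np.linalg.norm(protein_embs, axis=1, keepdims=True)
    k = keyword_embs / np.linalg.norm(keyword_embs, axis=1, keepdims=True)
    sims = p @ k.T  # (n_proteins, n_keywords)

    # Indices of the 10 most similar keywords for each protein.
    top10 = np.argsort(-sims, axis=1)[:, :10]

    correct = sum(
        1 for i, annotated in enumerate(annotations)
        if annotated & set(top10[i])  # any overlap counts as a hit
    )
    return correct / len(annotations)
```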
Several things to work on:
Average similarity between proteins and their keyword vectors seems to be going up, as expected, even as accuracy goes down.
Calculate a precision-at-0.5-recall measure to see whether it is a better metric. (If this also goes down, I need to think about why top-10 accuracy isn't a good metric.)
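One way this could be computed (a sketch; pooling all protein–keyword similarity scores and treating annotated pairs as positives is my assumption about the setup):

```python
import numpy as np

def precision_at_recall(scores, labels, recall_target=0.5):
    """Precision at the first score cutoff whose recall reaches the target.

    scores: 1-D array of similarity scores for (protein, keyword) pairs
    labels: 1-D 0/1 array, 1 if the keyword is annotated to that protein
    """
    order = np.argsort(-np.asarray(scores))  # descending by score
    labels = np.asarray(labels)[order]
    tp = np.cumsum(labels)       # true positives at each cutoff
    fp = np.cumsum(1 - labels)   # false positives at each cutoff
    recall = tp / labels.sum()
    precision = tp / (tp + fp)
    # Recall is non-decreasing, so find the first cutoff reaching the target.
    idx = np.searchsorted(recall, recall_target)
    return precision[idx]
```

scikit-learn's `precision_recall_curve` would give the same curve; this just reads off one point of it.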
Batch-size search? Maybe a smaller batch size would work better for some reason; I'm currently using the maximum batch size the GPU allows (without switching to mixed-precision tensors to save memory). But the CLIP paper uses the largest batch size they can fit, so I suspect this is not the problem.