Closed jjerphan closed 6 years ago
We need to evaluate the final metric of the matching, that is, the fraction of proteins that receive a good prediction.
A prediction for a protein is a set of 10 ligands that are predicted to have a high chance of binding with it. A good prediction is one that contains the actual binding ligand.
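To make the metric concrete, here is a minimal sketch of how it could be computed. The function name `top_k_success_rate`, the dictionary layout, and the protein and ligand identifiers are assumptions made for illustration; they do not come from the project's code.

```python
from typing import Dict, Hashable, Sequence


def top_k_success_rate(
    predictions: Dict[Hashable, Sequence[Hashable]],
    true_ligand: Dict[Hashable, Hashable],
    k: int = 10,
) -> float:
    """Fraction of proteins whose true ligand is among their first k predictions.

    `predictions` maps a protein to its list of predicted ligands and
    `true_ligand` maps a protein to its actual binding ligand
    (both layouts are hypothetical).
    """
    if not true_ligand:
        raise ValueError("true_ligand must not be empty")

    hits = 0
    for protein, ligand in true_ligand.items():
        # A protein with no prediction counts as a miss.
        candidates = list(predictions.get(protein, []))[:k]
        if ligand in candidates:
            hits += 1
    return hits / len(true_ligand)


# Toy data: P1 and P3 have their true ligand in the list, P2 does not.
predictions = {"P1": ["L1", "L7", "L3"], "P2": ["L4", "L5"], "P3": ["L3", "L9"]}
true_ligand = {"P1": "L1", "P2": "L2", "P3": "L3"}
score = top_k_success_rate(predictions, true_ligand)
```

On this toy data two proteins out of three get a good prediction, so `score` is 2/3.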
Some more advanced matching strategies are possible, but we may not have time to develop them. Closing this for now.
As we have to submit a list of ten candidate binding ligands for each protein, we need to find a way to match them. Several strategies can be used; this issue tracks the design of such strategies.
The first approach would be to return, for each protein, the 10 ligands with the highest probability. However, we know that there is an extra constraint: there is a one-to-one correspondence between proteins and ligands. Hence, we should make decisions for ligands globally and not per protein, as we could otherwise choose the same ligand for several different proteins with high confidence.
If we are given `n_p` proteins and `n_l` ligands to test:
- for each protein, evaluate the `n_l` ligands and take the 10 best ones;
- evaluate the `n_p * n_l` systems and then keep, for each ligand that is chosen several times, the associated protein of highest confidence.
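The two options above can be sketched as follows. This is only an illustration under stated assumptions: the nested dictionary `scores[protein][ligand]` holding a binding confidence, and the function names `top_k_per_protein` and `resolve_shared_ligands`, are hypothetical and not taken from the project.

```python
from collections import defaultdict
from typing import Dict, List

Scores = Dict[str, Dict[str, float]]  # hypothetical layout: scores[protein][ligand]


def top_k_per_protein(scores: Scores, k: int = 10) -> Dict[str, List[str]]:
    """Option 1: for each protein, keep the k ligands with the highest score."""
    return {
        protein: sorted(ligand_scores, key=ligand_scores.get, reverse=True)[:k]
        for protein, ligand_scores in scores.items()
    }


def resolve_shared_ligands(scores: Scores, k: int = 10) -> Dict[str, List[str]]:
    """Option 2: start from the per-protein top k, then keep each ligand that
    was chosen several times only for the protein of highest confidence."""
    selection = top_k_per_protein(scores, k)

    # Record which proteins selected each ligand.
    claimed_by = defaultdict(list)
    for protein, ligands in selection.items():
        for ligand in ligands:
            claimed_by[ligand].append(protein)

    resolved: Dict[str, List[str]] = {protein: [] for protein in scores}
    for protein, ligands in selection.items():
        for ligand in ligands:
            # Ties go to the first protein that claimed the ligand.
            best = max(claimed_by[ligand], key=lambda p: scores[p][ligand])
            if best == protein:
                resolved[protein].append(ligand)
    return resolved


# Toy example with k=2: both proteins select L1 and L2.
scores = {
    "P1": {"L1": 0.9, "L2": 0.8, "L3": 0.1},
    "P2": {"L1": 0.7, "L2": 0.85, "L3": 0.2},
}
naive = top_k_per_protein(scores, k=2)
resolved = resolve_shared_ligands(scores, k=2)
```

In the toy example, `L1` stays with `P1` and `L2` stays with `P2`. Note that after resolving conflicts a protein can end up with fewer than 10 ligands; how to refill its list is left open here.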