taylor-lab / neoantigen-dev

neoantigen prediction from WES/WGS

Loss of HLA/Peptide Pairs #14

Open arichards2564 opened 1 year ago

arichards2564 commented 1 year ago

Roughly 2-3% of peptides are missing one or more HLA alleles in the final table. I believe this is due to the lines linked below. Each peptide can have a different interaction core (icore) depending on the HLA allele. I believe this section finds the best binder for a given peptide, but because the annotation is "best_binder_for_icore_group", any peptide/HLA pair whose icore differs from that of the "best_binder_for_icore_group" row is lost. I suggest combining the tables in a different way to avoid losing rows, and adding an extra column so that we have both a "best_binder_for_icore_group" and a "best_binder_for_peptide". The examples that are currently being lost would then have 1 TRUE in "best_binder_for_peptide" but 2 TRUEs in "best_binder_for_icore_group", because there are 2 different icores for that peptide.

https://github.com/taylor-lab/neoantigen-dev/blob/dd12d67c40c6927c075f6d5eed8a2e7df5af37f7/neoantigen.py#L419-L427
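
To illustrate the suggestion, here is a minimal sketch of flagging best binders without dropping any rows. It is not the code in neoantigen.py. The field names `peptide`, `hla_allele`, `icore`, and `rank`, the function name `flag_best_binders`, the example peptides, and the convention that a lower rank means a stronger binder are all assumptions made for illustration. It also groups by icore alone, whereas the real pipeline may group by sample and other keys as well.

```python
def flag_best_binders(rows):
    """Return copies of `rows` with two boolean flags added.

    Every input row is kept; the flags mark the lowest-rank row within
    each icore group and within each peptide (hypothetical field names).
    """
    best_by_icore = {}    # icore   -> index of the lowest-rank row seen so far
    best_by_peptide = {}  # peptide -> index of the lowest-rank row seen so far
    for i, row in enumerate(rows):
        for best, key in ((best_by_icore, row["icore"]),
                          (best_by_peptide, row["peptide"])):
            # Strict "<" means the first row wins on ties.
            if key not in best or row["rank"] < rows[best[key]]["rank"]:
                best[key] = i

    icore_winners = set(best_by_icore.values())
    peptide_winners = set(best_by_peptide.values())
    return [
        dict(
            row,
            best_binder_for_icore_group=(i in icore_winners),
            best_binder_for_peptide=(i in peptide_winners),
        )
        for i, row in enumerate(rows)
    ]


# Made-up example: the first peptide has a different icore per allele,
# the second peptide has the same icore for both alleles.
rows = [
    {"peptide": "AAAWYLWEV", "hla_allele": "HLA-A*02:01", "icore": "AAAWYLWEV", "rank": 0.1},
    {"peptide": "AAAWYLWEV", "hla_allele": "HLA-B*07:02", "icore": "AAWYLWEV", "rank": 2.5},
    {"peptide": "KLMNPQRST", "hla_allele": "HLA-A*02:01", "icore": "KLMNPQRST", "rank": 0.5},
    {"peptide": "KLMNPQRST", "hla_allele": "HLA-B*07:02", "icore": "KLMNPQRST", "rank": 1.2},
]
flagged = flag_best_binders(rows)
for row in flagged:
    print(row["peptide"], row["hla_allele"],
          row["best_binder_for_icore_group"], row["best_binder_for_peptide"])
```

In this toy input, the first peptide keeps both of its peptide/HLA rows and ends up with 2 TRUEs in "best_binder_for_icore_group" but only 1 TRUE in "best_binder_for_peptide", which is the situation described above. Filtering on either flag can then happen downstream, after the full table has been written.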