03/08/2022 meeting decision: For the whole-allelic experiment (#41):
Then, the following sequence features will be used: MHC:
Eventually, a binary (1/0) anchor feature can be added to both the MHC and the peptide.
*Note: ++ or -- for BLOSUM and PAM indicates the matrix value. A pam++ is a high PAM (e.g. PAM250), and a pam-- is a low PAM (e.g. PAM50).
**Reminder: A high BLOSUM is built for sequences with high identity. A low PAM is built for sequences separated by few mutation events (i.e. high identity, like a high BLOSUM). Conversely, a low BLOSUM is built for sequences with low conservation, as is a high PAM.
Labels for this issue? @DarioMarzella
I am not the best at picking label names, but I hope this is clear enough :)
23/09/2022 meeting decision:
New PSSM generation proposal:
Data: IPD/IMGT + Uniprot + MGnify (maybe?)
Database generation: HHBlits
MSA generation (per allele): HHBlits
PSSM generation (per allele): PSSMGen
PSSM mapping (per model): PSSMGen
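To make the "MSA -> PSSM" step above concrete, here is a minimal sketch of how a log-odds PSSM can be derived from an alignment. This is not how HHBlits or PSSMGen compute their matrices; the function name `pssm_from_msa`, the uniform background frequency, the pseudocount of 1 and the toy alignment are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_from_msa(msa, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column.

    Hypothetical helper: uniform background and add-one smoothing are
    simplifying assumptions, not what the real tools use.
    """
    if not msa or len({len(seq) for seq in msa}) != 1:
        raise ValueError("MSA must be non-empty with equal-length rows")
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for column in zip(*msa):
        # Count standard residues only; gaps and unknowns are skipped.
        counts = Counter(res for res in column if res in AMINO_ACIDS)
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        pssm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pssm


# Toy alignment, invented for the example (not real allele data).
toy_msa = ["GSHSMRY", "GSHSLRY", "GSHSMKY"]
pssm = pssm_from_msa(toy_msa)
```

Conserved columns get a clearly positive score for the conserved residue and negative scores for everything else; variable columns spread the positive scores over the observed residues.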
For the pilot studies (HLA-A*02:01 and HLA-DRB1*01:01) the following sequence features were used: full PSSM for the MHC, one-hot encoding for the peptide.
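For reference, a minimal sketch of the peptide one-hot encoding, with the optional 1/0 anchor flag discussed earlier appended as an extra column. The function name `one_hot_peptide`, the example peptide and the anchor positions (2 and 9) are assumptions for illustration, not the values used in the pilot studies.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_peptide(peptide, anchor_positions=None):
    """Encode a peptide as a list of 20-long one-hot rows.

    If anchor_positions (1-based) is given, a 21st column is appended:
    1 for anchor positions, 0 otherwise. Hypothetical helper.
    """
    rows = []
    for pos, aa in enumerate(peptide, start=1):
        row = [0] * len(AMINO_ACIDS)
        row[AA_INDEX[aa]] = 1  # raises KeyError on non-standard residues
        if anchor_positions is not None:
            row.append(1 if pos in anchor_positions else 0)
        rows.append(row)
    return rows


# Example 9-mer; anchors at positions 2 and 9 are an assumption here.
encoded = one_hot_peptide("GILGFVFTL", anchor_positions={2, 9})
```

Each row has exactly one 1 among the first 20 columns; the last column marks whether that position is flagged as an anchor.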