DeepRank / 3D-Vac

Personalized cancer vaccine design through 3D modelling boosted geometric learning.
Apache License 2.0

Sequence features #40

Closed DarioMarzella closed 4 months ago

DarioMarzella commented 2 years ago

For the pilot studies (HLA-A*02:01 and HLA-DRB1*01:01), the following sequence features were used: FullPSSM for the MHC and one-hot encoding for the peptide.
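As a rough illustration of the peptide one-hot encoding mentioned above, here is a minimal sketch. The residue ordering, the function name `one_hot_peptide`, and the example peptide are assumptions made for illustration; they are not taken from the 3D-Vac code.

```python
# Standard 20 amino acids; the ordering here is an assumption, any fixed order works.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot_peptide(peptide):
    """Return one 20-dimensional 0/1 vector per residue of the peptide."""
    encoding = []
    for residue in peptide.upper():
        if residue not in AA_INDEX:
            raise ValueError(f"Unexpected residue: {residue!r}")
        row = [0] * len(AMINO_ACIDS)
        row[AA_INDEX[residue]] = 1  # exactly one position is set per residue
        encoding.append(row)
    return encoding


if __name__ == "__main__":
    # Arbitrary 9-mer used only as an example input.
    features = one_hot_peptide("GILGFVFTL")
    print(len(features), len(features[0]))
```

Each residue becomes a vector with a single 1, so a 9-mer yields a 9 x 20 feature matrix.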

DarioMarzella commented 2 years ago

03/08/2022 meeting decision: For the whole-allelic experiment ( #41 ):

Then, the following sequence features will be used: MHC:

Eventually, a binary (1/0) anchor feature can be added to both the MHC and the peptide.

*Note: ++ or -- for BLOSUM and PAM indicates the matrix number. A pam++ is a high PAM (e.g. PAM250), and a pam-- is a low PAM (e.g. PAM50).

**Reminder: A high BLOSUM is suited to sequences with high identity. A low PAM is suited to sequences with few mutation events between them (i.e. high identity, like a high BLOSUM). Conversely, a low BLOSUM is suited to sequences with low conservation, as is a high PAM.
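The binary anchor feature mentioned in this comment could look something like the sketch below. The function name `anchor_feature` is hypothetical, and the anchor positions used in the example (position 2 and the last residue) are only an illustrative assumption, not a decision recorded in this issue.

```python
def anchor_feature(sequence_length, anchor_positions):
    """Return a 1/0 list marking anchor residues (positions are 1-based)."""
    anchors = set(anchor_positions)
    for pos in anchors:
        if not 1 <= pos <= sequence_length:
            raise ValueError(f"Anchor position {pos} is outside 1..{sequence_length}")
    return [1 if i in anchors else 0 for i in range(1, sequence_length + 1)]


if __name__ == "__main__":
    peptide = "GILGFVFTL"  # arbitrary example 9-mer
    # Assumed anchors for illustration only: position 2 and the C-terminal residue.
    print(anchor_feature(len(peptide), [2, len(peptide)]))
```

The same helper would apply to the MHC sequence, given whichever residue positions are considered anchors there.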

gcroci2 commented 2 years ago

Labels for this issue? @DarioMarzella

DarioMarzella commented 2 years ago

I am not the best at picking label names, but I hope this is clear enough :)

heleensev commented 2 years ago

23/09/2022 meeting decision:

New PSSM generating proposition:

Data: IPD/IMGT + UniProt + MGnify (maybe?)

Database generation: HHblits

MSA generation (per allele): HHblits

PSSM generation (per allele): PSSMGen

PSSM mapping (per model): PSSMGen
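To make the "MSA to PSSM" step above more concrete, here is a minimal sketch of deriving a log-odds PSSM from an already aligned set of sequences. This is not how PSSMGen or HHblits compute their profiles; the uniform background frequencies, the flat pseudocount, and the function name `pssm_from_msa` are all simplifying assumptions for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pssm_from_msa(aligned_seqs, pseudocount=1.0):
    """Return one {residue: log2-odds score} dict per alignment column.

    Assumes equal-length aligned sequences; gaps and unknown symbols are ignored.
    """
    if not aligned_seqs:
        raise ValueError("The alignment is empty")
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("All aligned sequences must have the same length")

    background = 1.0 / len(AMINO_ACIDS)  # uniform background (assumption)
    pssm = []
    for col in range(length):
        # Start every residue at the pseudocount to avoid log(0).
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in aligned_seqs:
            residue = seq[col].upper()
            if residue in counts:
                counts[residue] += 1
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


if __name__ == "__main__":
    toy_msa = ["ACD", "ACE", "A-D"]  # toy alignment, not real MHC data
    profile = pssm_from_msa(toy_msa)
    print(round(profile[0]["A"], 2), round(profile[0]["C"], 2))
```

Conserved residues in a column get positive scores and unobserved ones get negative scores, which is the kind of per-position signal the per-allele PSSM is meant to carry.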