openvax / mhctools

Python interface to running command-line and web-based MHC binding predictors
Apache License 2.0

NetMHCIIpan v3.0 support #54

Closed: jfeala closed this issue 7 years ago

jfeala commented 8 years ago

NetMHCIIpan v3.0 has several differences in its interface that require a separate class and output parser:

Thanks! Example inputs and outputs below.

Input fasta:

>seq1
MSNRDFLEGVSGATWVDLVLEG
>seq2
ECDTINCERYNGQVCGGPGRGL

Output (length=15)

# NetMHCIIpan version 3.0

# Input is in FASTA format

# Peptide length 15

# Threshold for Strong binding peptides (IC50)  50.000 nM
# Threshold for Weak binding peptides (IC50)    500.000 nM

# Threshold for Strong binding peptides (%Rank) 0.5%
# Threshold for Weak binding peptides (%Rank)   2%

# Allele: DRB1*01:01
-------------------------------------------------------------------------------------------------------------------------------------------
   pos                 Allele           peptide    Identity  Pos         Core 1-log50k(aff)  Affinity(nM)  %Rank   BindingLevel
-------------------------------------------------------------------------------------------------------------------------------------------
     0             DRB1*01:01   MSNRDFLEGVSGATW    Sequence    5    FLEGVSGAT         0.501        220.80  50.00  <=WB
     1             DRB1*01:01   SNRDFLEGVSGATWV    Sequence    4    FLEGVSGAT         0.566        109.04  32.00  <=WB
     2             DRB1*01:01   NRDFLEGVSGATWVD    Sequence    3    FLEGVSGAT         0.579         95.16  32.00  <=WB
     3             DRB1*01:01   RDFLEGVSGATWVDL    Sequence    2    FLEGVSGAT         0.562        114.92  32.00  <=WB
     4             DRB1*01:01   DFLEGVSGATWVDLV    Sequence    1    FLEGVSGAT         0.485        262.95  50.00  <=WB
     5             DRB1*01:01   FLEGVSGATWVDLVL    Sequence    0    FLEGVSGAT         0.447        395.84  50.00  <=WB
     6             DRB1*01:01   LEGVSGATWVDLVLE    Sequence    0    LEGVSGATW         0.315       1648.11  50.00
     7             DRB1*01:01   EGVSGATWVDLVLEG    Sequence    2    VSGATWVDL         0.251       3296.68  50.00
-------------------------------------------------------------------------------------------------------------------------------------------
Number of strong binders: 0 Number of weak binders: 6
-------------------------------------------------------------------------------------------------------------------------------------------

# Allele: DRB1*01:01
-------------------------------------------------------------------------------------------------------------------------------------------
   pos                 Allele           peptide    Identity  Pos         Core 1-log50k(aff)  Affinity(nM)  %Rank   BindingLevel
-------------------------------------------------------------------------------------------------------------------------------------------
     0             DRB1*01:01   ECDTINCERYNGQVC    Sequence    4    INCERYNGQ         0.217       4774.39  50.00
     1             DRB1*01:01   CDTINCERYNGQVCG    Sequence    3    INCERYNGQ         0.228       4261.79  50.00
     2             DRB1*01:01   DTINCERYNGQVCGG    Sequence    4    CERYNGQVC         0.228       4247.85  50.00
     3             DRB1*01:01   TINCERYNGQVCGGP    Sequence    3    CERYNGQVC         0.238       3806.87  50.00
     4             DRB1*01:01   INCERYNGQVCGGPG    Sequence    2    CERYNGQVC         0.228       4256.68  50.00
     5             DRB1*01:01   NCERYNGQVCGGPGR    Sequence    4    YNGQVCGGP         0.216       4829.72  50.00
     6             DRB1*01:01   CERYNGQVCGGPGRG    Sequence    3    YNGQVCGGP         0.215       4883.42  50.00
     7             DRB1*01:01   ERYNGQVCGGPGRGL    Sequence    6    VCGGPGRGL         0.275       2558.96  50.00
-------------------------------------------------------------------------------------------------------------------------------------------
Number of strong binders: 0 Number of weak binders: 0
-------------------------------------------------------------------------------------------------------------------------------------------
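
For reference, the table layout above could be parsed along these lines. This is only a minimal sketch based on the sample output in this comment, not mhctools' actual parser: the function name `parse_netmhciipan3_output` and the dictionary keys are hypothetical, and it assumes every data row has 9 whitespace-separated fields plus an optional BindingLevel.

```python
from typing import Dict, List, Optional, Union

Value = Union[int, float, str, None]


def parse_netmhciipan3_output(text: str) -> List[Dict[str, Value]]:
    """Parse data rows of a NetMHCIIpan 3.0 table (hypothetical helper).

    Assumes the column order shown in the sample output:
    pos, Allele, peptide, Identity, Pos, Core, 1-log50k(aff),
    Affinity(nM), %Rank, and an optional BindingLevel.
    """
    rows: List[Dict[str, Value]] = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows start with an integer offset; comment lines, dashed
        # separators, the header, and the summary line do not.
        if len(fields) not in (9, 10) or not fields[0].isdigit():
            continue
        level: Optional[str] = None
        if len(fields) == 10:
            # "<=WB" / "<=SB" become "WB" / "SB"
            level = fields[9].lstrip("<=")
        rows.append({
            "offset": int(fields[0]),
            "allele": fields[1],
            "peptide": fields[2],
            "identity": fields[3],
            "core_offset": int(fields[4]),
            "core": fields[5],
            "score": float(fields[6]),
            "affinity_nm": float(fields[7]),
            "percent_rank": float(fields[8]),
            "binding_level": level,
        })
    return rows


# Two rows copied from the output above, plus lines that must be skipped.
SAMPLE = """\
# Allele: DRB1*01:01
   pos                 Allele           peptide    Identity  Pos         Core 1-log50k(aff)  Affinity(nM)  %Rank   BindingLevel
     0             DRB1*01:01   MSNRDFLEGVSGATW    Sequence    5    FLEGVSGAT         0.501        220.80  50.00  <=WB
     6             DRB1*01:01   LEGVSGATWVDLVLE    Sequence    0    LEGVSGATW         0.315       1648.11  50.00
Number of strong binders: 0 Number of weak binders: 6
"""

rows = parse_netmhciipan3_output(SAMPLE)
```

Note that the Identity column reads "Sequence" for both FASTA entries in the sample, so a parser like this cannot tell seq1 rows from seq2 rows by that column alone.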

iskandr commented 8 years ago

Hey, @tavinathanson, I still haven't gotten to this. Any chance you want to try your hand at a fix?

iskandr commented 8 years ago

A confusing wrinkle: the version of NetMHCIIpan that we're using claims to be 3.0, yet it supports the -list command.

$ netMHCIIpan -h

Usage: ./NetMHCIIpan-3.0.pl [-h] [args] -f [fastafile/peptidefile]
Command line options:

$ netMHCIIpan -list | head
DRB1_0101
DRB1_0102
DRB1_0103
DRB1_0104
DRB1_0105
DRB1_0106
DRB1_0107
DRB1_0108
DRB1_0109
DRB1_0110
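
Note that -list prints names like DRB1_0101, while the prediction table earlier in this thread prints DRB1*01:01. A minimal sketch of converting one form to the other, assuming only the DRB pattern visible in this thread (the helper name `list_name_to_table_name` is hypothetical, and names that do not match are returned unchanged):

```python
import re


def list_name_to_table_name(name: str) -> str:
    """Convert a -list style name (DRB1_0101) to table style (DRB1*01:01).

    Only the DRB<digit>_<4 digits> pattern seen in this thread is handled;
    this is an assumption, not a documented NetMHCIIpan naming rule.
    """
    match = re.fullmatch(r"(DRB\d)_(\d{2})(\d{2})", name)
    if match is None:
        return name  # unknown format: leave untouched
    gene, group, protein = match.groups()
    return f"{gene}*{group}:{protein}"


converted = [list_name_to_table_name(n) for n in ("DRB1_0101", "DRB1_0110")]
```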

tavinathanson commented 8 years ago

Another wrinkle: NetMHCIIpan 3.1 also supports -list, which is strange.

iskandr commented 8 years ago

@jfeala, what happens when you run netMHCIIpan -h?

Lucianabarros commented 7 years ago

Hello,

I'm running netMHCpan, but it takes too long. Is there a way to use threads, or some other way to parallelize the analysis?

Thank you, Luciana