TRON-Bioinformatics / neofox

Annotation of mutated peptide sequences with published or novel potential neoantigen descriptors
GNU General Public License v3.0

NeoFox - NEOantigen Feature tOolboX


NeoFox annotates neoantigen candidate sequences with published neoantigen features.

For detailed documentation, see https://neofox.readthedocs.io

If you use NeoFox, please cite the following publication:
Franziska Lang, Pablo Riesgo-Ferreiro, Martin Löwer, Ugur Sahin, Barbara Schrörs, NeoFox: annotating neoantigen candidates with neoantigen features, Bioinformatics, Volume 37, Issue 22, 15 November 2021, Pages 4246–4247, https://doi.org/10.1093/bioinformatics/btab344

Table of Contents

1 Implemented neoantigen features
2 NeoFox requirements
3 Usage from the command line
4 Input data
5 Output data

1 Implemented neoantigen features

NeoFox covers the following neoantigen features and prediction algorithms:

| Name | Reference | DOI |
| --- | --- | --- |
| MHC I binding affinity/rank score (netMHCpan-v4.1) | Reynisson et al., 2020, Nucleic Acids Research | https://doi.org/10.1093/nar/gkaa379 |
| MHC II binding affinity/rank score (netMHCIIpan-v4.3) | Nilsson et al., 2023, Science Adv. | https://doi.org/10.1126/sciadv.adj6367 |
| MixMHCpred score v2.2 § | Bassani-Sternberg et al., 2017, PLoS Comp Bio; Gfeller, 2018, J Immunol. | https://doi.org/10.1371/journal.pcbi.1005725, https://doi.org/10.4049/jimmunol.1800914 |
| MixMHC2pred score v2.0.2 § | Racle et al., 2019, Nat. Biotech. | https://doi.org/10.1038/s41587-019-0289-6 |
| Differential Agretopicity Index (DAI) | Duan et al., 2014, JEM; Ghorani et al., 2018, Ann Oncol. | https://doi.org/10.1084/jem.20141308 |
| Self-Similarity | Bjerregaard et al., 2017, Front Immunol. | https://doi.org/10.3389/fimmu.2017.01566 |
| IEDB immunogenicity | Calis et al., 2013, PLoS Comput Biol. | https://doi.org/10.1371/journal.pcbi.1003266 |
| Neoantigen dissimilarity | Richman et al., 2019, Cell Systems | https://doi.org/10.1016/j.cels.2019.08.009 |
| PHBR-I § | Marty et al., 2017, Cell | https://doi.org/10.1016/j.cell.2017.09.050 |
| PHBR-II § | Marty Pyke et al., 2018, Cell | https://doi.org/10.1016/j.cell.2018.08.048 |
| Generator rate | Rech et al., 2018, Cancer Immunology Research | https://doi.org/10.1158/2326-6066.CIR-17-0559 |
| Recognition potential § | Łuksza et al., 2017, Nature; Balachandran et al., 2017, Nature | https://doi.org/10.1038/nature24473, https://doi.org/10.1038/nature24462 |
| Vaxrank | Rubinsteyn, 2017, Front Immunol | https://doi.org/10.3389/fimmu.2017.01807 |
| Priority score | Bjerregaard et al., 2017, Cancer Immunol Immunother. | https://doi.org/10.1007/s00262-017-2001-3 |
| PRIME § | Schmidt et al., 2021, Cell Reports Medicine | https://doi.org/10.1016/j.xcrm.2021.100194 |
| HEX § | Chiaro et al., 2021, Cancer Immunology Research | https://doi.org/10.1158/2326-6066.CIR-20-0814 |

§ currently not supported for mouse

2 NeoFox requirements

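NeoFox depends on several external tools, among them BLAST, netMHCpan, netMHCIIpan, MixMHCpred, MixMHC2pred and PRIME (see the config file in Section 3). For details, see https://neofox.readthedocs.io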

Install from PyPI:

pip install neofox

Or install from bioconda:

conda install bioconda::neofox

3 Usage from the command line

NeoFox can be used from the command line as shown below or programmatically (see https://neofox.readthedocs.io for more information).

neofox --input-file neoantigens_candidates.tsv \
    --patient-data patient_data.txt \
    --output-folder /path/to/out \
    [--output-prefix out_prefix]  \
    [--organism human|mouse]  \
    [--rank-mhci-threshold 2.0] \
    [--rank-mhcii-threshold 5.0] \
    [--num-cpus] \
    [--config] \
    [--patient-id] \
    [--with-all-neoepitopes] \
    [--verbose]
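
When NeoFox is launched from a pipeline script, it can help to assemble the call above as an argument list first. The following is a minimal sketch, not part of NeoFox: `build_neofox_command` is a hypothetical helper, it only uses flags listed in the usage above, and it assumes that `--num-cpus` takes an integer value. Nothing is executed; the command is only printed.

```python
import shlex


def build_neofox_command(input_file, patient_data, output_folder,
                         organism=None, num_cpus=None,
                         with_all_neoepitopes=False):
    """Assemble a neofox call as an argument list (hypothetical helper)."""
    cmd = [
        "neofox",
        "--input-file", input_file,
        "--patient-data", patient_data,
        "--output-folder", output_folder,
    ]
    if organism is not None:
        # The usage above lists only these two organisms.
        if organism not in ("human", "mouse"):
            raise ValueError(f"unsupported organism: {organism!r}")
        cmd += ["--organism", organism]
    if num_cpus is not None:
        # Assumption: --num-cpus expects an integer value.
        cmd += ["--num-cpus", str(num_cpus)]
    if with_all_neoepitopes:
        cmd.append("--with-all-neoepitopes")
    return cmd


cmd = build_neofox_command(
    "neoantigens_candidates.tsv", "patient_data.txt", "/path/to/out",
    organism="human", num_cpus=4,
)
# Print the shell-quoted command instead of running it.
print(shlex.join(cmd))
```

The resulting list could be passed to `subprocess.run(cmd, check=True)` once NeoFox and its dependencies are installed.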

The optional config file with the paths to the dependencies can look like this:

NEOFOX_REFERENCE_FOLDER=path/to/reference/folder
NEOFOX_BLASTP=path/to/ncbi-blast-2.10.1+/bin/blastp
NEOFOX_NETMHCPAN=path/to/netMHCpan-4.1/netMHCpan
NEOFOX_NETMHC2PAN=path/to/netMHCIIpan-4.3/netMHCIIpan
NEOFOX_MIXMHCPRED=path/to/MixMHCpred-2.2/MixMHCpred
NEOFOX_MIXMHC2PRED=path/to/MixMHC2pred-2.0.1/MixMHC2pred_unix
NEOFOX_MAKEBLASTDB=path/to/ncbi-blast-2.8.1+/bin/makeblastdb
NEOFOX_PRIME=/path/to/PRIME-2.0/PRIME
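
Before starting a long run, it may be worth checking that the configured paths actually exist. The sketch below is an illustration under assumptions, not NeoFox code: `parse_neofox_config` and `missing_paths` are hypothetical helpers, and the handling of blank lines and `#` comments is an assumption about the file format rather than documented behaviour.

```python
from pathlib import Path


def parse_neofox_config(text):
    """Parse KEY=VALUE lines into a dict (hypothetical helper)."""
    config = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comments are ignored.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {raw!r}")
        config[key.strip()] = value.strip()
    return config


def missing_paths(config):
    """Return the keys whose configured path does not exist on disk."""
    return sorted(k for k, v in config.items() if not Path(v).exists())


example = """\
NEOFOX_REFERENCE_FOLDER=path/to/reference/folder
NEOFOX_BLASTP=path/to/ncbi-blast-2.10.1+/bin/blastp
"""
config = parse_neofox_config(example)
# Lists every entry that points to a non-existent location.
print(missing_paths(config))
```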

4 Input data

4.1 Neoantigen candidates in tabular format

This is a dummy example of a table with neoantigen candidates:

gene wildTypeXmer mutatedXmer patientIdentifier rnaExpression rnaVariantAlleleFrequency dnaVariantAlleleFrequency external_annotation_1 external_annotation_2
BRCA2 AAAAAAAAAAAAALAAAAAAAAAAAAA AAAAAAAAAAAAAFAAAAAAAAAAAAA Ptx 7.942 0.85 0.34 some_value some_value
BRCA2 AAAAAAAAAAAAAMAAAAAAAAAAAAA AAAAAAAAAAAAARAAAAAAAAAAAAA Ptx 7.942 0.85 0.34 some_value some_value
BRCA2 AAAAAAAAAAAAAGAAAAAAAAAAAAA AAAAAAAAAAAAAKAAAAAAAAAAAAA Ptx 7.942 0.85 0.34 some_value some_value
BRCA2 AAAAAAAAAAAAACAAAAAAAAAAAAA AAAAAAAAAAAAAEAAAAAAAAAAAAA Ptx 7.942 0.85 0.34 some_value some_value
BRCA2 AAAAAAAAAAAAAKAAAAAAAAAAAAA AAAAAAAAAAAAACAAAAAAAAAAAAA Ptx 7.942 0.85 0.34 some_value some_value

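where gene is the gene symbol, wildTypeXmer and mutatedXmer are the wild-type and mutated amino acid sequences, patientIdentifier is the patient the candidate belongs to, rnaExpression is the RNA expression of the candidate, and rnaVariantAlleleFrequency and dnaVariantAlleleFrequency are the variant allele frequencies in RNA and DNA. The remaining columns (external_annotation_1, external_annotation_2) stand for additional user-provided annotations.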

NOTE: If rnaExpression is not provided, NeoFox estimates expression from the expression of the gene in the corresponding TCGA cohort and uses this value for the relevant features. The TCGA cohort used for imputation must be given as tumorType in the patient data (see below). If tumorType is not provided, expression is not imputed.
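
A quick sanity check of the candidate table can catch rows in which the wild-type and mutated sequences are identical or misaligned. The following sketch assumes a tab-separated file with the column names shown above; `mutated_positions` is a hypothetical helper, and the dummy 27-mer is built in code with the substitution in the centre, similar to the example rows.

```python
import csv
import io


def mutated_positions(wild_type, mutated):
    """Return the 0-based positions where two equal-length sequences differ."""
    if len(wild_type) != len(mutated):
        raise ValueError("sequences differ in length")
    return [i for i, (w, m) in enumerate(zip(wild_type, mutated)) if w != m]


# Dummy 27-mers: 13 flanking residues on each side of the substitution.
flank = "A" * 13
tsv = (
    "gene\twildTypeXmer\tmutatedXmer\tpatientIdentifier\n"
    f"BRCA2\t{flank}L{flank}\t{flank}F{flank}\tPtx\n"
)

rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))
for row in rows:
    positions = mutated_positions(row["wildTypeXmer"], row["mutatedXmer"])
    if not positions:
        print(f"{row['gene']}: wild type and mutated sequence are identical")
    else:
        print(row["gene"], positions)
```

This check covers substitutions only; candidates whose two sequences differ in length would need a separate comparison.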

4.2 Neoantigen candidates in JSON format

Besides tabular format, neoantigen candidates can be provided as a list of neoantigen models in JSON format as shown below. To simplify, only one full neoantigen model is shown:

[{
    "patientIdentifier": "Ptx",
    "gene": "BRCA2",
    "mutation": {
        "wildTypeXmer": "AAAAAAAAAAAAALAAAAAAAAAAAAA",
        "mutatedXmer": "AAAAAAAAAAAAAFAAAAAAAAAAAAA"
    }
}]
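
The two input formats carry the same information, so a tabular row can be converted into the JSON structure above. The sketch below is illustrative only: `row_to_model` is a hypothetical helper, and it maps just the four fields visible in the example, because the placement of the other columns in the model is not shown here.

```python
import json


def row_to_model(row):
    """Map one tabular candidate row to the JSON structure shown above."""
    return {
        "patientIdentifier": row["patientIdentifier"],
        "gene": row["gene"],
        "mutation": {
            "wildTypeXmer": row["wildTypeXmer"],
            "mutatedXmer": row["mutatedXmer"],
        },
    }


# One row of the dummy table, written as a dict keyed by column name.
row = {
    "gene": "BRCA2",
    "wildTypeXmer": "A" * 13 + "L" + "A" * 13,
    "mutatedXmer": "A" * 13 + "F" + "A" * 13,
    "patientIdentifier": "Ptx",
}
models = [row_to_model(row)]
print(json.dumps(models, indent=4))
```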

4.3 Patient-data format

This is a dummy example of a patient file:

identifier mhcIAlleles mhcIIAlleles tumorType
Ptx HLA-A*03:01,HLA-A*29:02,HLA-B*07:02,HLA-B*44:03,HLA-C*07:02,HLA-C*16:01 HLA-DRB1*03:01,HLA-DRB1*08:01,HLA-DQA1*03:01,HLA-DQA1*05:01,HLA-DQB1*01:01,HLA-DQB1*04:02,HLA-DPA1*01:03,HLA-DPA1*03:01,HLA-DPB1*13:01,HLA-DPB1*04:02 HNSC
Pty HLA-A*02:01,HLA-A*30:01,HLA-B*07:34,HLA-B*44:03,HLA-C*07:02,HLA-C*07:02 HLA-DRB1*04:02,HLA-DRB1*08:01,HLA-DQA1*03:01,HLA-DQA1*04:01,HLA-DQB1*03:02,HLA-DQB1*14:01,HLA-DPA1*01:03,HLA-DPA1*02:01,HLA-DPB1*02:01,HLA-DPB1*04:01 HNSC

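where identifier is the patient identifier (matching patientIdentifier in the neoantigen candidates), mhcIAlleles and mhcIIAlleles are comma-separated lists of the patient's MHC I and MHC II alleles, and tumorType is the TCGA cohort used for imputation of gene expression (see the note in Section 4.1).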
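
The patient file can be read with a few lines of Python, for example to check that every patient referenced in the candidate table has alleles defined. This is a sketch under assumptions: the file is taken to be tab-separated, `parse_patients` is a hypothetical helper, and the allele lists are shortened compared with the example above.

```python
import csv
import io


def parse_patients(text):
    """Parse a tab-separated patient file into a dict keyed by identifier."""
    patients = {}
    for row in csv.DictReader(io.StringIO(text), delimiter="\t"):
        patients[row["identifier"]] = {
            # Alleles are given as comma-separated lists.
            "mhcIAlleles": row["mhcIAlleles"].split(","),
            "mhcIIAlleles": row["mhcIIAlleles"].split(","),
            # tumorType is optional; an empty or missing value becomes None.
            "tumorType": row.get("tumorType") or None,
        }
    return patients


example = (
    "identifier\tmhcIAlleles\tmhcIIAlleles\ttumorType\n"
    "Ptx\tHLA-A*03:01,HLA-A*29:02\tHLA-DRB1*03:01,HLA-DRB1*08:01\tHNSC\n"
)
patients = parse_patients(example)
print(patients["Ptx"]["mhcIAlleles"])
```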

5 Output data

By default, the output data is written in TSV and JSON format. With the command line flag --with-all-neoepitopes, two additional files are generated that contain the epitope candidates for MHC I and MHC II whose NetMHCpan predictions fall below the given thresholds.
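
The threshold logic can be pictured as a simple filter on the predicted rank. The sketch below is only an illustration of that idea: the records and their field names are hypothetical placeholders rather than NeoFox output columns, the threshold of 2.0 is the example value of --rank-mhci-threshold from the usage in Section 3, and the strict comparison is an assumption based on the word "below".

```python
def below_threshold(predictions, threshold):
    """Keep predictions whose rank is strictly below the threshold."""
    return [p for p in predictions if p["rank"] < threshold]


# Hypothetical records; field names and values are placeholders.
mhci_predictions = [
    {"peptide": "AAAAFAAAA", "allele": "HLA-A*03:01", "rank": 0.4},
    {"peptide": "AAAFAAAAA", "allele": "HLA-B*07:02", "rank": 3.1},
]

kept = below_threshold(mhci_predictions, threshold=2.0)
print([p["peptide"] for p in kept])
```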