icbi-lab / nextNEOpi

nextNEOpi: a comprehensive pipeline for computational neoantigen prediction

Does "nextNEOpi_1.4_resources.tar.gz" have the blast references required by nextNEOpi? #76

Closed Xiao-Zhong closed 3 months ago

Xiao-Zhong commented 3 months ago

Hi, I'm trying to run nextNEOpi and encountered the error below:

ERROR: Resource file does not exist: /group/ms002/software/nextNEOpi/resources/references/blast/ Please check the references resource file settings in conf/resources.config

So I checked the 'resources' directory, which I downloaded following the instructions (https://apps-01.i-med.ac.at/resources/nextneopi/nextNEOpi_1.4_resources.tar.gz). However, I cannot find a 'blast' directory; the only directories present are shown below.

.
├── databases
│   ├── cosmic
│   ├── GATKresourceBundle
│   │   └── Mutect2
│   │       ├── GetPileupSummaries
│   │       └── gnomAD
│   ├── iedb
│   ├── mhcflurry_data
│   └── vep_cache
├── ExomeCaptureKits
│   ├── Agilent
│   │   ├── hg19
│   │   └── hg38
│   └── Twist
│       └── hg38
└── references
    ├── ASCAT
    ├── hg38
    │   ├── annotation
    │   └── gdc
    │       └── GRCh38.d1.vd1
    │           ├── fasta
    │           │   └── chromosomes
    │           └── index
    │               ├── bwa
    │               └── star
    ├── Sequenza
    └── yara

Could you explain where to download the data? Thank you very much!

Regards, Xiao

riederd commented 3 months ago

Hi, I just checked the contents of the archive and could not identify anything that is missing. The blast db is there.

You might try to download it again.

[root@apps-01 nextneopi]# md5sum nextNEOpi_1.4_resources.tar.gz 
25531de14cadb0c44bb4a08e10bd0dd4  nextNEOpi_1.4_resources.tar.gz

[root@apps-01 nextneopi]# tree
.
├── databases
│   ├── cosmic
│   │   └── README.txt
│   ├── GATKresourceBundle
│   │   ├── 1000G_phase1.snps.high_confidence.hg38.vcf.gz
│   │   ├── 1000G_phase1.snps.high_confidence.hg38.vcf.gz.tbi
│   │   ├── hapmap_3.3.hg38.vcf.gz
│   │   ├── hapmap_3.3.hg38.vcf.gz.tbi
│   │   ├── Homo_sapiens_assembly38.dbsnp138.vcf
│   │   ├── Homo_sapiens_assembly38.dbsnp138.vcf.idx
│   │   ├── Homo_sapiens_assembly38.known_indels.vcf
│   │   ├── Homo_sapiens_assembly38.known_indels.vcf.idx
│   │   ├── Mills_and_1000G_gold_standard.indels.hg38.vcf
│   │   ├── Mills_and_1000G_gold_standard.indels.hg38.vcf.idx
│   │   └── Mutect2
│   │       ├── GetPileupSummaries
│   │       │   ├── small_exac_common_3.hg38.vcf
│   │       │   └── small_exac_common_3.hg38.vcf.idx
│   │       └── gnomAD
│   │           ├── af-only-gnomad.hg38.vcf.gz
│   │           └── af-only-gnomad.hg38.vcf.gz.tbi
│   ├── iedb
│   ├── mhcflurry_data
│   └── vep_cache
├── ExomeCaptureKits
│   ├── Agilent
│   │   ├── hg19
│   │   │   ├── S31285117_Covered.bed
│   │   │   ├── S31285117_hs_hg19.zip
│   │   │   └── S31285117_Regions.bed
│   │   └── hg38
│   │       ├── S07604514_Covered_ann.bed
│   │       ├── S07604514_Covered.bed -> S07604514_Covered_ann.bed
│   │       ├── S07604514_Covered_full_ann.bed
│   │       ├── S07604514_hs_hg38.zip
│   │       ├── S07604514_Regions.bed
│   │       ├── S07604514_Targets.txt
│   │       ├── S31285117_Covered_ann.bed
│   │       ├── S31285117_Covered.bed -> S31285117_Covered_ann.bed
│   │       ├── S31285117_Covered_full_ann.bed
│   │       ├── S31285117_hs_hg38.zip
│   │       └── S31285117_Regions.bed
│   └── Twist
│       └── hg38
│           ├── Twist_ComprehensiveExome_baits_hg38.bed -> Twist_ComprehensiveExome_targets_hg38.bed
│           └── Twist_ComprehensiveExome_targets_hg38.bed
└── references
    ├── ASCAT
    │   ├── GC_IlluminaOmni2-5-8v1-4_ICBI_hg38_4_WES_ascat.txt
    │   └── hg38SNPpos_InfiniumOmni2-5-8v1-4_chr.loci
    ├── blast
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.pdb
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.phr
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.pin
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.pog
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.pos
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.pot
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.psq
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.ptf
    │   ├── GCF_000001405.39_GRCh38.p13_protein.faa.pto
    │   ├── hs_refseq_uniprot.pal
    │   ├── uniprot-proteome_UP000005640.fasta
    │   ├── uniprot-proteome_UP000005640.fasta.pdb
    │   ├── uniprot-proteome_UP000005640.fasta.phr
    │   ├── uniprot-proteome_UP000005640.fasta.pin
    │   ├── uniprot-proteome_UP000005640.fasta.pog
    │   ├── uniprot-proteome_UP000005640.fasta.pos
    │   ├── uniprot-proteome_UP000005640.fasta.pot
    │   ├── uniprot-proteome_UP000005640.fasta.psq
    │   ├── uniprot-proteome_UP000005640.fasta.ptf
    │   └── uniprot-proteome_UP000005640.fasta.pto
    ├── hg38
    │   ├── annotation
    │   │   ├── gencode.v33.primary_assembly.annotation.exon_merged.bed
    │   │   └── gencode.v33.primary_assembly.annotation.gtf
    │   └── gdc
    │       └── GRCh38.d1.vd1
    │           ├── fasta
    │           │   ├── chromosomes
    │           │   │   ├── chr10.fa
    │           │   │   ├── chr11.fa
    │           │   │   ├── chr12.fa
    │           │   │   ├── chr13.fa
    │           │   │   ├── chr14.fa
    │           │   │   ├── chr15.fa
    │           │   │   ├── chr16.fa
    │           │   │   ├── chr17.fa
    │           │   │   ├── chr18.fa
    │           │   │   ├── chr19.fa
    │           │   │   ├── chr1.fa
    │           │   │   ├── chr20.fa
    │           │   │   ├── chr21.fa
    │           │   │   ├── chr22.fa
    │           │   │   ├── chr2.fa
    │           │   │   ├── chr3.fa
    │           │   │   ├── chr4.fa
    │           │   │   ├── chr5.fa
    │           │   │   ├── chr6.fa
    │           │   │   ├── chr7.fa
    │           │   │   ├── chr8.fa
    │           │   │   ├── chr9.fa
    │           │   │   ├── chrX.fa
    │           │   │   └── chrY.fa
    │           │   ├── GRCh38.d1.vd1.dict
    │           │   ├── GRCh38.d1.vd1.fa
    │           │   ├── GRCh38.d1.vd1.fa.fai
    │           │   ├── hg38.len
    │           │   └── info.txt
    │           └── index
    │               ├── bwa
    │               │   ├── GRCh38.d1.vd1_BWA.tar.gz
    │               │   ├── GRCh38.d1.vd1.fa.amb
    │               │   ├── GRCh38.d1.vd1.fa.ann
    │               │   ├── GRCh38.d1.vd1.fa.bwt
    │               │   ├── GRCh38.d1.vd1.fa.pac
    │               │   └── GRCh38.d1.vd1.fa.sa
    │               └── star
    │                   ├── chrLength.txt
    │                   ├── chrNameLength.txt
    │                   ├── chrName.txt
    │                   ├── chrStart.txt
    │                   ├── exonGeTrInfo.tab
    │                   ├── exonInfo.tab
    │                   ├── geneInfo.tab
    │                   ├── Genome
    │                   ├── genomeParameters.txt
    │                   ├── SA
    │                   ├── SAindex
    │                   ├── sjdbInfo.txt
    │                   ├── sjdbList.fromGTF.out.tab
    │                   ├── sjdbList.out.tab
    │                   └── transcriptInfo.tab
    ├── Sequenza
    │   └── GRCh38.d1.vd1.gc50Base_3.0.txt.gz
    └── yara
        ├── hla_reference_dna.lf.drp
        ├── hla_reference_dna.lf.drs
        ├── hla_reference_dna.lf.drv
        ├── hla_reference_dna.lf.pst
        ├── hla_reference_dna.rid.concat
        ├── hla_reference_dna.rid.limits
        ├── hla_reference_dna.sa.ind
        ├── hla_reference_dna.sa.len
        ├── hla_reference_dna.sa.val
        ├── hla_reference_dna.txt.concat
        ├── hla_reference_dna.txt.limits
        ├── hla_reference_dna.txt.size
        ├── hla_reference_rna.lf.drp
        ├── hla_reference_rna.lf.drs
        ├── hla_reference_rna.lf.drv
        ├── hla_reference_rna.lf.pst
        ├── hla_reference_rna.rid.concat
        ├── hla_reference_rna.rid.limits
        ├── hla_reference_rna.sa.ind
        ├── hla_reference_rna.sa.len
        ├── hla_reference_rna.sa.val
        ├── hla_reference_rna.txt.concat
        ├── hla_reference_rna.txt.limits
        └── hla_reference_rna.txt.size

29 directories, 131 files

Xiao-Zhong commented 3 months ago

Thanks! The problem was solved after downloading the data again.
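
For readers who hit the same error, the check suggested in this thread can be scripted. The following is a minimal Python sketch, not part of nextNEOpi: it compares a local copy of the archive against the MD5 posted above and looks for a `references/blast` directory among the archive members. The function names and the command-line usage are hypothetical, and it assumes the archive is a regular tar file (compressed or not) that Python's `tarfile` module can open.

```python
import hashlib
import sys
import tarfile
import zlib

# MD5 posted in this thread for nextNEOpi_1.4_resources.tar.gz
EXPECTED_MD5 = "25531de14cadb0c44bb4a08e10bd0dd4"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_has_dir(path, wanted="references/blast"):
    """Return True if any archive member path contains the `wanted` directory.

    A truncated or corrupted archive is reported as False rather than
    raising, since an incomplete download is the case being diagnosed.
    """
    wanted_parts = wanted.strip("/").split("/")
    width = len(wanted_parts)
    try:
        with tarfile.open(path, "r:*") as archive:
            for member in archive:
                # Drop empty and "." components so "./references/..." also matches.
                parts = [p for p in member.name.split("/") if p not in ("", ".")]
                for start in range(len(parts) - width + 1):
                    if parts[start:start + width] == wanted_parts:
                        return True
    except (tarfile.TarError, EOFError, OSError, zlib.error):
        return False
    return False


# Hypothetical usage: python check_resources.py nextNEOpi_1.4_resources.tar.gz
if __name__ == "__main__" and len(sys.argv) == 2:
    archive_path = sys.argv[1]
    print("md5 matches:", md5_of_file(archive_path) == EXPECTED_MD5)
    print("blast directory present:", archive_has_dir(archive_path))
```

If the MD5 does not match, the download is most likely incomplete or corrupted, and downloading the archive again (as was done here) is the simplest fix.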