Ensembl / ensembl-vep

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants
https://www.ensembl.org/vep
Apache License 2.0

nextflow vep #1408

Closed sheucke closed 12 months ago

sheucke commented 1 year ago

then use it in command:

nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf $PWD/examples/clinvar-testset/compressed.vcf.gz -profile standard
N E X T F L O W ~ version 22.10.7
Launching workflows/run_vep.nf [jovial_keller] DSL2 - revision: b14a96d543
The specified VCF file has issues in parsing: tbx_index_build failed: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/compressed.vcf.gz

olaaustine commented 1 year ago

Hi @sheucke, thank you for your query. From the error message, it looks like there is an issue with the VCF file and it cannot be indexed with tabix. Could you run tabix on your VCF file first to check whether it reports an error? Thank you, Ola.
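
A quick pre-check that can be run before tabix: tabix can only index BGZF files (the blocked gzip variant that bgzip writes), and a file made with plain gzip carries the same .gz extension but cannot be indexed. The sketch below inspects the first 14 header bytes for the BGZF signature. The function name is_bgzf is made up for illustration, it is not part of the VEP workflow, and the check assumes the BC extra subfield comes first in the header, which is what bgzip writes.

```shell
#!/usr/bin/env bash
# is_bgzf FILE: succeed if FILE starts with a BGZF block header.
# Hypothetical helper for illustration; not part of ensembl-vep.
is_bgzf() {
  local hdr
  # Dump the first 14 bytes as one lowercase hex string (28 characters).
  hdr=$(head -c 14 "$1" | od -An -v -tx1 | tr -d ' \t\n')
  # bytes 0-3:   gzip magic 1f 8b, deflate (08), FLG=04 (extra field present)
  # bytes 4-11:  mtime, xfl, os, xlen (not inspected here)
  # bytes 12-13: extra subfield id 'B' 'C' (42 43)
  case "$hdr" in
    1f8b0804????????????????4243) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (the file name is a placeholder):
# if is_bgzf my.vcf.gz; then echo "BGZF"; else echo "not BGZF, recompress with bgzip"; fi
```

A plain-gzip file fails this test even though gunzip can read it, which is why the extension alone is not a reliable indicator.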

sheucke commented 1 year ago

Hi @olaaustine,

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset$ tabix -p vcf compressed.vcf.gz

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset$ nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf $PWD/examples/clinvar-testset/compressed.vcf.gz -profile standard
N E X T F L O W ~ version 22.10.7
Pulling nextflow-io/workflows ...
Remote resource not found: https://api.github.com/repos/nextflow-io/workflows/contents/run_vep.nf

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset$ cd ../..

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf $PWD/examples/clinvar-testset/compressed.vcf.gz -profile standard
N E X T F L O W ~ version 22.10.7
Launching workflows/run_vep.nf [desperate_newton] DSL2 - revision: b14a96d543
The specified VCF file has issues in parsing: tbx_index_build failed: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/compressed.vcf.gz

As you can see, tabix does not give me an error, and I do get a new file, compressed.vcf.gz.tbi. But the nextflow command still fails.

I saw a post at https://www.biostars.org/p/9532400/ where the person got the same error.

Best regards, Sebastian

olaaustine commented 1 year ago

Hi @sheucke, thank you very much for your response. I am unable to reproduce this error. Can you confirm whether it also occurs with an example VCF file from here? Thank you, Ola.

sheucke commented 1 year ago

Hi @olaaustine,

I tried with homo_sapiens_GRCh38.vcf. Here is the log, with the error.

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ bgzip -c ./examples/clinvar-testset/homo_sapiens_GRCh38.vcf > ./examples/clinvar-testset/homo_sapiens_GRCh38.vcf.gz
(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ tabix -p vcf homo_sapiens_GRCh38.vcf.gz
[E::hts_idx_push] Chromosome blocks not continuous
tbx_index_build failed: homo_sapiens_GRCh38.vcf.gz
(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$
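
The "[E::hts_idx_push] Chromosome blocks not continuous" message from htslib generally means that records for one chromosome appear in more than one place in the file; tabix needs each chromosome's records grouped together and sorted by position. A minimal sketch for locating the first offending record in an uncompressed VCF follows. The function name check_vcf_order is an assumption for illustration, not something the VEP repository provides.

```shell
#!/usr/bin/env bash
# check_vcf_order FILE: print "ok" if the records of an uncompressed VCF are
# grouped by chromosome and sorted by position within each chromosome;
# otherwise print the first problem found and return a non-zero status.
# Hypothetical helper for illustration only.
check_vcf_order() {
  awk -F'\t' '
    BEGIN { bad = 0 }
    /^#/ { next }                      # skip header lines
    $1 != chrom {                      # a new chromosome block starts here
      if ($1 in seen) {
        print "non-contiguous: " $1 " reappears at line " NR
        bad = 1
        exit
      }
      seen[$1] = 1
      chrom = $1
      pos = 0
    }
    $2 + 0 < pos {                     # position went backwards within a block
      print "unsorted: " $1 ":" $2 " at line " NR
      bad = 1
      exit
    }
    { pos = $2 + 0 }
    END { if (!bad) print "ok"; exit bad }
  ' "$1"
}

# Example use (the file name is a placeholder):
# check_vcf_order input.vcf
```

If the check reports a problem, sorting the records before running bgzip and tabix is the usual remedy, for example with bcftools sort.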

olaaustine commented 1 year ago

Hi @sheucke, can you try running nextflow on the bgzipped VCF file? Also, how was nextflow/vep installed? Thank you, Ola.

sheucke commented 1 year ago

Hi @olaaustine,

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep$ bgzip -c /mnt/volume/ensembl_vep/ensembl-vep/examples/homo_sapiens_GRCh38.vcf > /mnt/volume/ensembl_vep/ensembl-vep/examples/homo_sapiens_GRCh38.vcf.gz

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf /mnt/volume/ensembl_vep/ensembl-vep/examples/homo_sapiens_GRCh38.vcf.gz -profile standard
N E X T F L O W ~ version 22.10.7
Launching workflows/run_vep.nf [gloomy_meucci] DSL2 - revision: b14a96d543
The specified VCF file has issues in parsing: tbx_index_build failed: /mnt/volume/ensembl_vep/ensembl-vep/examples/homo_sapiens_GRCh38.vcf.gz

For the installation I followed the instructions in nextflow/README.md after a "git clone https://github.com/Ensembl/ensembl-vep.git":

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ nextflow -version

  N E X T F L O W
  version 22.10.7 build 5853
  created 18-02-2023 20:32 UTC 
  cite doi:10.1038/nbt.3820
  http://nextflow.io

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ singularity --version
singularity-ce version 3.9.7-bionic

./setup-images.sh
cp nf_config/vep.ini.template nf_config/vep.ini

nf_config/vep.ini:

cache 1
dir_cache '/mnt/volume/ensembl_vep/local_vep_cache_dir_tmp/homo_sapiens/109_GRCh38'
assembly 'GRCh38'
offline 1
force_overwrite 1

nf_config/nextflow.config:

params.singularity_dir = "$PWD/singularity-images"
profiles {
  standard {
    process.executor = 'local'
    process.memory = '5GB'
    process.cpus = 1
    singularity {
      enabled = true
      autoMounts = true

    }
  }

  lsf {
    process.executor = 'lsf'
    process.memory = '5GB'
    process.cpus = 1
    process.clusterOptions = '-R "select[mem>5000] rusage[mem=5000]" -M5000'
    singularity {
      enabled = true
      autoMounts = true
    }
  }

  //untested 
  slurm {
    process.executor = 'slurm'
    process.memory = '5GB' 
    process.cpus = 1
    process.clusterOptions = '--mem=5G'
    singularity {
      enabled = true
      autoMounts = true
    }
  }  
}

//params.chros = "1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,X,Y,MT"
params.chros_file = "$PWD/examples/clinvar-testset/chros.txt"
params.vep_config = "$PWD/nf_config/vep.ini"
params.output_prefix = ""

Then I tried the example but this gave the same errors as the other vcf files I tested.

best regards Sebastian

olaaustine commented 1 year ago

Hi @sheucke, thank you very much for your very prompt response. Firstly, are you running nextflow from the ensembl-vep/nextflow directory? Secondly, if you are, can you comment out the params.chros_file line in your nextflow.config file? Thank you, Ola.

sheucke commented 1 year ago

Hi @olaaustine,

If I do not start from ensembl-vep/nextflow, I get this error:

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep$ nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf /mnt/volume/ensembl_vep/ensembl-vep/examples/homo_sapiens_GRCh38.vcf.gz -profile standard
N E X T F L O W ~ version 22.10.7
Pulling nextflow-io/workflows ...
Remote resource not found: https://api.github.com/repos/nextflow-io/workflows/contents/run_vep.nf
(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep$

chros.txt:

1
2

The chros file only has chromosomes 1 and 2, which are not in homo_sapiens_GRCh38.vcf.gz. I also ran with only 22, but with the same result:

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf /mnt/volume/ensembl_vep/ensembl-vep/examples/homo_sapiens_GRCh38.vcf.gz -profile standard
N E X T F L O W ~ version 22.10.7
Launching workflows/run_vep.nf [crazy_kilby] DSL2 - revision: b14a96d543
The specified VCF file has issues in parsing: tbx_index_build failed: /mnt/volume/ensembl_vep/ensembl-vep/examples/homo_sapiens_GRCh38.vcf.gz

kind regards sebastian
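
Since the config carries a list of chromosome names (params.chros_file, or the commented-out params.chros), it can help to compare that list with the names actually used in the VCF. For instance, a list containing 1 and 2 would not match a VCF whose records use chr-prefixed names. The sketch below prints every list entry that has no record in an uncompressed VCF. The name missing_chroms is a hypothetical helper, not part of the workflow, and how the workflow itself treats unmatched names is not something this sketch assumes.

```shell
#!/usr/bin/env bash
# missing_chroms VCF LIST: print the entries of LIST (one name per line) that
# never occur in the CHROM column of the uncompressed VCF.
# Hypothetical helper for illustration only.
missing_chroms() {
  awk -F'\t' '
    # First file (the VCF): remember every chromosome name that has a record.
    FILENAME == ARGV[1] { if ($0 !~ /^#/ && NF > 0) present[$1] = 1; next }
    # Second file (the list): report names with no matching record.
    NF > 0 && !($1 in present) { print $1 }
  ' "$1" "$2"
}

# Example use (file names are placeholders):
# missing_chroms input.vcf examples/clinvar-testset/chros.txt
```

An empty result means every listed name has at least one record in the file.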

sheucke commented 1 year ago

Hi @olaaustine,

I tried something else. I used this tool: https://github.com/EBIvariation/vcf-validator

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##FILTER=<ID=RefCall,Description="Genotyping model thinks this site is reference.">
##FILTER=<ID=LowQual,Description="Confidence in this variant being real is below calling threshold.">
##FILTER=<ID=NoCall,Description="Site has depth=0 resulting in no call.">
##INFO=<ID=END,Number=1,Type=Integer,Description="End position (for use with symbolic alleles)">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Conditional genotype quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read depth">
##FORMAT=<ID=MIN_DP,Number=1,Type=Integer,Description="Minimum DP observed within the GVCF block.">
##FORMAT=<ID=AD,Number=R,Type=Integer,Description="Read depth for each allele">
##FORMAT=<ID=VAF,Number=A,Type=Float,Description="Variant allele fractions.">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="Phred-scaled genotype likelihoods rounded to the closest integer">
##FORMAT=<ID=MED_DP,Number=1,Type=Integer,Description="Median DP observed within the GVCF block rounded to the nearest integer.">
##DeepVariant_version=1.3.0
##contig=<ID=chr1,length=248956422>
##contig=<ID=chr2,length=242193529>
##contig=<ID=chr3,length=198295559>
##contig=<ID=chr4,length=190214555>
##contig=<ID=chr5,length=181538259>
##contig=<ID=chr6,length=170805979>
##contig=<ID=chr7,length=159345973>
##contig=<ID=chr8,length=145138636>
##contig=<ID=chr9,length=138394717>
##contig=<ID=chr10,length=133797422>
##contig=<ID=chr11,length=135086622>
##contig=<ID=chr12,length=133275309>
##contig=<ID=chr13,length=114364328>
##contig=<ID=chr14,length=107043718>
##contig=<ID=chr15,length=101991189>
##contig=<ID=chr16,length=90338345>
##contig=<ID=chr17,length=83257441>
##contig=<ID=chr18,length=80373285>
##contig=<ID=chr19,length=58617616>
##contig=<ID=chr20,length=64444167>
##contig=<ID=chr21,length=46709983>
##contig=<ID=chr22,length=50818468>
##contig=<ID=chrX,length=156040895>
##contig=<ID=chrY,length=57227415>
##contig=<ID=chrM,length=16569>
##contig=<ID=chr1_KI270706v1_random,length=175055>
##contig=<ID=chr1_KI270707v1_random,length=32032>
##contig=<ID=chr1_KI270708v1_random,length=127682>
##contig=<ID=chr1_KI270709v1_random,length=66860>
##contig=<ID=chr1_KI270710v1_random,length=40176>
##contig=<ID=chr1_KI270711v1_random,length=42210>
##contig=<ID=chr1_KI270712v1_random,length=176043>
##contig=<ID=chr1_KI270713v1_random,length=40745>
##contig=<ID=chr1_KI270714v1_random,length=41717>
##contig=<ID=chr2_KI270715v1_random,length=161471>
##contig=<ID=chr2_KI270716v1_random,length=153799>
##contig=<ID=chr3_GL000221v1_random,length=155397>
##contig=<ID=chr4_GL000008v2_random,length=209709>
##contig=<ID=chr5_GL000208v1_random,length=92689>
##contig=<ID=chr9_KI270717v1_random,length=40062>
##contig=<ID=chr9_KI270718v1_random,length=38054>
##contig=<ID=chr9_KI270719v1_random,length=176845>
##contig=<ID=chr9_KI270720v1_random,length=39050>
##contig=<ID=chr11_KI270721v1_random,length=100316>
##contig=<ID=chr14_GL000009v2_random,length=201709>
##contig=<ID=chr14_GL000225v1_random,length=211173>
##contig=<ID=chr14_KI270722v1_random,length=194050>
##contig=<ID=chr14_GL000194v1_random,length=191469>
##contig=<ID=chr14_KI270723v1_random,length=38115>
##contig=<ID=chr14_KI270724v1_random,length=39555>
##contig=<ID=chr14_KI270725v1_random,length=172810>
##contig=<ID=chr14_KI270726v1_random,length=43739>
##contig=<ID=chr15_KI270727v1_random,length=448248>
##contig=<ID=chr16_KI270728v1_random,length=1872759>
##contig=<ID=chr17_GL000205v2_random,length=185591>
##contig=<ID=chr17_KI270729v1_random,length=280839>
##contig=<ID=chr17_KI270730v1_random,length=112551>
##contig=<ID=chr22_KI270731v1_random,length=150754>
##contig=<ID=chr22_KI270732v1_random,length=41543>
##contig=<ID=chr22_KI270733v1_random,length=179772>
##contig=<ID=chr22_KI270734v1_random,length=165050>
##contig=<ID=chr22_KI270735v1_random,length=42811>
##contig=<ID=chr22_KI270736v1_random,length=181920>
##contig=<ID=chr22_KI270737v1_random,length=103838>
##contig=<ID=chr22_KI270738v1_random,length=99375>
##contig=<ID=chr22_KI270739v1_random,length=73985>
##contig=<ID=chrY_KI270740v1_random,length=37240>
##contig=<ID=chrUn_KI270302v1,length=2274>
##contig=<ID=chrUn_KI270304v1,length=2165>
##contig=<ID=chrUn_KI270303v1,length=1942>
##contig=<ID=chrUn_KI270305v1,length=1472>
##contig=<ID=chrUn_KI270322v1,length=21476>
##contig=<ID=chrUn_KI270320v1,length=4416>
##contig=<ID=chrUn_KI270310v1,length=1201>
##contig=<ID=chrUn_KI270316v1,length=1444>
##contig=<ID=chrUn_KI270315v1,length=2276>
##contig=<ID=chrUn_KI270312v1,length=998>
##contig=<ID=chrUn_KI270311v1,length=12399>
##contig=<ID=chrUn_KI270317v1,length=37690>
##contig=<ID=chrUn_KI270412v1,length=1179>
##contig=<ID=chrUn_KI270411v1,length=2646>
##contig=<ID=chrUn_KI270414v1,length=2489>
##contig=<ID=chrUn_KI270419v1,length=1029>
##contig=<ID=chrUn_KI270418v1,length=2145>
##contig=<ID=chrUn_KI270420v1,length=2321>
##contig=<ID=chrUn_KI270424v1,length=2140>
##contig=<ID=chrUn_KI270417v1,length=2043>
##contig=<ID=chrUn_KI270422v1,length=1445>
##contig=<ID=chrUn_KI270423v1,length=981>
##contig=<ID=chrUn_KI270425v1,length=1884>
##contig=<ID=chrUn_KI270429v1,length=1361>
##contig=<ID=chrUn_KI270442v1,length=392061>
##contig=<ID=chrUn_KI270466v1,length=1233>
##contig=<ID=chrUn_KI270465v1,length=1774>
##contig=<ID=chrUn_KI270467v1,length=3920>
##contig=<ID=chrUn_KI270435v1,length=92983>
##contig=<ID=chrUn_KI270438v1,length=112505>
##contig=<ID=chrUn_KI270468v1,length=4055>
##contig=<ID=chrUn_KI270510v1,length=2415>
##contig=<ID=chrUn_KI270509v1,length=2318>
##contig=<ID=chrUn_KI270518v1,length=2186>
##contig=<ID=chrUn_KI270508v1,length=1951>
##contig=<ID=chrUn_KI270516v1,length=1300>
##contig=<ID=chrUn_KI270512v1,length=22689>
##contig=<ID=chrUn_KI270519v1,length=138126>
##contig=<ID=chrUn_KI270522v1,length=5674>
##contig=<ID=chrUn_KI270511v1,length=8127>
##contig=<ID=chrUn_KI270515v1,length=6361>
##contig=<ID=chrUn_KI270507v1,length=5353>
##contig=<ID=chrUn_KI270517v1,length=3253>
##contig=<ID=chrUn_KI270529v1,length=1899>
##contig=<ID=chrUn_KI270528v1,length=2983>
##contig=<ID=chrUn_KI270530v1,length=2168>
##contig=<ID=chrUn_KI270539v1,length=993>
##contig=<ID=chrUn_KI270538v1,length=91309>
##contig=<ID=chrUn_KI270544v1,length=1202>
##contig=<ID=chrUn_KI270548v1,length=1599>
##contig=<ID=chrUn_KI270583v1,length=1400>
##contig=<ID=chrUn_KI270587v1,length=2969>
##contig=<ID=chrUn_KI270580v1,length=1553>
##contig=<ID=chrUn_KI270581v1,length=7046>
##contig=<ID=chrUn_KI270579v1,length=31033>
##contig=<ID=chrUn_KI270589v1,length=44474>
##contig=<ID=chrUn_KI270590v1,length=4685>
##contig=<ID=chrUn_KI270584v1,length=4513>
##contig=<ID=chrUn_KI270582v1,length=6504>
##contig=<ID=chrUn_KI270588v1,length=6158>
##contig=<ID=chrUn_KI270593v1,length=3041>
##contig=<ID=chrUn_KI270591v1,length=5796>
##contig=<ID=chrUn_KI270330v1,length=1652>
##contig=<ID=chrUn_KI270329v1,length=1040>
##contig=<ID=chrUn_KI270334v1,length=1368>
##contig=<ID=chrUn_KI270333v1,length=2699>
##contig=<ID=chrUn_KI270335v1,length=1048>
##contig=<ID=chrUn_KI270338v1,length=1428>
##contig=<ID=chrUn_KI270340v1,length=1428>
##contig=<ID=chrUn_KI270336v1,length=1026>
##contig=<ID=chrUn_KI270337v1,length=1121>
##contig=<ID=chrUn_KI270363v1,length=1803>
##contig=<ID=chrUn_KI270364v1,length=2855>
##contig=<ID=chrUn_KI270362v1,length=3530>
##contig=<ID=chrUn_KI270366v1,length=8320>
##contig=<ID=chrUn_KI270378v1,length=1048>
##contig=<ID=chrUn_KI270379v1,length=1045>
##contig=<ID=chrUn_KI270389v1,length=1298>
##contig=<ID=chrUn_KI270390v1,length=2387>
##contig=<ID=chrUn_KI270387v1,length=1537>
##contig=<ID=chrUn_KI270395v1,length=1143>
##contig=<ID=chrUn_KI270396v1,length=1880>
##contig=<ID=chrUn_KI270388v1,length=1216>
##contig=<ID=chrUn_KI270394v1,length=970>
##contig=<ID=chrUn_KI270386v1,length=1788>
##contig=<ID=chrUn_KI270391v1,length=1484>
##contig=<ID=chrUn_KI270383v1,length=1750>
##contig=<ID=chrUn_KI270393v1,length=1308>
##contig=<ID=chrUn_KI270384v1,length=1658>
##contig=<ID=chrUn_KI270392v1,length=971>
##contig=<ID=chrUn_KI270381v1,length=1930>
##contig=<ID=chrUn_KI270385v1,length=990>
##contig=<ID=chrUn_KI270382v1,length=4215>
##contig=<ID=chrUn_KI270376v1,length=1136>
##contig=<ID=chrUn_KI270374v1,length=2656>
##contig=<ID=chrUn_KI270372v1,length=1650>
##contig=<ID=chrUn_KI270373v1,length=1451>
##contig=<ID=chrUn_KI270375v1,length=2378>
##contig=<ID=chrUn_KI270371v1,length=2805>
##contig=<ID=chrUn_KI270448v1,length=7992>
##contig=<ID=chrUn_KI270521v1,length=7642>
##contig=<ID=chrUn_GL000195v1,length=182896>
##contig=<ID=chrUn_GL000219v1,length=179198>
##contig=<ID=chrUn_GL000220v1,length=161802>
##contig=<ID=chrUn_GL000224v1,length=179693>
##contig=<ID=chrUn_KI270741v1,length=157432>
##contig=<ID=chrUn_GL000226v1,length=15008>
##contig=<ID=chrUn_GL000213v1,length=164239>
##contig=<ID=chrUn_KI270743v1,length=210658>
##contig=<ID=chrUn_KI270744v1,length=168472>
##contig=<ID=chrUn_KI270745v1,length=41891>
##contig=<ID=chrUn_KI270746v1,length=66486>
##contig=<ID=chrUn_KI270747v1,length=198735>
##contig=<ID=chrUn_KI270748v1,length=93321>
##contig=<ID=chrUn_KI270749v1,length=158759>
##contig=<ID=chrUn_KI270750v1,length=148850>
##contig=<ID=chrUn_KI270751v1,length=150742>
##contig=<ID=chrUn_KI270752v1,length=27745>
##contig=<ID=chrUn_KI270753v1,length=62944>
##contig=<ID=chrUn_KI270754v1,length=40191>
##contig=<ID=chrUn_KI270755v1,length=36723>
##contig=<ID=chrUn_KI270756v1,length=79590>
##contig=<ID=chrUn_KI270757v1,length=71251>
##contig=<ID=chrUn_GL000214v1,length=137718>
##contig=<ID=chrUn_KI270742v1,length=186739>
##contig=<ID=chrUn_GL000216v2,length=176608>
##contig=<ID=chrUn_GL000218v1,length=161147>
##contig=<ID=chrEBV,length=171823>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  default
chr4    35816151    .   C   CT  0.1 RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:15:71:54,16:0.225352:0,23,15
chr4    35816158    .   C   A   1.8 RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:5:68:0,68:1:0,26,2
chr4    35816161    .   C   A   2.8 RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:3:74:0,73:0.986486:0,22,0
chr4    35816162    .   TC  T   0.6 RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:9:75:2,73:0.973333:0,26,8
chr4    35816165    .   C   G   0.7 RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:8:88:1,70:0.795455:0,30,7
chr4    96003882    .   T   C   7.8 PASS    .   GT:GQ:DP:AD:VAF:PL  1/1:7:2:0,2:1:6,12,0
chr7    55174675    .   C   G   0.3 RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:12:9:0,5:0.555556:0,23,12
chr7    55174705    .   TTC T   0   RefCall .   GT:GQ:DP:AD:VAF:PL  0/0:27:1444:1286,138:0.0955679:0,30,29
chr7    55174771    .   AGGAATTAAGAGAAGC    A   0   RefCall .   GT:GQ:DP:AD:VAF:PL  0/0:22:1468:1315,149:0.101499:0,22,28
chr7    55181370    .   G   A   1   RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:7:1429:766,655:0.458362:0,9,8
chr7    140753381   .   T   A   0.4 RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:10:22:16,3:0.136364:0,23,9
chr8    46839210    .   TTC T   5.3 PASS    .   GT:GQ:DP:AD:VAF:PL  1/1:5:2:0,2:1:3,11,0
chr8    46839216    .   A   T   6.5 PASS    .   GT:GQ:DP:AD:VAF:PL  1/1:6:2:0,2:1:5,13,0
chr8    46839218    .   A   C   8.9 PASS    .   GT:GQ:DP:AD:VAF:PL  1/1:8:2:0,2:1:8,13,0
chrX    75584599    .   T   C   0.3 RefCall .   GT:GQ:DP:AD:VAF:PL  ./.:12:8:6,2:0.25:0,14,15

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ ./examples/clinvar-testset/vcf_validator_linux -i ./examples/clinvar-testset/KO23.85_R1_R1.vcf
[info] Reading from input file...
[info] Summary report written to : ./examples/clinvar-testset/KO23.85_R1_R1.vcf.errors_summary.1683099622703.txt
[info] According to the VCF specification, the input file is valid

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ bgzip ./examples/clinvar-testset/KO23.85_R1_R1.vcf

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ ./examples/clinvar-testset/vcf_validator_linux -i ./examples/clinvar-testset/KO23.85_R1_R1.vcf.gz
[info] Reading from input file...
[warning] Detected .gz compression
[info] Summary report written to : ./examples/clinvar-testset/KO23.85_R1_R1.vcf.gz.errors_summary.1683099699606.txt
[info] According to the VCF specification, the input file is valid

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/KO23.85_R1_R1.vcf.gz -profile standard
N E X T F L O W ~ version 22.10.7
Launching workflows/run_vep.nf [sad_pare] DSL2 - revision: b14a96d543
The specified VCF file has issues in parsing: tbx_index_build failed: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/KO23.85_R1_R1.vcf.gz

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ tabix -p vcf ./examples/clinvar-testset/KO23.85_R1_R1.vcf.gz
(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$

It works with no errors and I have a KO23.85_R1_R1.vcf.gz.tbi in my folder.

I don't understand what's going wrong.

kind regards sebastian

olaaustine commented 1 year ago

Hi @sheucke, thank you for this detailed information and for taking the steps to resolve this matter. This is not an error we are able to reproduce. Could you run "which tabix" on your command line and let us know the output while we try to figure out what the problem is? Thank you, Ola.

sheucke commented 1 year ago

Hi @olaaustine, as requested:

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep$ which tabix
/usr/bin/tabix
(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep$ which bgzip
/usr/bin/bgzip

Thank you. Sebastian

olaaustine commented 1 year ago

Hi @sheucke, Thank you for sharing. Please can you also share your .nextflow.log file from when you run nextflow and get that error. Thank you Ola.

sheucke commented 1 year ago

Hi @olaaustine, sure.

May-03 10:56:32.885 [main] DEBUG nextflow.cli.Launcher - $> nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/KO23.85_R1_R1.vcf.gz -profile standard --chros 1,2 --vep_config
May-03 10:56:32.968 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 22.10.7
May-03 10:56:32.987 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; embedded=false; plugins-dir=/home/ubuntu/.nextflow/plugins; core-plugins: nf-amazon@1.11.4,nf-azure@0.14.2,nf-codecommit@0.1.2,nf-console@1.0.4,nf-ga4gh@1.0.4,nf-google@1.4.5,nf-tower@1.5.6,nf-wave@0.5.4
May-03 10:56:32.996 [main] INFO  org.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
May-03 10:56:32.997 [main] INFO  org.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
May-03 10:56:33.000 [main] INFO  org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode
May-03 10:56:33.011 [main] INFO  org.pf4j.AbstractPluginManager - No plugins
May-03 10:56:33.025 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/nf_config/nextflow.config
May-03 10:56:33.045 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `standard`
May-03 10:56:33.677 [main] DEBUG nextflow.cli.CmdRun - Applied DSL=2 from script declararion
May-03 10:56:33.694 [main] INFO  nextflow.cli.CmdRun - Launching `workflows/run_vep.nf` [jovial_becquerel] DSL2 - revision: b14a96d543
May-03 10:56:33.695 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[]
May-03 10:56:33.695 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins resolved requirement=[]
May-03 10:56:33.703 [main] DEBUG nextflow.secret.LocalSecretsProvider - Secrets store: /home/ubuntu/.nextflow/secrets/store.json
May-03 10:56:33.706 [main] DEBUG nextflow.secret.SecretsLoader - Discovered secrets providers: [nextflow.secret.LocalSecretsProvider@127a7272] - activable => nextflow.secret.LocalSecretsProvider@127a7272
May-03 10:56:33.758 [main] DEBUG nextflow.Session - Session UUID: 0f25c6d5-14fa-4ade-83e1-ebd810f7131a
May-03 10:56:33.758 [main] DEBUG nextflow.Session - Run name: jovial_becquerel
May-03 10:56:33.758 [main] DEBUG nextflow.Session - Executor pool size: 16
May-03 10:56:33.768 [main] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'FileTransfer' minSize=10; maxSize=48; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false
May-03 10:56:33.796 [main] DEBUG nextflow.cli.CmdRun - 
  Version: 22.10.7 build 5853
  Created: 18-02-2023 20:32 UTC 
  System: Linux 5.4.0-146-generic
  Runtime: Groovy 3.0.13 on OpenJDK 64-Bit Server VM 11.0.18+10-post-Ubuntu-0ubuntu120.04.1
  Encoding: UTF-8 (UTF-8)
  Process: 436004@gpu-image 
  CPUs: 16 - Mem: 251.7 GB (191 GB) - Swap: 0 (0)
May-03 10:56:33.815 [main] DEBUG nextflow.Session - Work-dir: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/work [ext2/ext3]
May-03 10:56:33.815 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/workflows/bin
May-03 10:56:33.824 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[]
May-03 10:56:33.834 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory
May-03 10:56:33.858 [main] DEBUG nextflow.cache.CacheFactory - Using Nextflow cache factory: nextflow.cache.DefaultCacheFactory
May-03 10:56:33.867 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 17; maxThreads: 1000
May-03 10:56:37.853 [main] DEBUG nextflow.Session - Session start
May-03 10:56:38.243 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
May-03 10:56:38.811 [main] ERROR nextflow.Nextflow - The specified VCF file has issues in parsing: tbx_index_build failed: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/KO23.85_R1_R1.vcf.gz

there you go.

Sebastian

olaaustine commented 1 year ago

Hi @sheucke, to debug this further, can you add this block of code:

def soute = new StringBuilder(), serre = new StringBuilder()
check_tabix = "$params.singularity_dir/vep.sif which tabix".execute()
check_tabix.consumeProcessOutput(soute, serre)
check_tabix.waitFor()
println soute

after this line and let us know your output? Thank you, Ola.

sheucke commented 1 year ago

Hi @olaaustine,

I did what you requested:

 .....
def sout = new StringBuilder(), serr = new StringBuilder()
check_parsing = "$params.singularity_dir/vep.sif tabix -p vcf -f $params.vcf".execute()
check_parsing.consumeProcessOutput(sout, serr)
check_parsing.waitFor()
def soute = new StringBuilder(), serre = new StringBuilder()
check_tabix = "$params.singularity_dir/vep.sif which tabix".execute()
check_tabix.consumeProcessOutput(soute, serre)
check_tabix.waitFor()
println soute
if( serr ){
  exit 1, "The specified VCF file has issues in parsing: $serr"
}
.....

result:

(base) ubuntu@gpu-image:/mnt/volume/ensembl_vep/ensembl-vep/nextflow$ nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/KO23.85_R1_R1.vcf.gz -profile standard
N E X T F L O W  ~  version 22.10.7
Launching `workflows/run_vep.nf` [evil_majorana] DSL2 - revision: 32cdce1e5a
/usr/local/bin/tabix

The specified VCF file has issues in parsing: tbx_index_build failed: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/KO23.85_R1_R1.vcf.gz
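
One detail in these outputs: "which tabix" on the host reported /usr/bin/tabix, while inside the image it reports /usr/local/bin/tabix, so the manual "tabix -p vcf" runs that succeeded did not use the same binary as the workflow's check. That check also looks only at whether anything was written to stderr. A small wrapper like the one below, run by hand, shows both the exit status and the complete stderr of a command. The name run_check is illustrative, and the image path in the usage comment is assumed from params.singularity_dir in the config shown earlier.

```shell
#!/usr/bin/env bash
# run_check CMD [ARGS...]: run a command, then print its exit status and
# everything it wrote to stderr. Returns the exit status of the command.
# Illustrative helper; not part of ensembl-vep.
run_check() {
  local err_file status
  err_file=$(mktemp)
  # Discard stdout, capture stderr, and record the exit status either way.
  if "$@" >/dev/null 2>"$err_file"; then
    status=0
  else
    status=$?
  fi
  echo "exit status: $status"
  echo "stderr: $(cat "$err_file")"
  rm -f "$err_file"
  return "$status"
}

# Usage, mirroring the command string in the workflow check (paths are
# placeholders to adapt):
# run_check "$PWD/singularity-images/vep.sif" tabix -p vcf -f /path/to/input.vcf.gz
```

Seeing the exit status next to the full stderr text helps separate a genuine indexing failure from a warning that merely wrote to stderr.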

nextflow log:

May-10 05:34:55.173 [main] DEBUG nextflow.cli.Launcher - $> nextflow -C nf_config/nextflow.config run workflows/run_vep.nf --vcf /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/KO23.85_R1_R1.vcf.gz -profile standard
May-10 05:34:55.254 [main] INFO  nextflow.cli.CmdRun - N E X T F L O W  ~  version 22.10.7
May-10 05:34:55.273 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; embedded=false; plugins-dir=/home/ubuntu/.nextflow/plugins; core-plugins: nf-amazon@1.11.4,nf-azure@0.14.2,nf-codecommit@0.1.2,nf-console@1.0.4,nf-ga4gh@1.0.4,nf-google@1.4.5,nf-tower@1.5.6,nf-wave@0.5.4
May-10 05:34:55.283 [main] INFO  org.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
May-10 05:34:55.284 [main] INFO  org.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
May-10 05:34:55.291 [main] INFO  org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode
May-10 05:34:55.307 [main] INFO  org.pf4j.AbstractPluginManager - No plugins
May-10 05:34:55.322 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/nf_config/nextflow.config
May-10 05:34:55.345 [main] DEBUG nextflow.config.ConfigBuilder - Applying config profile: `standard`
May-10 05:34:55.972 [main] DEBUG nextflow.cli.CmdRun - Applied DSL=2 from script declararion
May-10 05:34:55.987 [main] INFO  nextflow.cli.CmdRun - Launching `workflows/run_vep.nf` [evil_majorana] DSL2 - revision: 32cdce1e5a
May-10 05:34:55.988 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins default=[]
May-10 05:34:55.988 [main] DEBUG nextflow.plugin.PluginsFacade - Plugins resolved requirement=[]
May-10 05:34:55.997 [main] DEBUG nextflow.secret.LocalSecretsProvider - Secrets store: /home/ubuntu/.nextflow/secrets/store.json
May-10 05:34:56.000 [main] DEBUG nextflow.secret.SecretsLoader - Discovered secrets providers: [nextflow.secret.LocalSecretsProvider@76ad6715] - activable => nextflow.secret.LocalSecretsProvider@76ad6715
May-10 05:34:56.049 [main] DEBUG nextflow.Session - Session UUID: b15105a3-c5ce-419c-943c-bad3f6dc0af0
May-10 05:34:56.050 [main] DEBUG nextflow.Session - Run name: evil_majorana
May-10 05:34:56.050 [main] DEBUG nextflow.Session - Executor pool size: 16
May-10 05:34:56.060 [main] DEBUG nextflow.util.ThreadPoolBuilder - Creating thread pool 'FileTransfer' minSize=10; maxSize=48; workQueue=LinkedBlockingQueue[10000]; allowCoreThreadTimeout=false
May-10 05:34:56.090 [main] DEBUG nextflow.cli.CmdRun - 
  Version: 22.10.7 build 5853
  Created: 18-02-2023 20:32 UTC 
  System: Linux 5.4.0-146-generic
  Runtime: Groovy 3.0.13 on OpenJDK 64-Bit Server VM 11.0.18+10-post-Ubuntu-0ubuntu120.04.1
  Encoding: UTF-8 (UTF-8)
  Process: 523591@gpu-image
  CPUs: 16 - Mem: 251.7 GB (190.8 GB) - Swap: 0 (0)
May-10 05:34:56.110 [main] DEBUG nextflow.Session - Work-dir: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/work [ext2/ext3]
May-10 05:34:56.110 [main] DEBUG nextflow.Session - Script base path does not exist or is not a directory: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/workflows/bin
May-10 05:34:56.119 [main] DEBUG nextflow.executor.ExecutorFactory - Extension executors providers=[]
May-10 05:34:56.129 [main] DEBUG nextflow.Session - Observer factory: DefaultObserverFactory
May-10 05:34:56.152 [main] DEBUG nextflow.cache.CacheFactory - Using Nextflow cache factory: nextflow.cache.DefaultCacheFactory
May-10 05:34:56.162 [main] DEBUG nextflow.util.CustomThreadPool - Creating default thread pool > poolSize: 17; maxThreads: 1000
May-10 05:34:57.021 [main] DEBUG nextflow.Session - Session start
May-10 05:34:57.409 [main] DEBUG nextflow.script.ScriptRunner - > Launching execution
May-10 05:34:58.060 [main] INFO  nextflow.script.BaseScript - /usr/local/bin/tabix

May-10 05:34:58.063 [main] ERROR nextflow.Nextflow - The specified VCF file has issues in parsing: tbx_index_build failed: /mnt/volume/ensembl_vep/ensembl-vep/nextflow/examples/clinvar-testset/KO23.85_R1_R1.vcf.g

all the best sebastian

olaaustine commented 1 year ago

Hi @sheucke, apologies for the delayed response on this. We wanted to find out whether this issue still persists when using nextflow vep. Thank you very much, Ola.