connor-lab / ncov2019-artic-nf

A Nextflow pipeline for running the ARTIC network's fieldbioinformatics tools (https://github.com/artic-network/fieldbioinformatics), with a focus on ncov2019
GNU Affero General Public License v3.0

readgroup not found in provided primer scheme (1) for V4 scheme #111

Open esha-joshi opened 3 years ago

esha-joshi commented 3 years ago

Hello, I am trying to run the ARTIC pipeline with the Singularity profile and the new ARTIC V4 primer scheme set (https://github.com/artic-network/primer-schemes).

I am running the Nextflow pipeline on Nanopore data with the minimum required arguments, and my conf/base.config sets the primer scheme parameters as follows:

    // Repo to download your primer scheme from
    schemeRepoURL = 'https://github.com/artic-network/primer-schemes.git'

    // Directory within schemeRepoURL that contains primer schemes
    schemeDir = 'primer-schemes'

    // Scheme name
    scheme = 'SARS-CoV-2'

    // Scheme version
    schemeVersion = 'V4'

The pipeline fails during the step that runs the minimap2 alignment; according to the log below, the command that actually fails is artic_plot_amplicon_depth.

Versions (on a CentOS 7 VM): nextflow version 21.04.1.5556, singularity version 3.5.3, Python 3.6.13

Full error log:

  [M::main] CMD: minimap2 -a -x map-ont -t 1 primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.reference.fasta Plate0025T-20210728-1_barcode43.fastq
  [M::main] Real time: 0.057 sec; CPU: 0.034 sec; Peak RSS: 0.004 GB
  [post-run summary] total reads: 82, unparseable: 0, qc fail: 0, could not calibrate: 0, no alignment: 0, bad fast5: 41
  [post-run summary] total reads: 182, unparseable: 0, qc fail: 0, could not calibrate: 0, no alignment: 0, bad fast5: 91
  Traceback (most recent call last):
    File "/opt/conda/envs/artic/bin/artic_plot_amplicon_depth", line 10, in <module>
      sys.exit(main())
    File "/opt/conda/envs/artic/lib/python3.6/site-packages/artic/plot_amplicon_depth.py", line 143, in main
      go(args)
    File "/opt/conda/envs/artic/lib/python3.6/site-packages/artic/plot_amplicon_depth.py", line 76, in go
      rg)
  AssertionError: error: readgroup not found in provided primer scheme (1)
  Running: nanopolish index -s sequencing_summary_FAP85311_7c228425.txt -d fast5_pass Plate0025T-20210728-1_barcode43.fastq
  Running: minimap2 -a -x map-ont -t 1 primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.reference.fasta Plate0025T-20210728-1_barcode43.fastq | samtools view -bS -F 4 - | samtools sort -o Plate0025T-20210728-1_barcode43.sorted.bam -
  Running: samtools index Plate0025T-20210728-1_barcode43.sorted.bam
  Running: align_trim --start --normalise 500 primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.scheme.bed --report Plate0025T-20210728-1_barcode43.alignreport.txt < Plate0025T-20210728-1_barcode43.sorted.bam 2> Plate0025T-20210728-1_barcode43.alignreport.er | samtools sort -T Plate0025T-20210728-1_barcode43 - -o Plate0025T-20210728-1_barcode43.trimmed.rg.sorted.bam
  Running: align_trim --normalise 500 primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.scheme.bed --remove-incorrect-pairs --report Plate0025T-20210728-1_barcode43.alignreport.txt < Plate0025T-20210728-1_barcode43.sorted.bam 2> Plate0025T-20210728-1_barcode43.alignreport.er | samtools sort -T Plate0025T-20210728-1_barcode43 - -o Plate0025T-20210728-1_barcode43.primertrimmed.rg.sorted.bam
  Running: samtools index Plate0025T-20210728-1_barcode43.trimmed.rg.sorted.bam
  Running: samtools index Plate0025T-20210728-1_barcode43.primertrimmed.rg.sorted.bam
  Running: nanopolish variants --min-flanking-sequence 10 -x 1000000 --progress -t 1 --reads Plate0025T-20210728-1_barcode43.fastq -o Plate0025T-20210728-1_barcode43.2.vcf -b Plate0025T-20210728-1_barcode43.trimmed.rg.sorted.bam -g primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.reference.fasta -w "MN908947.3:1-29904" --ploidy 1 -m 0.15 --read-group 2 
  Running: nanopolish variants --min-flanking-sequence 10 -x 1000000 --progress -t 1 --reads Plate0025T-20210728-1_barcode43.fastq -o Plate0025T-20210728-1_barcode43.1.vcf -b Plate0025T-20210728-1_barcode43.trimmed.rg.sorted.bam -g primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.reference.fasta -w "MN908947.3:1-29904" --ploidy 1 -m 0.15 --read-group 1 
  Running: artic_vcf_merge Plate0025T-20210728-1_barcode43 primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.scheme.bed 2:Plate0025T-20210728-1_barcode43.2.vcf 1:Plate0025T-20210728-1_barcode43.1.vcf
  Running: artic_vcf_filter --nanopolish Plate0025T-20210728-1_barcode43.merged.vcf Plate0025T-20210728-1_barcode43.pass.vcf Plate0025T-20210728-1_barcode43.fail.vcf
  Running: artic_make_depth_mask --store-rg-depths primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.reference.fasta Plate0025T-20210728-1_barcode43.primertrimmed.rg.sorted.bam Plate0025T-20210728-1_barcode43.coverage_mask.txt
  Running: artic_plot_amplicon_depth --primerScheme primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.scheme.bed --sampleID Plate0025T-20210728-1_barcode43 --outFilePrefix Plate0025T-20210728-1_barcode43 Plate0025T-20210728-1_barcode43*.depths
  Command failed:artic_plot_amplicon_depth --primerScheme primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.scheme.bed --sampleID Plate0025T-20210728-1_barcode43 --outFilePrefix Plate0025T-20210728-1_barcode43 Plate0025T-20210728-1_barcode43*.depths
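
One way to narrow down an assertion like this is to list the pool names that the scheme file declares and compare them with the read group shown in parentheses in the error. The following is a minimal sketch, not part of the ARTIC tooling. It assumes the scheme BED file is tab-separated with the pool name in the fifth column (chrom, start, end, primer name, pool, strand). The helper `pools_in_scheme` and the example rows are hypothetical, and how artic_plot_amplicon_depth performs its own comparison is not shown in this thread.

```python
import csv
import io
from collections import Counter


def pools_in_scheme(bed_text):
    """Count primers per pool name in a scheme BED file.

    Assumption: tab-separated columns are
    chrom, start, end, primer name, pool, strand.
    """
    pools = Counter()
    for row in csv.reader(io.StringIO(bed_text), delimiter="\t"):
        # Skip blank, comment, or short lines that have no pool column.
        if len(row) < 5 or row[0].startswith("#"):
            continue
        pools[row[4]] += 1
    return pools


# Made-up rows for illustration only; these are not real V4 scheme entries.
example_bed = (
    "MN908947.3\t25\t50\texample_1_LEFT\tpool_A\t+\n"
    "MN908947.3\t408\t431\texample_1_RIGHT\tpool_A\t-\n"
    "MN908947.3\t324\t344\texample_2_LEFT\tpool_B\t+\n"
)

pools = pools_in_scheme(example_bed)
read_group = "1"  # the value in parentheses in the AssertionError

print("pools declared in scheme:", sorted(pools))
print("read group present in scheme:", read_group in pools)
```

To run it against the real file, pass the contents of primer-schemes/SARS-CoV-2/V4/SARS-CoV-2.scheme.bed (the path from the log) to `pools_in_scheme` instead of `example_bed`.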

Is there a workaround or potential solution to this?

Thank you so much.

csawye01 commented 2 years ago

Has this been solved? I am facing the same problem.

esha-joshi commented 2 years ago

Hi @csawye01,

The artic pipeline version that this wrapper uses is relatively old. What ended up working for me with the V4 primer scheme was using artic-network's 1.3.0-dev pipeline version from this branch, together with the latest version of artic-tools, which avoids errors at the vcf-check step for control sequences with no SNVs. Since there is no conda release for this version of artic-tools yet (issue here), I replaced the artic-tools binary in the conda environment with a compiled 0.3.1 binary.

Hope that helps.

csawye01 commented 2 years ago

Hi @esha-joshi, thanks so much for responding and letting me know how to troubleshoot this. Much appreciated!

csawye01 commented 2 years ago

Hi @esha-joshi, I think I have confused myself a bit. Should I use 1.3.0-dev/environment.yml to recreate the artic-ncov2019 conda env, or add it into the connor-lab pipeline somewhere? Do you have this on your GitHub?

esha-joshi commented 2 years ago

Hi @csawye01, I would recommend downloading the source and building the conda env from 1.3.0-dev/environment.yml. Within this conda env, I replaced the artic-tools binary with the most recent version (v0.3.1) before running the pipeline. Hope that helps.
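
After swapping a binary by hand like this, it can be worth confirming that the environment actually resolves to the replaced file and not to another copy earlier on PATH. The following is a minimal sketch under stated assumptions: the executable is named `artic-tools`, the conda env is active so that `CONDA_PREFIX` is set, and `describe_executable` is a hypothetical helper, not part of any ARTIC package.

```python
import os
import shutil
import sys


def describe_executable(name, path=None):
    """Return (resolved_path, inside_active_env) for an executable.

    Returns (None, False) when the name cannot be found on PATH.
    """
    resolved = shutil.which(name, path=path)
    if resolved is None:
        return None, False
    # conda sets CONDA_PREFIX on activation; fall back to the
    # interpreter's own prefix when it is not set.
    env_prefix = os.path.realpath(os.environ.get("CONDA_PREFIX", sys.prefix))
    inside = os.path.realpath(resolved).startswith(env_prefix + os.sep)
    return resolved, inside


resolved, inside = describe_executable("artic-tools")
if resolved is None:
    print("artic-tools not found on PATH")
else:
    print(f"artic-tools resolves to {resolved} (inside active env: {inside})")
```

If the reported path is outside the active environment, the pipeline may still be picking up an older copy of the binary.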