replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

CovarPlot fails w/ custom BED #262

hoelzer closed this issue 4 months ago

hoelzer commented 4 months ago

Hi, I have a custom BED file:

primers.bed.position.corrected.porecov.bed.zip

My poreCov command is:

nextflow run replikation/poreCov -r 1.9.3 -profile slurm,singularity -w work --update --fastq_pass fastq_pass --samples samples.csv --output results --cachedir singularity --medaka_model r1041_e82_400bps_sup_v4.2.0 --primerV primers.bed.position.corrected.porecov.bed --artic_normalize 600 --n_threshold 0.2 -resume --minLength 400 --maxLength 1000

The pipeline works; only the CoVarPlots are failing.

I checked one work dir.

cat work/a7/5274ce16582ac5500b884bce0ecfe1/.command.log
usage: covarplot.py [-h] [--version] [-v VCF_FILE] [-d1 DEPTH_FILE_1]
                    [-d2 DEPTH_FILE_2] [-b BED] [--show] [-s SAVE] [-l]

Plots for interArtic

optional arguments:
  -h, --help            show this help message and exit
  --version             Prints version
  -v VCF_FILE, --vcf_file VCF_FILE
                        full path to vcf file (default: None)
  -d1 DEPTH_FILE_1, --depth_file_1 DEPTH_FILE_1
                        full path to depth file 1 (default: None)
  -d2 DEPTH_FILE_2, --depth_file_2 DEPTH_FILE_2
                        full path to depth file 2 (default: None)
  -b BED, --bed BED     full path to scheme bed file (default: None)
  --show                Show plot rather than saving it (default: False)
  -s SAVE, --save SAVE  Save path (default: None)
  -l, --log             y-axis log scale (default: False)
error: unrecognized arguments: BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_11.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_21.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_31.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_41.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_51.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_2.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_22.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_32.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_42.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_52.depths

The .command.sh looks like this:

cat work/a7/5274ce16582ac5500b884bce0ecfe1/.command.sh
#!/bin/bash -ue
# clean up bed file: replace first colum with MN908947.3, remove empty lines and sort by 4th column (primer names)
cut -f2- primers.bed.position.corrected.porecov.bed |            sed '/^[[:space:]]*$/d' |            sed -e $'s/^/MN908947.3\t/' |            sort -k4 > nCoV-2019-plot.scheme.bed

covarplot.py -v BM-IMSSC2-123-2023-00229.pass.vcf.gz -d1 BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_1.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_11.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_21.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_31.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_41.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_51.depths -d2 BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_12.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_2.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_22.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_32.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_42.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_52.depths -b nCoV-2019-plot.scheme.bed -s .
mv BM-IMSSC2-123-2023-00229.CoVarPlot.png BM-IMSSC2-123-2023-00229_amplicon_coverage.png
covarplot.py -v BM-IMSSC2-123-2023-00229.pass.vcf.gz -d1 BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_1.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_11.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_21.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_31.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_41.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_51.depths -d2 BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_12.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_2.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_22.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_32.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_42.depths BM-IMSSC2-123-2023-00229.coverage_mask.txt.nCoV-2019_52.depths -b nCoV-2019-plot.scheme.bed -s . --log
mv BM-IMSSC2-123-2023-00229.CoVarPlot.png BM-IMSSC2-123-2023-00229_amplicon_coverage_log.png

I wonder if I also need to sort my input BED file somehow?

hoelzer commented 4 months ago

I just saw that the latest ARTIC V5.3.2 BED file has a slightly different structure from what is described for custom BED files in the poreCov README. I am now adjusting my custom BED file to match the ARTIC V5.3.2 format. Maybe that solves it.

hoelzer commented 4 months ago

Yeah, something is odd... usually there should be only one

IMSSC2-67-2024-00017.coverage_mask.txt.1.depths

and one

IMSSC2-67-2024-00017.coverage_mask.txt.2.depths

file, not several as in my case.
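
For context, the error above follows from how argparse treats `-d1` and `-d2`: per the usage text, each takes a single file, so every additional depth file on the command line is left over and reported as unrecognized. A minimal sketch that mirrors only the options from the usage output (it is not the actual covarplot.py source, and the file names are shortened placeholders):

```python
import argparse


def build_parser():
    # Mirrors only the options listed in the usage text; not covarplot.py's real code.
    parser = argparse.ArgumentParser(prog="covarplot.py")
    parser.add_argument("-v", "--vcf_file")
    parser.add_argument("-d1", "--depth_file_1")  # takes exactly one value
    parser.add_argument("-d2", "--depth_file_2")  # takes exactly one value
    parser.add_argument("-b", "--bed")
    parser.add_argument("-s", "--save")
    parser.add_argument("-l", "--log", action="store_true")
    return parser


def leftover_args(argv):
    """Return the arguments argparse cannot assign to any option."""
    _, extras = build_parser().parse_known_args(argv)
    return extras


# Placeholder file names: two depth files per -d1/-d2 instead of one each.
argv = [
    "-v", "s.pass.vcf.gz",
    "-d1", "s.nCoV-2019_1.depths", "s.nCoV-2019_11.depths",
    "-d2", "s.nCoV-2019_12.depths", "s.nCoV-2019_2.depths",
    "-b", "scheme.bed", "-s", ".",
]
print(leftover_args(argv))
```

With `parse_args` instead of `parse_known_args`, the same leftovers would trigger the "unrecognized arguments" exit seen in the log.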

hoelzer commented 4 months ago

OK, I think I know. One sec.

hoelzer commented 4 months ago

Alright, my bad. I fucked up the pool column. : )
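
A quick sanity check for the pool column could look like the sketch below. It assumes an ARTIC-style layout in which the pool name is the fifth tab-separated column, and that the plotting step expects exactly two pools (one depth file each). The helpers `pool_counts` and `check_pools` and the example rows are hypothetical, not part of poreCov.

```python
from collections import Counter


def pool_counts(bed_text, pool_column=4):
    """Count primers per pool name in tab-separated BED text.

    pool_column is the 0-based index of the pool column (assumed: fifth column).
    """
    counts = Counter()
    for line_no, line in enumerate(bed_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        fields = line.split("\t")
        if len(fields) <= pool_column:
            raise ValueError(
                f"line {line_no}: expected at least {pool_column + 1} columns"
            )
        counts[fields[pool_column]] += 1
    return counts


def check_pools(bed_text, expected_pools=2):
    """Return (ok, counts); ok is True when the number of distinct pools matches."""
    counts = pool_counts(bed_text)
    return len(counts) == expected_pools, counts


# Hypothetical rows: two primer pairs, each pair assigned to one of two pools.
GOOD = "\n".join([
    "MN908947.3\t30\t54\tamp_1_LEFT\tnCoV-2019_1\t+",
    "MN908947.3\t385\t410\tamp_1_RIGHT\tnCoV-2019_1\t-",
    "MN908947.3\t320\t342\tamp_2_LEFT\tnCoV-2019_2\t+",
    "MN908947.3\t704\t726\tamp_2_RIGHT\tnCoV-2019_2\t-",
])

# Hypothetical broken variant: the pool value differs between the primer pairs' rows.
BROKEN = "\n".join([
    "MN908947.3\t30\t54\tamp_1_LEFT\tnCoV-2019_1\t+",
    "MN908947.3\t385\t410\tamp_1_RIGHT\tnCoV-2019_11\t-",
    "MN908947.3\t320\t342\tamp_2_LEFT\tnCoV-2019_2\t+",
    "MN908947.3\t704\t726\tamp_2_RIGHT\tnCoV-2019_12\t-",
])

for name, text in (("good", GOOD), ("broken", BROKEN)):
    ok, counts = check_pools(text)
    print(name, ok, dict(counts))
```

More than two distinct pool names in the BED file would match the symptom in this issue, where several depth files were passed to each of `-d1` and `-d2`.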

replikation commented 4 months ago

Self-solving issues. Thx :D