StaPH-B / docker-builds

:package: :whale: Dockerfiles and documentation on tools for public health bioinformatics
GNU General Public License v3.0

adds viridian 1.3.0 #1077

Closed Kincekara closed 2 weeks ago

Kincekara commented 2 weeks ago

Thank god our recipe is still working! No major changes...

diff viridian/1.2.2/ viridian/1.3.0/
diff viridian/1.2.2/Dockerfile viridian/1.3.0/Dockerfile
1,2c1,2
< ARG VIRIDIAN_VER="1.2.2"
< ARG SAMTOOLS_VER="1.20"
---
> ARG VIRIDIAN_VER="1.3.0"
> ARG SAMTOOLS_VER="1.21"
diff viridian/1.2.2/README.md viridian/1.3.0/README.md
8,10c8,10
< - samtools: 1.20
< - bcftools: 1.20
< - htslib: 1.20
---
> - samtools: 1.21
> - bcftools: 1.21
> - htslib: 1.21
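
A version bump like this one should touch only the `ARG ..._VER` lines of the Dockerfile. The check can be scripted; what follows is a minimal sketch, not something that exists in the repository. The function name `only_version_lines_changed` is hypothetical, and it assumes every version argument follows the `ARG <NAME>_VER=` pattern seen in the diff above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of docker-builds): succeeds only if every
# changed line between two Dockerfiles is an `ARG <NAME>_VER=...` line.
only_version_lines_changed() {
  local old="$1" new="$2"
  local changed
  # diff exits non-zero when the files differ, so tolerate that with `|| true`
  # and keep only the changed lines (those prefixed with "< " or "> ").
  changed=$(diff "$old" "$new" | grep -E '^[<>] ' || true)
  # Identical files: nothing to object to.
  [ -z "$changed" ] && return 0
  # grep -Ev selects changed lines that are NOT version ARGs; the leading `!`
  # turns "found such a line" into a failure of this function.
  ! printf '%s\n' "$changed" | grep -Evq '^[<>] ARG [A-Z_]+_VER='
}

# Example call, assuming the repository layout shown in the diff above:
#   only_version_lines_changed viridian/1.2.2/Dockerfile viridian/1.3.0/Dockerfile
```

The function returns a non-zero status as soon as any other line differs (for example a changed `FROM` or `RUN` line), which would be a hint to review the recipe by hand.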

erinyoung commented 2 weeks ago

The tests worked for this:

#27 [test 3/4] RUN viridian run_one_sample --run_accession SRR29437696 --outdir OUT &&   wc -l OUT/consensus.fa.gz OUT/log.json.gz OUT/qc.tsv.gz &&   head OUT/variants.vcf
#27 0.571 [2024-10-03T18:38:56+0000 viridian INFO] =================== 1/10 Start pipeline ====================
#27 0.571 [2024-10-03T18:38:56+0000 viridian INFO] Start running viridian, output dir: /test/OUT
#27 0.571 [2024-10-03T18:38:56+0000 viridian INFO] Putting temporary files in /tmp/viridian.2xb5656p
#27 0.572 [2024-10-03T18:38:56+0000 viridian INFO] Getting metadata from ENA for run SRR29437696
#27 2.592 [2024-10-03T18:38:58+0000 viridian INFO] Metadata: {'run_accession': 'SRR29437696', 'instrument_platform': 'ILLUMINA', 'fastq_ftp': 'ftp.sra.ebi.ac.uk/vol1/fastq/SRR294/096/SRR29437696/SRR29437696_1.fastq.gz;ftp.sra.ebi.ac.uk/vol1/fastq/SRR294/096/SRR29437696/SRR29437696_2.fastq.gz'}
#27 2.592 [2024-10-03T18:38:58+0000 viridian INFO] ENA run SRR29437696 has instrument_platform ILLUMINA
#27 2.592 [2024-10-03T18:38:58+0000 viridian INFO] Run command: enaDataGet -f fastq SRR29437696
#28 74.82 [2024-10-03T18:41:41 cylon INFO] Processed 71 of 96 amplicons
#28 74.82 [2024-10-03T18:41:42 cylon INFO] Processed 81 of 96 amplicons
#28 74.82 [2024-10-03T18:41:42 cylon INFO] Processed 91 of 96 amplicons
#28 74.82 [2024-10-03T18:41:42 cylon INFO] Processing the final five
#28 74.82 [2024-10-03T18:41:43 cylon INFO] Finished polishing each amplicon
#28 74.82 [2024-10-03T18:41:43 cylon INFO] Start making consensus from polished amplicons
#28 74.82 [2024-10-03T18:41:43 cylon INFO] Finished making consensus sequence.
#28 74.82 [2024-10-03T18:41:43+0000 viridian INFO] Return code 0 from: cylon  assemble --reads_per_amp_dir /tmp/viridian.jcbii04p/sample_reads/cylon illumina /usr/local/lib/python3.10/dist-packages/viridian/amplicon_scheme_data/MN908947.fasta /tmp/viridian.jcbii04p/sample_reads/cylon.json /tmp/viridian.jcbii04p/cylon
#28 74.83 [2024-10-03T18:41:43+0000 viridian INFO] Finished making initial consensus sequence
#28 74.99 [2024-10-03T18:41:43+0000 viridian INFO] ===== 7/10 Initial VCF and MSA of consensus/reference ======
#28 74.99 [2024-10-03T18:41:43+0000 viridian INFO] Making initial VCF file and multiple sequence alignment
#28 74.99 [2024-10-03T18:41:43+0000 viridian INFO] Run command: varifier make_truth_vcf  --sanitise_truth_gaps  --global_align --global_align_min_coord 48 --global_align_max_coord 29873 /tmp/viridian.jcbii04p/cylon/consensus.final_assembly.fa /usr/local/lib/python3.10/dist-packages/viridian/amplicon_scheme_data/MN908947.fasta /tmp/viridian.jcbii04p/varifier
#28 75.33 [2024-10-03T18:41:43+0000 viridian INFO] stderr:
#28 75.33 /usr/local/lib/python3.10/dist-packages/Bio/pairwise2.py:278: BiopythonDeprecationWarning: Bio.pairwise2 has been deprecated, and we intend to remove it in a future release of Biopython. As an alternative, please consider using Bio.Align.PairwiseAligner as a replacement, and contact the Biopython developers if you still need the Bio.pairwise2 module.
#28 75.33   warnings.warn(
#28 75.33 2024-10-03 18:41:43,878 [INFO]: Made VCF file of variants '/tmp/viridian.jcbii04p/varifier/01.merged.vcf' by globally aligning ref/truth sequences
#28 75.33 2024-10-03 18:41:43,878 [INFO]: Probe mapping to remove incorrect calls
#28 75.33 2024-10-03 18:41:43,911 [INFO]: Made filtered VCF file /tmp/viridian.jcbii04p/varifier/03.probe_filtered.vcf
#28 75.33 2024-10-03 18:41:43,912 [INFO]: Finished making truth VCF file /tmp/viridian.jcbii04p/varifier/04.truth.vcf
#28 75.33 [2024-10-03T18:41:43+0000 viridian INFO] Return code 0 from: varifier make_truth_vcf  --sanitise_truth_gaps  --global_align --global_align_min_coord 48 --global_align_max_coord 29873 /tmp/viridian.jcbii04p/cylon/consensus.final_assembly.fa /usr/local/lib/python3.10/dist-packages/viridian/amplicon_scheme_data/MN908947.fasta /tmp/viridian.jcbii04p/varifier
#28 75.33 [2024-10-03T18:41:43+0000 viridian INFO] Finished initial VCF file and multiple sequence alignment
#28 75.52 [2024-10-03T18:41:44+0000 viridian INFO] ======== 8/10 QC using reads vs consensus sequence =========
#28 75.52 [2024-10-03T18:41:44+0000 viridian INFO] Start QC using reads mapped to consensus
#28 77.03 [2024-10-03T18:41:45+0000 viridian INFO] (pileup) Got pileup for 10/96 amplicons
#28 78.36 [2024-10-03T18:41:46+0000 viridian INFO] (pileup) Got pileup for 20/96 amplicons
#28 79.76 [2024-10-03T18:41:48+0000 viridian INFO] (pileup) Got pileup for 30/96 amplicons
#28 81.07 [2024-10-03T18:41:49+0000 viridian INFO] (pileup) Got pileup for 40/96 amplicons
#28 82.49 [2024-10-03T18:41:51+0000 viridian INFO] (pileup) Got pileup for 50/96 amplicons
#28 83.97 [2024-10-03T18:41:52+0000 viridian INFO] (pileup) Got pileup for 60/96 amplicons
#28 85.27 [2024-10-03T18:41:53+0000 viridian INFO] (pileup) Got pileup for 70/96 amplicons
#28 86.56 [2024-10-03T18:41:55+0000 viridian INFO] (pileup) Got pileup for 80/96 amplicons
#28 88.16 [2024-10-03T18:41:56+0000 viridian INFO] (pileup) Got pileup for 90/96 amplicons
#28 89.05 [2024-10-03T18:41:57+0000 viridian INFO] (pileup) Got pileup data for all amplicons
#28 89.49 [2024-10-03T18:41:58+0000 viridian INFO] Making per-position stats and masked consensus sequence
#28 91.10 [2024-10-03T18:41:59+0000 viridian INFO] Making VCF file
#28 91.11 [2024-10-03T18:41:59+0000 viridian INFO] Writing masked consensus FASTA
#28 91.12 [2024-10-03T18:41:59+0000 viridian INFO] Finished QC using reads mapped to consensus
#28 91.33 [2024-10-03T18:41:59+0000 viridian INFO] =================== 9/10 Final QC checks ===================
#28 91.33 [2024-10-03T18:41:59+0000 viridian INFO] 0.16% (47/29753) of the consensus sequence is Ns
#28 91.52 [2024-10-03T18:42:00+0000 viridian INFO] ============ 10/10 Tidy up final files and log =============
#28 91.52 [2024-10-03T18:42:00+0000 viridian INFO] Tidying up intermediate files and contents of log
#28 91.52 [2024-10-03T18:42:00+0000 viridian INFO] Run command: rm -rf /tmp/viridian.jcbii04p /test/OUT2/consensus.unmasked.fa*
#28 91.54 [2024-10-03T18:42:00+0000 viridian INFO] Return code 0 from: rm -rf /tmp/viridian.jcbii04p /test/OUT2/consensus.unmasked.fa*
#28 91.54 [2024-10-03T18:42:00+0000 viridian INFO] Run command: rm -rf /test/OUT2/ENA_download
#28 91.56 [2024-10-03T18:42:00+0000 viridian INFO] Return code 0 from: rm -rf /test/OUT2/ENA_download
#28 91.56 [2024-10-03T18:42:00+0000 viridian INFO] Finished tidying
#28 91.59 [2024-10-03T18:42:00+0000 viridian INFO] Writing JSON log file /test/OUT2/log.json.gz
#28 91.62 [2024-10-03T18:42:00+0000 viridian INFO] Finished running viridian. Result: Success
#28 91.71       35 OUT2/consensus.fa.gz
#28 91.71       75 OUT2/log.json.gz
#28 91.71     2136 OUT2/qc.tsv.gz
#28 91.73   328233 OUT2/reference_mapped.bam
#28 91.73   330479 total
#28 91.73 ##fileformat=VCFv4.2
#28 91.73 ##contig=<ID=MN908947.3,length=29903>
#28 91.73 ##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#28 91.73 ##INFO=<ID=AMP,Number=.,Type=String,Description="List of amplicon(s) overlapping the REF allele">
#28 91.73 ##INFO=<ID=PRIMER,Number=.,Type=String,Description="List of primers(s) overlapping the REF allele">
#28 91.73 ##INFO=<ID=CONS_POS,Number=1,Type=Integer,Description="Start position of ALT allele in consensus sequence">
#28 91.73 ##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Total clean read depth">
#28 91.73 ##FORMAT=<ID=CDP,Number=1,Type=Integer,Description="Total clean read depth that agrees with the ALT allele (ie the consensus sequence)">
#28 91.73 ##FILTER=<ID=PASS,Description="Variant passed all filters">
#28 91.73 ##FILTER=<ID=ASY,Description="Assembly (before QC) put an N at this position">
#28 DONE 91.7s
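
Two lines in the log above are the ones worth checking automatically: the final `Result: Success` line and the fraction of Ns reported in the final QC step. A minimal sketch, assuming the build output has been saved to a file; both function names are hypothetical and neither is part of viridian or of this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed only if a saved build log contains
# viridian's final success line (matched as a fixed string, not a regex).
viridian_succeeded() {
  grep -qF 'Finished running viridian. Result: Success' "$1"
}

# Hypothetical helper: recompute the percentage of Ns from the two counts
# printed in the "Final QC checks" line, e.g. "(47/29753)".
n_percent() {
  awk -v n="$1" -v total="$2" 'BEGIN { printf "%.2f\n", 100 * n / total }'
}

n_percent 47 29753   # prints 0.16, matching the 0.16% in the log
```

Calling `viridian_succeeded build.log` (with `build.log` standing in for wherever the output was saved) returns a non-zero status if the pipeline did not reach its final success message.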

erinyoung commented 2 weeks ago

You can check the status of the deploy here: https://github.com/StaPH-B/docker-builds/actions/runs/11183380711