Closed kapsakcj closed 1 month ago
This is not urgent, so don't worry about reviewing soon, but I wanted to let y'all know that this PR is ready for review.
Looks like the tests worked:
#9 [test 1/2] RUN tblastn -version && stxtyper --version && stxtyper --help && cd /stxtyper && bash test_stxtyper.sh
#9 0.084 tblastn: 2.12.0+
#9 0.084 Package: blast 2.12.0, build Mar 8 2022 16:19:08
#9 0.088 1.0.24
#9 0.090 Determine stx type(s) of a genome, print .tsv-file
#9 0.090
#9 0.090 USAGE: stxtyper [--nucleotide NUC_FASTA] [--name NAME] [--output OUTPUT_FILE] [--blast_bin BLAST_DIR] [--amrfinder] [--print_node] [--nucleotide_output NUC_FASTA_OUT] [--debug] [--log LOG] [--quiet]
#9 0.090 HELP: stxtyper --help or stxtyper -h
#9 0.090 VERSION: stxtyper --version or stxtyper -v
#9 0.090
#9 0.090 NAMED PARAMETERS
#9 0.090 -n NUC_FASTA, --nucleotide NUC_FASTA
#9 0.090 Input nucleotide FASTA file (can be gzipped)
#9 0.090 --name NAME
#9 0.090 Text to be added as the first column "name" to all rows of the report, for example it can be an assembly name
#9 0.090 -o OUTPUT_FILE, --output OUTPUT_FILE
#9 0.090 Write output to OUTPUT_FILE instead of STDOUT
#9 0.090 --blast_bin BLAST_DIR
#9 0.090 Directory for BLAST. Deafult: $BLAST_BIN
#9 0.090 --amrfinder
#9 0.090 Print output in the nucleotide AMRFinderPlus format
#9 0.090 --print_node
#9 0.090 Print AMRFinderPlus hierarchy node
#9 0.090 --nucleotide_output NUC_FASTA_OUT
#9 0.090 Output nucleotide FASTA file of reported nucleotide sequences
#9 0.090 --debug
#9 0.090 Integrity checks
#9 0.090 --log LOG
#9 0.090 Error log file, appended, opened on application start
#9 0.090 -q, --quiet
#9 0.090 Suppress messages to STDERR
#9 0.090
#9 0.090 Temporary directory used is $TMPDIR or "/tmp"
#9 0.092 Testing ./stxtyper
#9 0.092 To test stxtyper in your path run 'test_stxtyper.sh path'
#9 0.098 Running: ./stxtyper --nucleotide_output test/basic.nuc_out.got -n test/basic.fa
#9 0.098 Software directory: '/stxtyper/'
#9 0.098 Version: 1.0.24
#9 11.60 stxtyper took 11 seconds to complete
#9 11.60 ok: test/basic.fa
#9 11.60 ok: --nucleotide_output test/basic.nuc_out.got options worked
#9 11.61 Running: ./stxtyper -n test/synthetics.fa
#9 11.61 Software directory: '/stxtyper/'
#9 11.61 Version: 1.0.24
#9 23.42 stxtyper took 12 seconds to complete
#9 23.43 ok: test/synthetics.fa
#9 23.43 Running: ./stxtyper -n test/virulence_ecoli.fa
#9 23.43 Software directory: '/stxtyper/'
#9 23.43 Version: 1.0.24
#9 42.10 stxtyper took 19 seconds to complete
#9 42.11 ok: test/virulence_ecoli.fa
#9 42.11 Running: ./stxtyper -n test/cases.fa
#9 42.11 Software directory: '/stxtyper/'
#9 42.11 Version: 1.0.24
#9 53.74 stxtyper took 11 seconds to complete
#9 53.75 ok: test/cases.fa
#9 53.75 Running: ./stxtyper --amrfinder -n test/amrfinder_integration.fa
#9 53.75 Software directory: '/stxtyper/'
#9 53.75 Version: 1.0.24
#9 65.34 stxtyper took 12 seconds to complete
#9 65.34 ok: test/amrfinder_integration.fa
#9 65.35 Running: ./stxtyper --amrfinder --print_node -n test/amrfinder_integration2.fa
#9 65.35 Software directory: '/stxtyper/'
#9 65.35 Version: 1.0.24
#9 76.99 stxtyper took 11 seconds to complete
#9 76.99 ok: test/amrfinder_integration2.fa
#9 76.99 Done.
#9 76.99
#9 76.99
#9 76.99 ok: all 7 stxtyper tests passed
#9 DONE 77.0s
#10 [test 2/2] RUN echo "downloading test genome & running through stxtyper..." && wget -q https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/012/224/845/GCA_012224845.2_ASM1222484v2/GCA_012224845.2_ASM1222484v2_genomic.fna.gz && stxtyper -n GCA_012224845.2_ASM1222484v2_genomic.fna.gz | tee test-result.tsv && grep 'stx2o' test-result.tsv | grep 'COMPLETE'
#10 0.071 downloading test genome & running through stxtyper...
#10 0.221 Running: stxtyper -n GCA_012224845.2_ASM1222484v2_genomic.fna.gz
#10 0.221 Software directory: '/stxtyper/'
#10 0.221 Version: 1.0.24
#10 24.27 #target_contig stx_type operon identity target_start target_stop target_strand A_reference A_reference_subtype A_identity A_coverage B_reference B_reference_subtype B_identity B_coverage
#10 24.27 CP113091.1 stx2o COMPLETE 100.00 2085533 2086768 + WAK52085.1 stxA2o 100.00 100.00 QZL10983.1 stxB2o 100.00 100.00
#10 24.27 stxtyper took 24 seconds to complete
#10 24.27 CP113091.1 stx2o COMPLETE 100.00 2085533 2086768 + WAK52085.1 stxA2o 100.00 100.00 QZL10983.1 stxB2o 100.00 100.00
#10 DONE 24.3s
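The final check in the second test step greps the whole report for `stx2o` and `COMPLETE`. A column-aware variant could look like the sketch below. It assumes the report is tab-separated with `stx_type` in column 2 and `operon` in column 3, as the header line in the log suggests; `has_stx_call` is a hypothetical helper name, not part of stxtyper.

```shell
#!/usr/bin/env bash
# Sketch only: a stricter alternative to
#   grep 'stx2o' test-result.tsv | grep 'COMPLETE'
# Assumption: the report is tab-separated, with stx_type in column 2
# and operon in column 3 (taken from the header line in the log).

# has_stx_call FILE STX_TYPE OPERON
# Hypothetical helper: returns 0 if any row has exactly this stx_type
# and operon status, 1 otherwise.
has_stx_call() {
  local tsv="$1" stx_type="$2" operon="$3"
  awk -F'\t' -v t="$stx_type" -v o="$operon" '
    $2 == t && $3 == o { found = 1 }
    END { exit(found ? 0 : 1) }
  ' "$tsv"
}
```

For the report written by the test step, `has_stx_call test-result.tsv stx2o COMPLETE` would succeed only when both values appear in the same row and in the expected columns, whereas the chained `grep` also matches if either string appears anywhere on the line.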
I'm going to
Thank you for putting this together! You can check the status of the deploy here: https://github.com/StaPH-B/docker-builds/actions/runs/10910085372
Will fill this out later and mark ready for review after I've finalized things.
This PR adds stxtyper v1.0.24, the first versioned release of a new tool called stxtyper that detects and types Shiga toxin genes in bacterial genome assemblies. It also attempts to detect novel Shiga toxin subtypes in cases where the detected sequences diverge from the reference sequences. These genes are usually found in E. coli (STEC), but can also be found in Shigella species and, more rarely, in other genera such as Klebsiella. It is developed by NCBI in collaboration with a number of groups, including CDC, FDA, SSI, and others. A publication fully describing the tool and its validation is in the works, but a software release has been made so the community can test the software further and begin using the tool.
⚠️ I would caution against (clinical) reporting of results from this tool unless a validation has been performed by the user. The tool is performing well in our hands, but I would of course advise caution if/when reporting results.
Pull Request (PR) checklist:
docker build --tag samtools:1.15test --target test docker-builds/samtools/1.15
spades/3.12.0/Dockerfile
shigatyper/2.0.1/test.sh
spades/3.12.0/README.md
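The checklist's `docker build` example uses samtools 1.15. A minimal sketch of the same pattern for an arbitrary tool and version follows. `build_test_cmd` is a hypothetical helper that only prints the command, and the `docker-builds/<tool>/<version>` layout is an assumption extrapolated from the samtools example.

```shell
#!/usr/bin/env bash
# Sketch only: print the test-target build command for a given tool/version,
# following the pattern of the samtools example in the checklist.
# Assumption: the Dockerfile lives at docker-builds/<tool>/<version>.

# build_test_cmd TOOL VERSION
# Hypothetical helper: prints the command instead of running it,
# so it can be inspected before invoking Docker.
build_test_cmd() {
  local tool="$1" version="$2"
  printf 'docker build --tag %s:%stest --target test docker-builds/%s/%s\n' \
    "$tool" "$version" "$tool" "$version"
}
```

Calling `build_test_cmd stxtyper 1.0.24` prints the command that would build this PR's image to its `test` target under that assumed layout.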