Closed ybdong919 closed 1 month ago
I think the executable is snpeff and not snpEff.
I can pull up help options this way:
$ singularity exec -B ./:/data docker://staphb/snpeff:5.1 snpeff
INFO: Using cached SIF image
SnpEff version SnpEff 5.1d (build 2022-06-29 19:08), by Pablo Cingolani
Usage: snpEff [command] [options] [files]
Run 'java -jar snpEff.jar command' for help on each specific command
Available commands:
[eff|ann] : Annotate variants / calculate effects (you can use either 'ann' or 'eff', they mean the same). Default: ann (no command or 'ann').
build : Build a SnpEff database.
buildNextProt : Build a SnpEff for NextProt (using NextProt's XML files).
cds : Compare CDS sequences calculated form a SnpEff database to the one in a FASTA file. Used for checking databases correctness.
closest : Annotate the closest genomic region.
count : Count how many intervals (from a BAM, BED or VCF file) overlap with each genomic interval.
databases : Show currently available databases (from local config file).
download : Download a SnpEff database.
dump : Dump to STDOUT a SnpEff database (mostly used for debugging).
genes2bed : Create a bed file from a genes list.
len : Calculate total genomic length for each marker type.
pdb : Build interaction database (based on PDB data).
protein : Compare protein sequences calculated form a SnpEff database to the one in a FASTA file. Used for checking databases correctness.
seq : Show sequence (from command line) translation.
show : Show a text representation of genes or transcripts coordiantes, DNA sequence and protein sequence.
translocReport : Create a translocations report (from VCF file).
Generic options:
-c , -config : Specify config file
-configOption name=value : Override a config file option
-d , -debug : Debug mode (very verbose).
-dataDir <path> : Override data_dir parameter from config file.
-download : Download a SnpEff database, if not available locally. Default: true
-nodownload : Do not download a SnpEff database, if not available locally.
-h , -help : Show this help and exit
-noLog : Do not report usage statistics to server
-q , -quiet : Quiet mode (do not show any messages or errors)
-v , -verbose : Verbose mode
-version : Show version number and exit
Database options:
-canon : Only use canonical transcripts.
-canonList <file> : Only use canonical transcripts, replace some transcripts using the 'gene_id transcript_id' entries in <file>.
-interaction : Annotate using inteactions (requires interaciton database). Default: true
-interval <file> : Use a custom intervals in TXT/BED/BigBed/VCF/GFF file (you may use this option many times)
-maxTSL <TSL_number> : Only use transcripts having Transcript Support Level lower than <TSL_number>.
-motif : Annotate using motifs (requires Motif database). Default: true
-nextProt : Annotate using NextProt (requires NextProt database).
-noGenome : Do not load any genomic database (e.g. annotate using custom files).
-noExpandIUB : Disable IUB code expansion in input variants
-noInteraction : Disable inteaction annotations
-noMotif : Disable motif annotations.
-noNextProt : Disable NextProt annotations.
-onlyReg : Only use regulation tracks.
-onlyProtein : Only use protein coding transcripts. Default: false
-onlyTr <file.txt> : Only use the transcripts in this file. Format: One transcript ID per line.
-reg <name> : Regulation track to use (this option can be used add several times).
-ss , -spliceSiteSize <int> : Set size for splice sites (donor and acceptor) in bases. Default: 2
-spliceRegionExonSize <int> : Set size for splice site region within exons. Default: 3 bases
-spliceRegionIntronMin <int> : Set minimum number of bases for splice site region within intron. Default: 3 bases
-spliceRegionIntronMax <int> : Set maximum number of bases for splice site region within intron. Default: 8 bases
-strict : Only use 'validated' transcripts (i.e. sequence has been checked). Default: false
-ud , -upDownStreamLen <int> : Set upstream downstream interval length (in bases)
The documentation in the readme uses snpeff
Link to documentation : https://github.com/StaPH-B/docker-builds/tree/master/snpeff/5.2a
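To check which spelling of the executable an image actually provides, something like the following could be used. It is only a sketch: the helper name first_on_path is made up for illustration, and running it inside the container assumes the image ships bash.

```shell
#!/usr/bin/env bash

# Print the first candidate name that resolves to an executable on PATH.
# Returns non-zero if none of the candidates is found.
first_on_path() {
  local name
  for name in "$@"; do
    if command -v "$name" >/dev/null 2>&1; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

# Example (assumes bash exists in the image): pass the function definition
# into the container and probe both spellings there.
#   singularity exec docker://staphb/snpeff:5.1 \
#     bash -c "$(declare -f first_on_path); first_on_path snpeff snpEff"
```

If the image only provides snpeff, calling snpEff fails with the "executable file not found in $PATH" error.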
The database (Candida_auris) cannot be downloaded.
[dongyibo@login8 variants]$ singularity exec -B ./:/data docker://staphb/snpeff:5.1 snpeff download Candida_auris
INFO: Using cached SIF image
00:00:00 ERROR while connecting to https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_Candida_auris.zip
00:00:00 ERROR while connecting to https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_Candida_auris.zip
FATAL ERROR: Failed to download database from [https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_Candida_auris.zip, https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_Candida_auris.zip]
[dongyibo@login8 variants]$ singularity exec docker://staphb/snpeff:5.2a snpEff Candida_auris /blue/bphl-florida/dongyibo/Dev_Candida_PB/output-202402231521 23/variants/bc2085bc2085.vcf > /blue/bphl-florida/dongyibo/Dev_Candida_PB/output-20240223152123/variants/test.ann.vcf
INFO: Using cached SIF image
FATAL ERROR: Failed to download database from [https://snpeff.blob.core.windows.net/databases/v5_2/snpEff_v5_2_Candida_auris.zip, https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_Candida_auris.zip, https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_Candida_auris.zip]
[dongyibo@login8 variants]$ singularity exec docker://staphb/snpeff:5.2a snpeff Candida_auris /blue/bphl-florida/dongyibo/Dev_Candida_PB/output-202402231521 23/variants/bc2085bc2085.vcf > /blue/bphl-florida/dongyibo/Dev_Candida_PB/output-20240223152123/variants/test.ann.vcf
INFO: Using cached SIF image
FATAL ERROR: Failed to download database from [https://snpeff.blob.core.windows.net/databases/v5_2/snpEff_v5_2_Candida_auris.zip, https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_Candida_auris.zip, https://snpeff.blob.core.windows.net/databases/v5_1/snpEff_v5_1_Candida_auris.zip]
I think it's because the files don't exist?
I ran this outside of a container, just on an ubuntu VM:
$ wget https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_Candida_auris.zip
--2024-02-27 14:39:23-- https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_Candida_auris.zip
Resolving snpeff.blob.core.windows.net (snpeff.blob.core.windows.net)... 52.239.234.228
Connecting to snpeff.blob.core.windows.net (snpeff.blob.core.windows.net)|52.239.234.228|:443... connected.
HTTP request sent, awaiting response... 404 The specified blob does not exist.
2024-02-27 14:39:23 ERROR 404: The specified blob does not exist..
Yep, it looks like others have run into similar issues when downloading these files: https://github.com/pcingola/SnpEff/issues/503
I don't think this is related to our docker image, but I will keep this issue open for now.
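The wget check above can be repeated for each URL that snpeff tries. The sketch below rebuilds the URLs from the pattern seen in the error messages and asks curl for the HTTP status of each one. The function names db_url and check_db are hypothetical, and the URL pattern is inferred only from the log output in this thread.

```shell
#!/usr/bin/env bash

# Base URL taken from the error messages in this thread.
BASE_URL="https://snpeff.blob.core.windows.net/databases"

# Build the download URL for a version (e.g. 5_1) and a genome name.
db_url() {
  local version="$1" genome="$2"
  printf '%s/v%s/snpEff_v%s_%s.zip\n' "$BASE_URL" "$version" "$version" "$genome"
}

# Print "<http status> <url>" for each version given (needs curl and network).
# curl reports 000 when no HTTP response was received at all.
check_db() {
  local genome="$1" version url status
  shift
  for version in "$@"; do
    url="$(db_url "$version" "$genome")"
    status="$(curl -s -o /dev/null -I -w '%{http_code}' "$url" || true)"
    printf '%s %s\n' "${status:-000}" "$url"
  done
}

# Example:
#   check_db Candida_auris 5_2 5_1 5_0
```

A 404 status for every version would match the wget result above and point to missing files on the server rather than a problem inside the container.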
How can I use my own database with this docker image?
I mean that I need to build a database myself. Can I then use it in the docker container?
I'm not familiar with using snpeff, but you should be able to build your own database. You will need to mount (docker) or bind (singularity) a host directory (like /blue/bphl-florida/dongyibo/snpeff_database) to a place in the container (like /database), then build or download the database to that location in the container (/database in this example). When it's done running, your database will be in your bound directory (/blue/bphl-florida/dongyibo/snpeff_database in this example).
Something like...
singularity exec --bind /blue/bphl-florida/dongyibo/snpeff_database:/database docker://staphb/snpeff:5.2a snpeff <insert commands that build or download the database> /database
By design, a container can't see directories on your system unless you mount (or bind) them into it.
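The bind-mount approach described above might look roughly like this for a custom database. Treat it as an untested sketch under several assumptions: the genome name Candida_auris_custom, the host directory, and the function names are hypothetical. The data/<genome>/sequences.fa and genes.gff layout, the "<genome>.genome" config entry, and the build -gff3 command follow the SnpEff documentation on building databases. The -c, -dataDir and -nodownload options are the ones listed in the help output above. Recent SnpEff versions may also need -noCheckCds -noCheckProtein on the build command if no CDS or protein FASTA files are supplied.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical defaults; adjust to your own genome name, paths and image tag.
GENOME="${GENOME:-Candida_auris_custom}"
HOST_DB="${HOST_DB:-$PWD/snpeff_database}"
IMAGE="${IMAGE:-docker://staphb/snpeff:5.2a}"

# Lay out the reference FASTA and GFF3 the way 'snpEff build' expects them,
# and write a minimal config that registers the genome (assumed sufficient
# because the data directory is passed explicitly with -dataDir).
prepare_db_dir() {
  local fasta="$1" gff="$2"
  mkdir -p "$HOST_DB/data/$GENOME"
  cp "$fasta" "$HOST_DB/data/$GENOME/sequences.fa"
  cp "$gff" "$HOST_DB/data/$GENOME/genes.gff"
  printf '%s.genome : %s\n' "$GENOME" "$GENOME" > "$HOST_DB/snpEff.config"
}

# Print (do not run) the command that builds the database inside the container.
build_cmd() {
  printf 'singularity exec --bind %s:/database %s snpeff build -gff3 -v -c /database/snpEff.config -dataDir /database/data %s\n' \
    "$HOST_DB" "$IMAGE" "$GENOME"
}

# Print (do not run) the command that annotates a VCF with the custom database.
# The directory holding the VCF is bound to /data in the container.
annotate_cmd() {
  local vcf="$1"
  printf 'singularity exec --bind %s:/database --bind %s:/data %s snpeff ann -nodownload -c /database/snpEff.config -dataDir /database/data %s /data/%s\n' \
    "$HOST_DB" "$(dirname "$vcf")" "$IMAGE" "$GENOME" "$(basename "$vcf")"
}
```

After calling prepare_db_dir with your FASTA and GFF3 files, the printed commands can be reviewed and run by hand, redirecting the output of the annotate command to a new VCF. Paths that contain spaces would need quoting before running the printed commands.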
What container were you trying to use, and how were you attempting to use it?
[dongyibo@login8 variants]$ singularity exec -B ./:/data docker://staphb/snpeff:5.1 snpEff ann Candida_auris /blue/bphl-florida/dongyibo/Dev_Candida_PB/output-20240223152123/variants/bc2085bc2085.variants_bcftools.vcf > /blue/bphl-florida/dongyibo/Dev_Candida_PB/output-20240223152123/variants/test.chr22.ann.vcf
INFO: Converting OCI blobs to SIF format
INFO: Starting build...
Getting image source signatures
Copying blob 66f0a9a81c21 done
Copying blob 11b3c9cbca2b done
Copying blob d7bfe07ed847 done
Copying blob 9b6eb5233646 done
Copying blob a90bb4189aae done
Copying blob b0684862500b done
Copying blob 4f4fb700ef54 done
Copying config 1562668ec9 done
Writing manifest to image destination
Storing signatures
2024/02/27 11:37:40 info unpack layer: sha256:d7bfe07ed8476565a440c2113cc64d7c0409dba8ef761fb3ec019d7e6b5952df
2024/02/27 11:37:40 info unpack layer: sha256:11b3c9cbca2b6982b563fbd6919b029859d28477e927582dfc9b34da1df4e1c7
2024/02/27 11:37:42 warn rootless{usr/lib/x86_64-linux-gnu/gstreamer1.0/gstreamer-1.0/gst-ptp-helper} ignoring (usually) harmless EPERM on setxattr "security.capability"
2024/02/27 11:37:44 info unpack layer: sha256:9b6eb5233646ef9536c0db358cfd8ec8ad73fae0acac800dcc3f86a38d9ce48e
2024/02/27 11:37:46 info unpack layer: sha256:b0684862500b9fa0cf335bee0d99976faa9d601defe67091cc09764758032607
2024/02/27 11:37:48 info unpack layer: sha256:a90bb4189aaed6b4233812724d91d61a2b0ab10d8db2a3544085c317f44ff30f
2024/02/27 11:37:51 info unpack layer: sha256:66f0a9a81c2173031ef6ab6b69bbbe102f519f7fd6e15f7f5bf61580a54786c8
2024/02/27 11:37:51 info unpack layer: sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1
INFO: Creating SIF file...
FATAL: "snpEff": executable file not found in $PATH
Relevant log output
No response