jodyphelan / TBProfiler

Profiling tool for Mycobacterium tuberculosis to detect resistance and strain type from WGS data
GNU General Public License v3.0

TB-Profiler update_tbdb error #301

Closed idolawoye closed 9 months ago

idolawoye commented 10 months ago

Hi Jody,

I am trying to run tb-profiler on a SLURM HPC but I haven't been able to make any progress with setting up the database.

When I run tb-profiler update_tbdb I get the following:

[10:53:46] ERROR Traceback (most recent call last): utils.py:382
File "/home/olawoyei/.local/bin/tb-profiler", line 559,
in <module>
args.func(args)
File "/home/olawoyei/.local/bin/tb-profiler", line 202,
in main_create_db
pp.create_db(args,extra_files=extra_files)
File
"/home/olawoyei/.local/lib/python3.10/site-packages/patho
genprofiler/db.py", line 621, in create_db
mutation_lookup =
get_snpeff_formated_mutation_list(args.csv,"genome.fasta"
,"genome.gff",json.load(open("variables.json"))["snpEff_d
b"])
File
"/home/olawoyei/.local/lib/python3.10/site-packages/patho
genprofiler/db.py", line 398, in
get_snpeff_formated_mutation_list
mutation_conversion = get_ann(mutations,snpEffDB)
File
"/home/olawoyei/.local/lib/python3.10/site-packages/patho
genprofiler/db.py", line 172, in get_ann
for l in cmd_out(f"snpEff ann {snpEffDB} {uuid}"):
File
"/home/olawoyei/.local/lib/python3.10/site-packages/patho
genprofiler/utils.py", line 389, in cmd_out
run_cmd(cmd)
File
"/home/olawoyei/.local/lib/python3.10/site-packages/patho
genprofiler/utils.py", line 383, in run_cmd
raise ValueError("Command Failed:\n%s\nstderr:\n%s" %
(cmd,result.stderr.decode()))
ValueError: Command Failed:
/bin/bash -c set -o pipefail; snpEff ann
Mycobacterium_tuberculosis_h37rv
f0adbfc3-a58e-40f7-8c26-aacbadd63839 >
9cd7517b-1b6b-48ff-8989-e95c7a0a06d7
stderr:
Picked up JAVA_TOOL_OPTIONS: -Xmx2g
00:00:00 ERROR while connecting to
https://snpeff.blob.core.windows.net/databases/v5_0/snpEf
f_v5_0_Mycobacterium_tuberculosis_h37rv.zip
java.lang.RuntimeException:
java.io.FileNotFoundException:
/tmp/snpEff_v5_0_Mycobacterium_tuberculosis_h37rv.zip
(Permission denied)
at
org.snpeff.util.Download.download(Download.java:178)
at
org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.downlo
adAndInstall(SnpEffCmdDownload.java:32)
at
org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.runDow
nloadGenome(SnpEffCmdDownload.java:86)
at
org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.run(Sn
pEffCmdDownload.java:72)
at org.snpeff.SnpEff.run(SnpEff.java:1233)
at org.snpeff.SnpEff.loadDb(SnpEff.java:520)
at
org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffC
mdEff.java:939)
at
org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffC
mdEff.java:922)
at org.snpeff.SnpEff.run(SnpEff.java:1194)
at org.snpeff.SnpEff.main(SnpEff.java:167)
Caused by: java.io.FileNotFoundException:
/tmp/snpEff_v5_0_Mycobacterium_tuberculosis_h37rv.zip
(Permission denied)
at
java.base/java.io.FileOutputStream.open0(Native Method)
at
java.base/java.io.FileOutputStream.open(FileOutputStream.
java:292)
at
java.base/java.io.FileOutputStream.<init>(FileOutputStrea
m.java:235)
at
java.base/java.io.FileOutputStream.<init>(FileOutputStrea
m.java:124)
at
org.snpeff.util.Download.download(Download.java:153)
... 9 more
java.lang.RuntimeException: Genome download failed!
at org.snpeff.SnpEff.loadDb(SnpEff.java:521)
at
org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffC
mdEff.java:939)
at
org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffC
mdEff.java:922)
at org.snpeff.SnpEff.run(SnpEff.java:1194)
at org.snpeff.SnpEff.main(SnpEff.java:167)

                Cleaning up after failed run                                          

Traceback (most recent call last):
  File "/home/olawoyei/.local/bin/tb-profiler", line 559, in <module>
    args.func(args)
  File "/home/olawoyei/.local/bin/tb-profiler", line 180, in main_update_tbdb
    pp.run_cmd(f"tb-profiler create_db --prefix {args.prefix} {tmp} --load")
  File "/home/olawoyei/.local/lib/python3.10/site-packages/pathogenprofiler/utils.py", line 383, in run_cmd
    raise ValueError("Command Failed:\n%s\nstderr:\n%s" % (cmd,result.stderr.decode()))
ValueError: Command Failed:
/bin/bash -c set -o pipefail; tb-profiler create_db --prefix tbdb --load
stderr:
Traceback (most recent call last):
  File "/home/olawoyei/.local/bin/tb-profiler", line 559, in <module>
    args.func(args)
  File "/home/olawoyei/.local/bin/tb-profiler", line 202, in main_create_db
    pp.create_db(args,extra_files=extra_files)
  File "/home/olawoyei/.local/lib/python3.10/site-packages/pathogenprofiler/db.py", line 621, in create_db
    mutation_lookup = get_snpeff_formated_mutation_list(args.csv,"genome.fasta","genome.gff",json.load(open("variables.json"))["snpEff_db"])
  File "/home/olawoyei/.local/lib/python3.10/site-packages/pathogenprofiler/db.py", line 398, in get_snpeff_formated_mutation_list
    mutation_conversion = get_ann(mutations,snpEffDB)
  File "/home/olawoyei/.local/lib/python3.10/site-packages/pathogenprofiler/db.py", line 172, in get_ann
    for l in cmd_out(f"snpEff ann {snpEffDB} {uuid}"):
  File "/home/olawoyei/.local/lib/python3.10/site-packages/pathogenprofiler/utils.py", line 389, in cmd_out
    run_cmd(cmd)
  File "/home/olawoyei/.local/lib/python3.10/site-packages/pathogenprofiler/utils.py", line 383, in run_cmd
    raise ValueError("Command Failed:\n%s\nstderr:\n%s" % (cmd,result.stderr.decode()))
ValueError: Command Failed:
/bin/bash -c set -o pipefail; snpEff ann Mycobacterium_tuberculosis_h37rv f0adbfc3-a58e-40f7-8c26-aacbadd63839 > 9cd7517b-1b6b-48ff-8989-e95c7a0a06d7
stderr:
Picked up JAVA_TOOL_OPTIONS: -Xmx2g
00:00:00 ERROR while connecting to https://snpeff.blob.core.windows.net/databases/v5_0/snpEff_v5_0_Mycobacterium_tuberculosis_h37rv.zip
java.lang.RuntimeException: java.io.FileNotFoundException: /tmp/snpEff_v5_0_Mycobacterium_tuberculosis_h37rv.zip (Permission denied)
    at org.snpeff.util.Download.download(Download.java:178)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.downloadAndInstall(SnpEffCmdDownload.java:32)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.runDownloadGenome(SnpEffCmdDownload.java:86)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdDownload.run(SnpEffCmdDownload.java:72)
    at org.snpeff.SnpEff.run(SnpEff.java:1233)
    at org.snpeff.SnpEff.loadDb(SnpEff.java:520)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:939)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:922)
    at org.snpeff.SnpEff.run(SnpEff.java:1194)
    at org.snpeff.SnpEff.main(SnpEff.java:167)
Caused by: java.io.FileNotFoundException: /tmp/snpEff_v5_0_Mycobacterium_tuberculosis_h37rv.zip (Permission denied)
    at java.base/java.io.FileOutputStream.open0(Native Method)
    at java.base/java.io.FileOutputStream.open(FileOutputStream.java:292)
    at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:235)
    at java.base/java.io.FileOutputStream.<init>(FileOutputStream.java:124)
    at org.snpeff.util.Download.download(Download.java:153)
    ... 9 more
java.lang.RuntimeException: Genome download failed!
    at org.snpeff.SnpEff.loadDb(SnpEff.java:521)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:939)
    at org.snpeff.snpEffect.commandLine.SnpEffCmdEff.run(SnpEffCmdEff.java:922)
    at org.snpeff.SnpEff.run(SnpEff.java:1194)
    at org.snpeff.SnpEff.main(SnpEff.java:167)

Cleaning up after failed run

Cleaning up after failed run ERROR tb-profiler:58

################################# ERROR #######################################

This run has failed. Please check all arguments and make sure all input files
exist. If no solution is found, please open up an issue at
https://github.com/jodyphelan/TBProfiler/issues/new and paste or attach the
contents of the error log (tbdb.errlog.txt)

###############################################################################

I have tried running update_tbdb with --temp pointing to a directory where I have write permission, but it still produces the same error.

jodyphelan commented 9 months ago

Hi @idolawoye

/tmp/snpEff_v5_0_Mycobacterium_tuberculosis_h37rv.zip (Permission denied)

Apologies for the delay in getting back to you. It looks like a permissions error. When running update_tbdb, the pipeline downloads a few files, and it will fail if you don't have write access to the directories it uses. Did you install with conda?
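The stderr above shows the download failing on a specific file under /tmp, so it can help to test that exact path rather than the directory given to --temp. The following is a minimal sketch, not part of TB-Profiler: the helper names `can_create` and `java_tmpdir_env` are hypothetical, and the idea that snpEff places its download under the JVM's `java.io.tmpdir` location is an assumption that has not been confirmed in this thread. `JAVA_TOOL_OPTIONS` itself is picked up by the JVM, as the "Picked up JAVA_TOOL_OPTIONS" line in the log shows.

```python
import os
from pathlib import Path


def can_create(path: str) -> bool:
    """Return True if the current user could write `path`.

    If the file already exists (for example, left behind by another user
    on a shared node), the file itself must be writable; otherwise the
    parent directory must exist and allow creating entries.
    """
    p = Path(path)
    if p.exists():
        return os.access(p, os.W_OK)
    return p.parent.is_dir() and os.access(p.parent, os.W_OK | os.X_OK)


def java_tmpdir_env(directory: str, base_env=None) -> dict:
    """Return a copy of the environment with java.io.tmpdir set to `directory`.

    Any existing JAVA_TOOL_OPTIONS (such as -Xmx2g) are kept.
    ASSUMPTION: snpEff honours java.io.tmpdir for its downloads.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("JAVA_TOOL_OPTIONS", "")
    env["JAVA_TOOL_OPTIONS"] = f"{existing} -Djava.io.tmpdir={directory}".strip()
    return env


if __name__ == "__main__":
    # Path copied from the error message above.
    target = "/tmp/snpEff_v5_0_Mycobacterium_tuberculosis_h37rv.zip"
    print(target, "writable:", can_create(target))
```

If the check reports that the path is not writable, the environment returned by `java_tmpdir_env` could be passed to `subprocess.run(..., env=env)` when launching the update, pointing at a directory you own. Whether that resolves this particular failure is untested here.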

idolawoye commented 9 months ago

Hi @jodyphelan. I installed with pip, not conda, because conda isn't allowed on our HPC. I figured it was a permission issue with writing the tbdb files. Is there a way to redirect this to a directory that I have write access to?

jodyphelan commented 9 months ago

Which directories do you have access to? In theory, you can create the database locally and then upload all the files you need to the server. Then you can use --external_db to point to the uploaded database.

idolawoye commented 9 months ago

Yes, I have tried that but I ran into this error when I did:

tb-profiler profile -1 mtb/fastq/DRR034354_1.fastq.gz -2 mtb/fastq/DRR034354_2.fastq.gz --external_db ids_tbdb/
Traceback (most recent call last):
  File "/project/6083771/olawoyei/pipelineENV/bin/tb-profiler", line 559, in <module>
    args.func(args)
  File "/project/6083771/olawoyei/pipelineENV/bin/tb-profiler", line 84, in main_profile
    args.conf['variant_filters'] = pp.get_variant_filters(args)
TypeError: 'NoneType' object does not support item assignment
Cleaning up after failed run
Exception ignored in atexit callback: <function cleanup at 0x7fa29f0cc4c0>
Traceback (most recent call last):
  File "/project/6083771/olawoyei/pipelineENV/bin/tb-profiler", line 38, in cleanup
    del args.conf['json_db']
TypeError: 'NoneType' object does not support item deletion

idolawoye commented 9 months ago

I have write access to directories in my workspace, so I also tried using the --temp and --dir options to point to my workspace, but it still didn't work.

jodyphelan commented 9 months ago

tb-profiler profile -1 mtb/fastq/DRR034354_1.fastq.gz -2 mtb/fastq/DRR034354_2.fastq.gz --external_db ids_tbdb/

With your command above you are passing only the folder that contains the files, but the argument also needs to include the library prefix. For example, if you have the following directory structure:

├── ids_tbdb
│   ├── tbdb.barcode.bed
│   ├── tbdb.bed
│   ├── tbdb.dr.json
│   ├── tbdb.fasta
│   ├── tbdb.gff
│   ├── tbdb.variables.json
│   └── tbdb.version.json

Then the command needs to be:

tb-profiler profile -1 mtb/fastq/DRR034354_1.fastq.gz -2 mtb/fastq/DRR034354_2.fastq.gz --external_db ids_tbdb/tbdb
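
The folder-plus-prefix rule above can be checked before launching a run. This is a minimal sketch, not part of TB-Profiler: `external_db_argument` is a hypothetical helper, and the list of expected suffixes is taken only from the directory listing above, so other versions of the database may contain different files.

```python
from pathlib import Path

# Suffixes copied from the directory listing above (an assumption for
# other database versions, which may ship a different set of files).
EXPECTED_SUFFIXES = (
    "barcode.bed",
    "bed",
    "dr.json",
    "fasta",
    "gff",
    "variables.json",
    "version.json",
)


def external_db_argument(db_dir: str) -> str:
    """Return '<db_dir>/<prefix>', the value to pass to --external_db.

    The prefix is inferred from the single '<prefix>.variables.json'
    file in the folder, and every expected file is checked to exist.
    """
    folder = Path(db_dir)
    tail = ".variables.json"
    prefixes = {p.name[: -len(tail)] for p in folder.glob("*" + tail)}
    if len(prefixes) != 1:
        raise ValueError(
            f"expected exactly one *{tail} file in {folder}, found {len(prefixes)}"
        )
    prefix = prefixes.pop()
    missing = [
        s for s in EXPECTED_SUFFIXES if not (folder / f"{prefix}.{s}").is_file()
    ]
    if missing:
        raise FileNotFoundError(f"missing files for prefix '{prefix}': {missing}")
    return str(folder / prefix)
```

For the layout shown above, `external_db_argument("ids_tbdb")` would return `ids_tbdb/tbdb`, which is the value used in the corrected command.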

idolawoye commented 9 months ago

Thanks, that worked!