jodyphelan / TBProfiler

Profiling tool for Mycobacterium tuberculosis to detect resistance and strain type from WGS data
GNU General Public License v3.0
104 stars 43 forks

tb-profiler update_tbdb #213

Open BjoerlingTaul opened 2 years ago

BjoerlingTaul commented 2 years ago

taobilin@VirtualBox:~$ tb-profiler update_tbdb

Running command: set -u pipefail; git pull

Running command: set -u pipefail; tb-profiler create_db
Traceback (most recent call last):
  File "/home/taobilin/.local/bin/tb-profiler", line 693, in <module>
    args.func(args)
  File "/home/taobilin/.local/bin/tb-profiler", line 271, in main_update_tbdb
    pp.run_cmd("tb-profiler create_db %s" % tmp)
  File "/home/taobilin/.local/lib/python3.9/site-packages/pathogenprofiler/utils.py", line 376, in run_cmd
    raise ValueError("Command Failed:\n%s\nstderr:\n%s" % (cmd,stderr.decode()))
ValueError: Command Failed:
set -u pipefail; tb-profiler create_db
stderr:

Running command: set -u pipefail; snpEff ann Mycobacterium_tuberculosis_h37rv 7d38f5ba-d72b-43d7-a6f8-0c40b28d1001
Traceback (most recent call last):
  File "/home/taobilin/.local/bin/tb-profiler", line 693, in <module>
    args.func(args)
  File "/home/taobilin/.local/bin/tb-profiler", line 277, in main_create_db
    tbp.create_db(args)
  File "/home/taobilin/.local/lib/python3.9/site-packages/tbprofiler/db.py", line 532, in create_db
    mut = mutation_lookup[(row["Gene"],row["Mutation"])]
KeyError: ('tlyA', 'c.100delA')

################################# ERROR #######################################

This run has failed. Please check all arguments and make sure all input files exist. If no solution is found, please open up an issue at https://github.com/jodyphelan/TBProfiler/issues/new and paste or attach the contents of the error log (tbdb.errlog)

###############################################################################

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/taobilin/.local/bin/tb-profiler", line 31, in cleanup
    with open(outfile, "w") as O:
UnboundLocalError: local variable 'outfile' referenced before assignment
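
The closing UnboundLocalError is a secondary failure: the exit handler refers to `outfile` although the run stopped before that name was assigned. The following is a minimal sketch of that pattern and one way to guard against it. It is not TBProfiler's actual cleanup code; the function names and the log file name are hypothetical.

```python
def cleanup_unguarded(run_started):
    # 'outfile' is only bound when the run got far enough to set it.
    if run_started:
        outfile = "example.errlog.txt"  # hypothetical file name
    # When run_started is False the name was never bound, so this line
    # raises UnboundLocalError.
    return outfile


def cleanup_guarded(run_started):
    # Bind the name up front so the handler can always test it.
    outfile = None
    if run_started:
        outfile = "example.errlog.txt"  # hypothetical file name
    if outfile is None:
        return None  # nothing to write, so exit quietly
    return outfile


try:
    cleanup_unguarded(run_started=False)
    unguarded_result = "no error"
except UnboundLocalError as exc:
    unguarded_result = type(exc).__name__

print(unguarded_result)
print(cleanup_guarded(run_started=False))
```

With a guard like this, an early failure would show only the original error rather than a second traceback from the exit handler.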

jodyphelan commented 2 years ago

Hi @BjoerlingTaul ,

This looks like an issue with snpEff. Could you post the output of the following command?

snpEff download Mycobacterium_tuberculosis_h37rv

BjoerlingTaul commented 2 years ago

Hello Jody, the following is the output of the command "snpEff download Mycobacterium_tuberculosis_h37rv".

[attached images 1 and 2]

jodyphelan commented 2 years ago

Not entirely sure what the issue is here. Which version of tb-profiler are you using?

BjoerlingTaul commented 2 years ago

Hi Jody, this problem is still not solved. [image]

jodyphelan commented 2 years ago

It looks like there is a problem connecting to GitHub, so that is one issue. Can you try running git clone https://github.com/jodyphelan/tbdb.git and see if that works? Did you install this through conda? If not, that might be a way to solve some of the issues. If conda is not an option, you should make sure you are using the latest release (available here). Based on the output, it looks like this isn't the latest version.

BjoerlingTaul commented 2 years ago

Hi Jody, I have updated TBProfiler to version 4.2.0, but the command "tb-profiler update_tbdb" still fails. The output is as follows:

(base) taobilin@taobilin-VirtualBox:~$ tb-profiler
usage: tb-profiler {profile,vcf_profile,fasta_profile,lineage,spoligotype,collate,reprofile,reformat,create_db,update_tbdb,batch,version} ...

TBProfiler pipeline

positional arguments:
  {profile,vcf_profile,fasta_profile,lineage,spoligotype,collate,reprofile,reformat,create_db,update_tbdb,batch,version}
                        Task to perform
    profile             Run whole profiling pipeline
    vcf_profile         Run profiling pipeline on VCF file. Warning: this assumes that you have good coverage across the genome
    fasta_profile       Run profiling pipeline on Fasta file. Warning: this assumes that this is a good quality assembly which coveres all drug resistance loci
    lineage             Profile only lineage
    spoligotype         Profile spoligotype (experimental feature)
    collate             Collate results form multiple samples together
    reprofile           Reprofile previous results using a new library. The new library must have same targets and the old one.
    reformat            Reformat json results into text or csv
    create_db           Generate the files required to run TBProfiler
    update_tbdb         Pull the latest tbdb library and load
    batch               Run tb-profiler for several samples
    version             Output program version and exit

(base) taobilin@taobilin-VirtualBox:~$ tb-profiler version
TBProfiler version 4.2.0

(base) taobilin@taobilin-VirtualBox:~$ tb-profiler update_tbdb
Running command:
set -u pipefail; git checkout master
Running command:
set -u pipefail; git pull
Running command:
set -u pipefail; tb-profiler create_db --load

Traceback (most recent call last):
  File "/home/taobilin/anaconda3/bin/tb-profiler", line 597, in <module>
    args.func(args)
  File "/home/taobilin/anaconda3/bin/tb-profiler", line 200, in main_update_tbdb
    pp.run_cmd("tb-profiler create_db %s --load" % tmp)
  File "/home/taobilin/anaconda3/lib/python3.8/site-packages/pathogenprofiler/utils.py", line 397, in run_cmd
    raise ValueError("Command Failed:\n%s\nstderr:\n%s" % (cmd,stderr.decode()))
ValueError: Command Failed:
set -u pipefail; tb-profiler create_db --load
stderr:
Converting 580 mutations
Running command:
set -u pipefail; snpEff ann Mycobacterium_tuberculosis_h37rv 5504e2fe-0589-4305-b41f-283664342980
Traceback (most recent call last):
  File "/home/taobilin/anaconda3/bin/tb-profiler", line 597, in <module>
    args.func(args)
  File "/home/taobilin/anaconda3/bin/tb-profiler", line 212, in main_create_db
    pp.create_db(args,extra_files=extra_files)
  File "/home/taobilin/anaconda3/lib/python3.8/site-packages/pathogenprofiler/db.py", line 560, in create_db
    mut = mutation_lookup[(row["Gene"],row["Mutation"])]
KeyError: ('tlyA', 'c.100delA')

Cleaning up after failed run
################################# ERROR #######################################
This run has failed. Please check all arguments and make sure all input files
exist. If no solution is found, please open up an issue at
https://github.com/jodyphelan/TBProfiler/issues/new and paste or attach the
contents of the error log (tbdb.errlog)
###############################################################################

Cleaning up after failed run
################################# ERROR #######################################
This run has failed. Please check all arguments and make sure all input files
exist. If no solution is found, please open up an issue at
https://github.com/jodyphelan/TBProfiler/issues/new and paste or attach the
contents of the error log (7d40eee1-0da7-4685-a2bb-4acd1e4bcf5f.errlog.txt)
###############################################################################

jodyphelan commented 2 years ago

Hi @BjoerlingTaul ,

It is difficult to see what the exact issue is but I believe it is something to do with the variants returned by snpEff. If you want I can take a look at your setup using a remote desktop viewer to see what the issue might be. Drop me an email if that works for you.
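
The KeyError in the log above means that the pair ('tlyA', 'c.100delA') was requested from a lookup that did not contain it. As a rough illustration of that failure mode, a pre-check can list every missing pair instead of stopping at the first one. This is only a sketch, not the pathogenprofiler implementation: the function name, the rows, and the lookup contents are invented for the example.

```python
def find_missing(rows, mutation_lookup):
    """Return the (gene, mutation) pairs that the lookup does not contain."""
    missing = []
    for row in rows:
        key = (row["Gene"], row["Mutation"])
        if key not in mutation_lookup:
            missing.append(key)
    return missing


# Invented example data: the library lists two mutations,
# but the lookup only has an entry for one of them.
rows = [
    {"Gene": "rpoB", "Mutation": "p.Ser450Leu"},
    {"Gene": "tlyA", "Mutation": "c.100delA"},
]
mutation_lookup = {("rpoB", "p.Ser450Leu"): "placeholder annotation"}

print(find_missing(rows, mutation_lookup))  # [('tlyA', 'c.100delA')]
```

A list like this would show whether a single variant is affected or whether the annotation step returned nothing usable at all.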

BjoerlingTaul commented 2 years ago

Thank you very much! It may have been a problem with the conda environments. TBProfiler works after activating the right environment with the following commands:

taobilin@VirtualBox:~$ conda info --envs
base                     /home/taobilin/miniconda3
TBprofiler2              /home/taobilin/miniconda3/envs/TBprofiler2
snippy                   /home/taobilin/miniconda3/envs/snippy

taobilin@VirtualBox:~$ conda activate TBprofiler2
(TBprofiler2) taobilin@VirtualBox:~$
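
Since the earlier tracebacks came from different installations (~/.local/bin, then anaconda3/bin), it can help to confirm which executable the active environment actually resolves. The sketch below uses only the Python standard library. The helper name is hypothetical, and CONDA_DEFAULT_ENV is the variable conda sets on activation, so it will be empty outside conda.

```python
import os
import shutil
import sys


def describe_environment(tool="tb-profiler"):
    """Report the active conda env, the Python prefix, and the tool on PATH."""
    return {
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # None outside conda
        "python_prefix": sys.prefix,
        "tool_path": shutil.which(tool),  # None if the tool is not on PATH
    }


for name, value in describe_environment().items():
    print(f"{name}: {value}")
```

If tool_path points outside the environment named in conda_env, the command being run is probably not the installation you meant to update.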

ChiehYin commented 2 years ago

Sorry, I'm not sure whether this is the same issue, but I also encountered an error when using update_tbdb.

I tried to download the latest version of TBProfiler. However, the log shows that the tbdb-related files are loaded from the directory below, and those files are dated 2021/11/23, so it seems that an older version of the tbdb database is being used. (Is there a way to tell which version of the tbdb database is in use?) I tried to update tbdb, but it failed.

Using gff file: /home/PH_linlab/.conda/envs/tbprofiler/share/tbprofiler/tbdb.gff
Using ref file: /home/PH_linlab/.conda/envs/tbprofiler/share/tbprofiler/tbdb.fasta
.......

####### Environment #############
Ubuntu
TBProfiler version 4.0.3

######### Log ##################
$ tb-profiler update_tbdb

Running command: set -u pipefail; git pull

Running command: set -u pipefail; tb-profiler create_db
Traceback (most recent call last):
  File "/home/PH_linlab/.conda/envs/tbprofiler/bin/tb-profiler", line 582, in <module>
    args.func(args)
  File "/home/PH_linlab/.conda/envs/tbprofiler/bin/tb-profiler", line 231, in main_update_tbdb
    pp.run_cmd("tb-profiler create_db %s" % tmp)
  File "/home/PH_linlab/.conda/envs/tbprofiler/lib/python3.7/site-packages/pathogenprofiler/utils.py", line 255, in run_cmd
    raise ValueError("Command Failed:\n%s\nstderr:\n%s" % (cmd,stderr.decode()))
ValueError: Command Failed:
set -u pipefail; tb-profiler create_db
stderr:
Don't know how to handle this mutation: ethA c.-1058_968del

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/PH_linlab/.conda/envs/tbprofiler/bin/tb-profiler", line 30, in cleanup
    with open(outfile, "w") as O:
UnboundLocalError: local variable 'outfile' referenced before assignment
##################################################################

Thank you!
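
The comment above judges the age of the installed database from the dates of the files in the share directory. The same check can be sketched in a few lines of Python. This only reads file modification times; it is not an official way to query the tbdb version. The helper name is hypothetical, and the directory is simply the one shown in the log.

```python
from datetime import datetime
from pathlib import Path


def list_db_files(share_dir, prefix="tbdb"):
    """Return (file name, modification date) for files named <prefix>.*"""
    share = Path(share_dir)
    if not share.is_dir():
        return []  # directory missing: nothing to report
    entries = []
    for path in sorted(share.glob(prefix + ".*")):
        modified = datetime.fromtimestamp(path.stat().st_mtime)
        entries.append((path.name, modified.strftime("%Y/%m/%d")))
    return entries


# Directory taken from the log above; adjust it to your own environment.
share_dir = "/home/PH_linlab/.conda/envs/tbprofiler/share/tbprofiler"
for name, date in list_db_files(share_dir):
    print(date, name)
```

Modification dates reflect when the files were last written on that machine, so they are a hint about the database age rather than a version number.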

BjoerlingTaul commented 2 years ago

You could try removing the old TBProfiler version 4.0.3 and installing the newest version, 4.3.0. Then check your conda environments with the command "conda info --envs" and activate the right one!

jodyphelan commented 2 years ago

Thanks for helping out @BjoerlingTaul! @ChiehYin let us know if your issue persists.

ChiehYin commented 2 years ago

Thank you Bjoerling!

I tried deleting the older version of TBProfiler and even created a new environment to reinstall it (on CentOS 7), but I still get an error when running "tb-profiler update_tbdb" (the output is below). I am also curious whether there is any record or note of the tbdb database version used in each TBProfiler run. Many thanks!

##############################################################
(tbprofiler4) [linlab@linlab ~]$ tb-profiler version

TBProfiler version 4.0.3

(tbprofiler4) [linlab@linlab ~]$ tb-profiler update_tbdb

Running command: set -u pipefail; git clone https://github.com/jodyphelan/tbdb.git

Running command: set -u pipefail; git pull

Running command: set -u pipefail; tb-profiler create_db
Traceback (most recent call last):
  File "/home/linlab/.conda/envs/tbprofiler4/bin/tb-profiler", line 582, in <module>
    args.func(args)
  File "/home/linlab/.conda/envs/tbprofiler4/bin/tb-profiler", line 231, in main_update_tbdb
    pp.run_cmd("tb-profiler create_db %s" % tmp)
  File "/home/linlab/.conda/envs/tbprofiler4/lib/python3.7/site-packages/pathogenprofiler/utils.py", line 255, in run_cmd
    raise ValueError("Command Failed:\n%s\nstderr:\n%s" % (cmd,stderr.decode()))
ValueError: Command Failed:
set -u pipefail; tb-profiler create_db
stderr:
Don't know how to handle this mutation: ethA c.-1058_968del

Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/home/linlab/.conda/envs/tbprofiler4/bin/tb-profiler", line 30, in cleanup
    with open(outfile, "w") as O:
UnboundLocalError: local variable 'outfile' referenced before assignment
####################################################################

ryanjameskennedy commented 1 year ago

Hej, my issue is similar in the sense that I'm using the tb-profiler singularity image to install/create the database. To do this, I ran the following:

singularity exec --bind /data tbprofiler.sif tb-profiler update_tbdb -d tbprofiler_db

Which returned the following error:

Traceback (most recent call last):
  File "/usr/local/bin/tb-profiler", line 693, in <module>
    args.func(args)
  File "/usr/local/bin/tb-profiler", line 232, in main_create_db
    pp.create_db(args,extra_files=extra_files)
  File "/usr/local/lib/python3.9/site-packages/pathogenprofiler/db.py", line 734, in create_db
    load_db(variables_file,args.software_name)
  File "/usr/local/lib/python3.9/site-packages/pathogenprofiler/db.py", line 752, in load_db
    shutil.copyfile(source,target)
  File "/usr/local/lib/python3.9/shutil.py", line 266, in copyfile
    with open(dst, 'wb') as fdst:
OSError: [Errno 30] Read-only file system: '/usr/local/share/tbprofiler/tbdb.fasta'
Cleaning up after failed run
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/local/bin/tb-profiler", line 33, in cleanup
    del args.conf['json_db']
AttributeError: 'Namespace' object has no attribute 'conf'

Cleaning up after failed run
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/usr/local/bin/tb-profiler", line 33, in cleanup
    del args.conf['json_db']
AttributeError: 'Namespace' object has no attribute 'conf'

All of the files in /usr/local/share/tbprofiler/ are read-only if you are not root. Is there any way to fix this? Perhaps the ability to provide our own tbdb.fasta file in the arguments?
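
The OSError above comes from copying into a location the container user cannot write to. A pre-flight check along the following lines would surface that before any copy is attempted. It is a sketch using only os.access from the standard library, with a hypothetical helper name; it does not change where TBProfiler writes its files.

```python
import os


def is_writable_target(path):
    """Return True if 'path' could be written by the current user."""
    if os.path.exists(path):
        # Existing file: ask whether it can be opened for writing.
        return os.access(path, os.W_OK)
    # New file: the parent directory must exist and be writable.
    parent = os.path.dirname(path) or "."
    return os.access(parent, os.W_OK)


# Target path taken from the traceback above.
target = "/usr/local/share/tbprofiler/tbdb.fasta"
if is_writable_target(target):
    print("target is writable:", target)
else:
    print("target is not writable, the copy would fail:", target)
```

Inside a Singularity image the check is expected to report the target as not writable for a non-root user, which matches the read-only error in the traceback.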

whottel commented 6 months ago

Running into a similar issue as described by ryanjameskennedy when trying to update to the most recent version of the database using a singularity image. Was there a solution or workaround for this?

ryanjameskennedy commented 6 months ago

The workaround was to clone the TBProfiler repository and pip install it into a conda environment. TBProfiler's latest release (and the corresponding Singularity image) just doesn't contain the changes that fixed this issue yet.

whottel commented 6 months ago

Thanks for the reply. Are you saying that a future release will include an update that addresses this issue in another way than the workaround you mentioned?

ryanjameskennedy commented 6 months ago

Yeah exactly. The fix has been merged to the master branch, it just hasn't been released yet. Only releases can be added to bioconda-recipes (which is used to build the singularity image on galaxy project depot).

whottel commented 6 months ago

Great, thanks for that clarification.