knights-lab / SHOGUN

SHallow shOtGUN profiler
GNU Affero General Public License v3.0

Issues with -a option, shogun pipeline command #43

Open rahel31 opened 2 years ago

rahel31 commented 2 years ago

Hi, I would like to analyze my shallow shotgun metagenomics data with the OGU method (https://journals.asm.org/doi/10.1128/msystems.00167-22). The recommended aligner is SHOGUN, especially, I assume, for shallow data. However, I have issues running it:

    shogun align \
      -i combined_seqs.fna \
      -a bowtie2 \
      -d /srv/beegfs/scratch/users/p/parkr/Classifiers/WoL_Globus_full/databases/shogun \
      -o shogun_wol_align

I get this error:

    Traceback (most recent call last):
      File "/var/spool/slurmd/job13597238/slurm_script", line 8, in <module>
        sys.exit(cli())
      File "/opt/ebsofts/QIIME2/2021.8/lib/python3.8/site-packages/click/core.py", line 829, in __call__
        return self.main(*args, **kwargs)
      File "/opt/ebsofts/QIIME2/2021.8/lib/python3.8/site-packages/click/core.py", line 782, in main
        rv = self.invoke(ctx)
      File "/opt/ebsofts/QIIME2/2021.8/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
        return _process_result(sub_ctx.command.invoke(sub_ctx))
      File "/opt/ebsofts/QIIME2/2021.8/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
        return ctx.invoke(self.callback, **ctx.params)
      File "/opt/ebsofts/QIIME2/2021.8/lib/python3.8/site-packages/click/core.py", line 610, in invoke
        return callback(*args, **kwargs)
      File "/opt/ebsofts/QIIME2/2021.8/lib/python3.8/site-packages/click/decorators.py", line 21, in new_func
        return f(get_current_context(), *args, **kwargs)
      File "/home/users/p/parkr/.local/lib/python3.8/site-packages/shogun/__main__.py", line 78, in align
        aligner_cl.align(input, output)
      File "/home/users/p/parkr/.local/lib/python3.8/site-packages/shogun/aligners/bowtie2_aligner.py", line 32, in align
        proc, out, err = bowtie2_align(infile, outfile, self.prefix,
      File "/home/users/p/parkr/.local/lib/python3.8/site-packages/shogun/wrappers/bowtie2_wrapper.py", line 39, in bowtie2_align
        return run_command(cmd, shell=shell)
      File "/home/users/p/parkr/.local/lib/python3.8/site-packages/shogun/utils/_utils.py", line 54, in run_command
        with subprocess.Popen(
      File "/opt/ebsofts/QIIME2/2021.8/lib/python3.8/subprocess.py", line 858, in __init__
        self._execute_child(args, executable, preexec_fn, close_fds,
      File "/opt/ebsofts/QIIME2/2021.8/lib/python3.8/subprocess.py", line 1704, in _execute_child
        raise child_exception_type(errno_num, err_msg, err_filename)
    FileNotFoundError: [Errno 2] No such file or directory: 'bowtie2'

If I don't provide -a, so that burst is used instead, I get this error: FileNotFoundError: [Errno 2] No such file or directory: 'burst15'

When I give it the bowtie2 directory from the WoL database, I get this error: Error: Invalid value for '-a' / '--aligner': invalid choice: /srv/beegfs/scratch/users/p/parkr/Classifiers/WoL_Globus_full/databases/bowtie2. (choose from all, bowtie2, burst, utree)

In the WoL_Globus_full folder there are these folders/files: [image]

When I copied the contents of the bowtie2 folder into the shogun folder, it didn't work either. The pre-built WoL databases were downloaded from Globus (https://biocore.github.io/wol/download).

Could you guide me on how to make it work?

Thank you in advance!!

bhillmann commented 2 years ago

This looks like an installation problem: the aligners are not on your PATH. Can you make sure that bowtie2 is installed and accessible on your PATH? The recommended way, with conda, requires the environment to be activated.
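A quick way to confirm this from the same Python environment that runs shogun is to ask whether the aligner executables can be resolved on the PATH. The sketch below is only an illustration: the helper name `find_missing_executables` is made up for this example, and the executable names `bowtie2` and `burst15` are taken from the FileNotFoundError messages above rather than from the SHOGUN documentation.

```python
import shutil


def find_missing_executables(names):
    """Return the executable names that cannot be resolved on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Names taken from the error messages in this thread (an assumption,
    # not an authoritative list of what SHOGUN needs).
    aligners = ["bowtie2", "burst15"]
    missing = find_missing_executables(aligners)
    for name in aligners:
        location = "NOT FOUND on PATH" if name in missing else shutil.which(name)
        print(f"{name}: {location}")
```

If an aligner is reported as not found inside the SLURM job but is found in an interactive shell, the job script is probably not activating the environment (or loading the module) that provides it.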

rahel31 commented 2 years ago

Thank you! Indeed, bowtie2 was not correctly loaded! I now have my taxa table for the mock community. The mock community contains 8 bacterial species, but shogun identifies more than 300 taxa, and the community member Pseudomonas aeruginosa is not even present (other Pseudomonas species are present at very low numbers). I was hoping it would perform better than Kraken2 / Bracken, as it should be adapted to shallow data (sequencing depth: aiming for 2 million reads per sample). Any tips on how to reduce the number of false positives? Some are contamination, but not all, judging by comparison with the negative controls. Thank you in advance!
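For reference, one generic way to approach the false-positive question (not something the SHOGUN authors recommend in this thread) is to drop taxa below a minimum relative abundance and taxa that also appear in the negative controls. The sketch below assumes the taxa table has already been loaded into a plain dict of counts for one sample. The function name `filter_taxa`, the 0.1% default threshold, and the taxon names are all hypothetical choices for illustration.

```python
def filter_taxa(counts, min_rel_abundance=0.001, control_taxa=()):
    """Keep taxa at or above a relative-abundance cutoff and absent from controls.

    counts: dict mapping taxon name -> read count for one sample.
    min_rel_abundance: cutoff as a fraction of total reads (assumed value).
    control_taxa: taxa observed in negative controls, removed outright
                  (a crude rule; real contaminants may need finer handling).
    """
    total = sum(counts.values())
    if total == 0:
        return {}
    blocked = set(control_taxa)
    return {
        taxon: n
        for taxon, n in counts.items()
        if taxon not in blocked and n / total >= min_rel_abundance
    }


if __name__ == "__main__":
    # Hypothetical counts, not data from this thread.
    sample = {"Taxon A": 5000, "Taxon B": 4000, "Taxon C": 3, "Taxon D": 997}
    kept = filter_taxa(sample, control_taxa=["Taxon D"])
    print(kept)  # {'Taxon A': 5000, 'Taxon B': 4000}
```

With a known mock community, the cutoff can be tuned by checking how many of the expected species survive at each threshold.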