bokulich-lab / q2-moshpit

MOdular SHotgun metagenome Pipelines with Integrated provenance Tracking
BSD 3-Clause "New" or "Revised" License

BUG: `evaluate-busco` fails with a TypeError #130

Closed misialq closed 9 months ago

misialq commented 9 months ago

**Describe the bug**
When I try to use the `evaluate-busco` action, I get a TypeError right after Prodigal runs on the first sample (see below).

**To Reproduce**
Steps to reproduce the behavior:

  1. Create the most recent moshpit environment: `mamba env create -n q2-shotgun-2023.11 --file https://raw.githubusercontent.com/qiime2/distributions/dev/2024.2/shotgun/passed/qiime2-shotgun-macos-latest-conda.yml`
  2. Activate: `conda activate q2-shotgun-2023.11`
  3. Download the input MAGs from here
  4. Run the action: `qiime moshpit evaluate-busco --i-bins ./mags.qza --verbose --p-lineage-dataset bacteria_odb10 --p-cpu 6 --o-visualization ./mags.qzv`
  5. See error:

```
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on temporary files that no longer exist.

Command: busco --mode genome --lineage_dataset bacteria_odb10 --cpu 6 --contig_break 10 --evalue 0.001 --limit 3 --in /var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/qiime2/mziemski/data/77bc27a7-11b7-4470-9045-a52a1f929750/data/sample1 --out_path /var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/tmpebopc_t4/busco_output -o sample1

2024-02-02 09:56:31 INFO: Start a BUSCO v5.6.1 analysis, current time: 02/02/2024 09:56:31
2024-02-02 09:56:31 INFO: Configuring BUSCO with local environment
2024-02-02 09:56:31 INFO: Running genome mode
2024-02-02 09:56:31 INFO: Downloading information on latest versions of BUSCO data...
2024-02-02 09:56:34 INFO: Running in batch mode. 2 input files found in /var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/qiime2/mziemski/data/77bc27a7-11b7-4470-9045-a52a1f929750/data/sample1
2024-02-02 09:56:34 INFO: Input file is /var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/qiime2/mziemski/data/77bc27a7-11b7-4470-9045-a52a1f929750/data/sample1/ab23d75d-547d-455a-8b51-16b46ddf7496.fasta
2024-02-02 09:56:34 WARNING: Option evalue was provided but is not used in the selected run mode, prok_genome_prod
2024-02-02 09:56:34 WARNING: Option limit was provided but is not used in the selected run mode, prok_genome_prod
2024-02-02 09:56:37 INFO: Running BUSCO using lineage dataset bacteria_odb10 (prokaryota, 2024-01-08)
2024-02-02 09:56:37 INFO: Running 1 job(s) on bbtools, starting at 02/02/2024 09:56:37
2024-02-02 09:56:39 INFO: [bbtools] 1 of 1 task(s) completed
2024-02-02 09:56:39 INFO: Run Prodigal on input to predict and extract genes
2024-02-02 09:56:39 INFO: Running Prodigal with genetic code 11 in single mode
2024-02-02 09:56:39 INFO: Running 1 job(s) on prodigal, starting at 02/02/2024 09:56:39
2024-02-02 09:56:45 INFO: [prodigal] 1 of 1 task(s) completed
2024-02-02 09:56:45 INFO: Running Prodigal with genetic code 4 in single mode
2024-02-02 09:56:45 INFO: Running 1 job(s) on prodigal, starting at 02/02/2024 09:56:45
2024-02-02 09:56:52 INFO: [prodigal] 1 of 1 task(s) completed
2024-02-02 09:56:52 CRITICAL: Unhandled exception occurred:
Traceback (most recent call last):
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/busco/BuscoRunner.py", line 188, in run
    self.runner.run_analysis()
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/busco/BuscoRunner.py", line 564, in run_analysis
    self.analysis.run_analysis()
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/busco/analysis/GenomeAnalysis.py", line 113, in run_analysis
    self._run_prodigal()
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/busco/BuscoLogger.py", line 62, in wrapped_func
    self.retval = func(*args, **kwargs)
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/busco/analysis/GenomeAnalysis.py", line 147, in _run_prodigal
    self.prodigal_runner.run()
  File "...miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/busco/busco_tools/prodigal.py", line 187, in run
    self.get_gene_details()
  File "...miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/busco/busco_tools/prodigal.py", line 290, in get_gene_details
    self.record_gene_details(
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/busco/busco_tools/prodigal.py", line 317, in record_gene_details
    aa_seq = SeqRecord(Seq(details["aa_seq"]), id=gene_id, description="")
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/Bio/Seq.py", line 95, in __init__
    raise TypeError(
TypeError: The sequence data given to a Seq object should be a string (not another Seq object etc)

2024-02-02 09:56:52 ERROR: The sequence data given to a Seq object should be a string (not another Seq object etc)
2024-02-02 09:56:52 ERROR: BUSCO analysis failed!
2024-02-02 09:56:52 ERROR: Check the logs, read the user guide (https://busco.ezlab.org/busco_userguide.html), and check the BUSCO issue board on https://gitlab.com/ezlab/busco/issues

Traceback (most recent call last):
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/q2cli/commands.py", line 520, in __call__
    results = self._execute_action(
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/q2cli/commands.py", line 581, in _execute_action
    results = action(**arguments)
  File "", line 2, in evaluate_busco
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 342, in bound_callable
    outputs = self._callable_executor_(
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/qiime2/sdk/action.py", line 615, in _callable_executor_
    ret_val = self._callable(output_dir=temp_dir, **view_args)
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/q2_moshpit/busco/busco.py", line 74, in evaluate_busco
    path_to_run_summaries = q2_moshpit.busco.utils._run_busco(
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/q2_moshpit/busco/utils.py", line 258, in _run_busco
    run_command(cmd)
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/site-packages/q2_moshpit/_utils.py", line 31, in run_command
    subprocess.run(cmd, check=True, **kwargs)
  File ".../miniconda3/envs/q2-shotgun-2023.11/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['busco', '--mode', 'genome', '--lineage_dataset', 'bacteria_odb10', '--cpu', '6', '--contig_break', '10', '--evalue', '0.001', '--limit', '3', '--in', '/var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/qiime2/mziemski/data/77bc27a7-11b7-4470-9045-a52a1f929750/data/sample1', '--out_path', '/var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/tmpebopc_t4/busco_output', '-o', 'sample1']' returned non-zero exit status 1.

Plugin error from moshpit:

  Command '['busco', '--mode', 'genome', '--lineage_dataset', 'bacteria_odb10', '--cpu', '6', '--contig_break', '10', '--evalue', '0.001', '--limit', '3', '--in', '/var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/qiime2/mziemski/data/77bc27a7-11b7-4470-9045-a52a1f929750/data/sample1', '--out_path', '/var/folders/7f/7nw_x13n5q965rss_qz6061m0000gq/T/tmpebopc_t4/busco_output', '-o', 'sample1']' returned non-zero exit status 1.

See above for debug info.
```



**Expected behavior**
The action completes successfully.

**Please complete the following information:**
 - OS: macOS or Ubuntu (happens on both)
 - QIIME 2 version: 2023.11

Sann5 commented 9 months ago

busco==5.6.1 requires biopython>=1.79. The env described above installs busco==5.6.1 and biopython==1.78. Running `pip install biopython==1.79` or `pip install biopython --upgrade` (which installs version 1.83) solved it for me.

The busco package should constrain the biopython version, but it does not, because the BUSCO maintainers were not aware of this requirement. They have an open issue about it and said they will constrain the version in future releases.

As a side note, busco==5.5.0 still worked with biopython==1.78 and 1.76.
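
Until the constraint lands upstream, a quick check like the one below may help confirm whether an environment is affected before running `evaluate-busco`. It is only a minimal sketch: the helper names (`parse_major_minor`, `biopython_ok_for_busco_561`, `installed_version`) are made up for illustration, and the `>= 1.79` threshold is taken from the comment above, not from BUSCO's package metadata.

```python
from importlib import metadata  # standard library since Python 3.8

# Assumption from this thread: busco 5.6.1 needs biopython >= 1.79.
MIN_BIOPYTHON_FOR_BUSCO_561 = (1, 79)


def parse_major_minor(version):
    """Turn a version string such as '1.78' into a comparable tuple (1, 78)."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def biopython_ok_for_busco_561(biopython_version):
    """Return True if the given Biopython version meets the >= 1.79 requirement."""
    return parse_major_minor(biopython_version) >= MIN_BIOPYTHON_FOR_BUSCO_561


def installed_version(package):
    """Return the installed version of a package, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("biopython")
    if found is None:
        print("biopython is not installed in this environment")
    elif biopython_ok_for_busco_561(found):
        print(f"biopython {found} should work with busco 5.6.1")
    else:
        print(f"biopython {found} is too old for busco 5.6.1; upgrade to >= 1.79")
```

Run inside the activated conda environment, this reports the installed Biopython version and whether it meets the requirement. It only compares the major and minor version numbers, which is enough for the 1.78 vs. 1.79 boundary discussed here.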