rhysnewell / aviary

A hybrid assembly and MAG recovery pipeline (and more!)
GNU General Public License v3.0

Bio module not found in script #217

Open wwood opened 2 weeks ago

wwood commented 2 weeks ago
$ aviary recover -a assembly_checkpointing_megahit_output/megahit/assembly_output/final.contigs.fa --binning-only --skip-binners rosella vamb -c ../data/ERR34*
...
[Thu Oct 31 06:44:25 2024]
localrule finalise_stats:
    input: bins/checkm.out, bins/checkm2_output/quality_report.tsv
    output: bins/bin_info.tsv, bins/checkm_minimal.tsv
    jobid: 1
    reason: Missing output files: bins/bin_info.tsv
    resources: tmpdir=/data1/tmp

Traceback (most recent call last):
  File "/mnt/hpccs01/work/microbiome/msingle/mess/188_phosphate_addition_incubation/all_in_assembly/.snakemake/scripts/tmp3zrqvz86.finalise_stats.py", line 6, in <module>
    from Bio import SeqIO
ModuleNotFoundError: No module named 'Bio'

I'm confused by this: biopython is installed, and importing Bio from a Python REPL works fine. Any idea why this script isn't finding it?

(aviary-dev)cl5n006:20241031:~/m/msingle/mess/188_phosphate_addition_incubation/all_in_assembly$ python3
Python 3.11.0 | packaged by conda-forge | (main, Jan 14 2023, 12:27:40) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import Bio
>>> 
rhysnewell commented 2 weeks ago

Hmm, that's a frustrating issue. Could python be aliased to another command in your .bashrc? I assume you aren't running this on a cluster, but it may be similar to this situation: https://github.com/snakemake/snakemake/issues/883#issuecomment-839649778

If your conda environment and the environment snakemake is using match, then this shouldn't be happening, so there has to be some sort of path weirdness going on.
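
One way to check for that kind of mismatch is to print which interpreter is actually running and whether it can see the module. This is a minimal sketch, assuming the module of interest is Bio. The helper name describe_interpreter is made up for illustration and is not part of aviary or snakemake.

```python
import importlib.util
import shutil
import sys


def describe_interpreter(module_name="Bio"):
    """Report which Python is running and whether it can locate a module.

    Hypothetical diagnostic helper; not part of aviary or snakemake.
    """
    # find_spec returns None when a top-level module cannot be located
    spec = importlib.util.find_spec(module_name)
    return {
        # Interpreter executing this code
        "executable": sys.executable,
        # Root of the environment that interpreter belongs to
        "prefix": sys.prefix,
        # What a bare `python` / `python3` resolves to on PATH
        # (shell aliases are not visible here, only PATH entries)
        "python_on_path": shutil.which("python"),
        "python3_on_path": shutil.which("python3"),
        "module_found": spec is not None,
        "module_origin": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running this once from the activated environment and once from inside the failing rule (for example, by temporarily placing the same checks above the import in finalise_stats.py) should show whether the two executables or prefixes differ. If they do, the REPL and the script are using different Python installations, which would explain why one finds Bio and the other does not.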