phac-nml / mob-suite

MOB-suite: Software tools for clustering, reconstruction and typing of plasmids from draft assemblies
Apache License 2.0

MOB-Recon not generating reports - Error in #127

Closed andscoaafc closed 1 year ago

andscoaafc commented 1 year ago

Hi, we're getting an odd error in our Conda instance of MOB-Recon. It generates contig reports and MGE reports when no plasmid is present in the sequence, but it does not generate any reports when it does find plasmid sequences. When I look at the error output file, the issue seems to be "UnboundLocalError: local variable 'ETE3DBTAXAFILE' referenced before assignment". Any suggestions on how to fix this? There are a bunch of files in a _tmp folder for the sequence, but nothing in report form. The details are below:

2023-02-03 15:35:53,426 mob_suite.mob_recon INFO: MOB-recon version 3.1.2 [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py:980]
2023-02-03 15:35:53,427 mob_suite.mob_recon INFO: SUCCESS: Found program blastn at /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/bin/blastn [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:592]
2023-02-03 15:35:53,428 mob_suite.mob_recon INFO: SUCCESS: Found program makeblastdb at /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/bin/makeblastdb [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:592]
2023-02-03 15:35:53,429 mob_suite.mob_recon INFO: SUCCESS: Found program tblastn at /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/bin/tblastn [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:592]
2023-02-03 15:35:53,431 mob_suite.mob_recon INFO: Processing fasta file 2022_GTA_Isolate_Fasta/2022-GTA-0001.fasta [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1007]
2023-02-03 15:35:53,431 mob_suite.mob_recon INFO: Analysis directory 1_Mobsuite/2022-GTA-0001_mobsuite [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1008]
2023-02-03 15:36:03,366 mob_suite.mob_recon INFO: Writing cleaned header input fasta file from 2022_GTA_Isolate_Fasta/2022-GTA-0001.fasta to 1_Mobsuite/2022-GTA-0001_mobsuite/tmp/fixed.input.fasta [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1104]
2023-02-03 15:36:08,793 root INFO: Blasting replicon sequences /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/databases/rep.dna.fas against 1_Mobsuite/2022-GTA-0001_mobsuite/__tmp/fixed.input.fasta [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1150]
2023-02-03 15:36:11,715 root INFO: Filtering replicon blast results 1_Mobsuite/2022-GTA-0001_mobsuite/tmp/replicon_blast_results.txt [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1155]
2023-02-03 15:36:11,788 root INFO: Blasting relaxase sequences /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/databases/mob.proteins.faa against 1_Mobsuite/2022-GTA-0001_mobsuite/tmp/fixed.input.fasta [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1173]
2023-02-03 15:36:28,577 root INFO: Filtering relaxase blast results 1_Mobsuite/2022-GTA-0001_mobsuite/tmp/mob_blast_results.txt [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1178]
2023-02-03 15:36:28,597 root INFO: Blasting MPF sequences /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/databases/mpf.proteins.faa against 1_Mobsuite/2022-GTA-0001_mobsuite/tmp/fixed.input.fasta [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1197]
2023-02-03 15:36:41,825 root INFO: Filtering MPF blast results 1_Mobsuite/2022-GTA-0001_mobsuite/tmp/mpf_blast_results.txt [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1212]
2023-02-03 15:36:41,826 root INFO: Blasting orit sequences /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/databases/orit.fas against 1_Mobsuite/2022-GTA-0001_mobsuite/tmp/fixed.input.fasta [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1221]
2023-02-03 15:36:42,332 root INFO: Filtering orit blast results 1_Mobsuite/2022-GTA-0001_mobsuite/__tmp/orit_blast_results.txt [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1226]
2023-02-03 15:36:42,358 root INFO: Blasting contigs against repetitive sequences db: /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/databases/repetitive.dna.fas [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1243]
2023-02-03 15:36:43,630 root INFO: Filtering repetitive blast results 1_Mobsuite/2022-GTA-0001_mobsuite/tmp/repetitive_blast_results.txt [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1247]
2023-02-03 15:36:43,684 root INFO: Filtering contig: 85_b6a5ac26dac888cf344d287aaddff3b1_circular=false due to repetitive sequence [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1262]
2023-02-03 15:36:43,684 root INFO: Filtering contig: 111_7f43f8b4eed4dd4dbde9be7fbc7573ae_circular=false due to repetitive sequence [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1262]
2023-02-03 15:36:43,685 root INFO: Filtering contig: 100_79d2b5172f3e3ddaac2480adfffa1aa8_circular=false due to repetitive sequence [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1262]
2023-02-03 15:36:43,685 root INFO: Filtering contig: 139_5c4d1111916c11cacd2e86ca710c239f_circular=false due to repetitive sequence [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1262]
2023-02-03 15:36:43,685 root INFO: Filtering contig: 85_b6a5ac26dac888cf344d287aaddff3b1_circular=false due to repetitive sequence [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1262]
2023-02-03 15:36:43,685 root INFO: Filtering contig: 139_5c4d1111916c11cacd2e86ca710c239f_circular=false due to repetitive sequence [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/utils.py:1262]
2023-02-03 15:36:43,687 root INFO: Blasting contigs against reference sequence db: /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/databases/ncbi_plasmid_full_seqs.fas [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1261]
2023-02-03 15:36:57,316 root INFO: Filtering contig blast results: 1_Mobsuite/2022-GTA-0001_mobsuite/__tmp/contig_blast_results.txt [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1266]
2023-02-03 15:36:58,365 root INFO: Assigning contigs to plasmid groups [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1281]
2023-02-03 15:37:14,138 root INFO: Writting contig results to 1_Mobsuite/2022-GTA-0001_mobsuite/contig_report.txt [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1383]
Traceback (most recent call last):
  File "/home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/bin/mob_recon", line 10, in <module>
    sys.exit(main())
  File "/home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py", line 1396, in main
    build_mobtyper_report(contig_memberships['plasmid'], out_dir, mobtyper_report,contig_seqs, ncbi, lit,ETE3DBTAXAFILE)
UnboundLocalError: local variable 'ETE3DBTAXAFILE' referenced before assignment

kbessonov1984 commented 1 year ago

Hello, there seems to be an issue with the ete3 library, which is responsible for providing the taxonomy information and host range predictions for the plasmid. Typically, after mob-suite is installed, the databases need to be initialized. As part of that step, the ETE3 module downloads the NCBI taxonomy database and initializes it.

ETE3DBTAXAFILE is a global variable that should point to the taxa.sqlite file located inside the installation directory, which in your case should be at /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/databases/.

If you do not see a taxa.sqlite in that directory, then try to initialize mob-suite again by running mob_init inside that conda environment.
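As a rough way to automate the check described above, the following sketch tests whether a non-empty taxa.sqlite exists in a databases directory. The helper name `find_taxa_db` is hypothetical and not part of mob-suite; only the file name and the directory come from this thread.

```python
from pathlib import Path


def find_taxa_db(databases_dir):
    """Return the path to taxa.sqlite if it exists and is non-empty, else None.

    Hypothetical helper for illustration; it is not part of mob-suite.
    """
    candidate = Path(databases_dir) / "taxa.sqlite"
    if candidate.is_file() and candidate.stat().st_size > 0:
        return candidate
    return None


if __name__ == "__main__":
    # Directory taken from the log output in this thread; adjust for your install.
    db_dir = (
        "/home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/"
        "lib/python3.8/site-packages/mob_suite/databases"
    )
    found = find_taxa_db(db_dir)
    if found:
        print(f"Found {found}")
    else:
        print("taxa.sqlite not found; try running mob_init again")
```

A missing file suggests that mob_init has not completed. A present file does not rule out other causes of the error.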

The code that initializes the ete3 library resulting in generation of the taxa.sqlite in the databases directory is located here.

Also, mob-suite lets you point to a databases directory taken from any other installation or from a docker/singularity image (https://quay.io/repository/biocontainers/mob_suite?tab=tags). This is very useful if you want to carry a custom database over to another install. All you need to do is copy the databases directory from a given install and then use the --database_directory parameter.
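A minimal sketch of such a call, assembled in Python so the arguments are easy to inspect before running. Only --database_directory is named in this thread. The --infile and --outdir flags are assumed from mob_recon's usual usage, the helper name is hypothetical, and the copied database path is a placeholder.

```python
import shlex


def build_mob_recon_command(fasta, out_dir, database_dir):
    """Build a mob_recon argument list that uses a copied databases directory.

    Hypothetical helper. --infile and --outdir are assumptions; confirm them
    with `mob_recon --help` for your installed version.
    """
    return [
        "mob_recon",
        "--infile", fasta,
        "--outdir", out_dir,
        "--database_directory", database_dir,
    ]


if __name__ == "__main__":
    cmd = build_mob_recon_command(
        "2022_GTA_Isolate_Fasta/2022-GTA-0001.fasta",
        "1_Mobsuite/2022-GTA-0001_mobsuite",
        "/path/to/copied/databases",  # placeholder for the copied directory
    )
    # Print the command for review; pass cmd to subprocess.run to execute it.
    print(shlex.join(cmd))
```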

Hope this helps

andscoaafc commented 1 year ago

Hi Kirill,

Thanks for the reply! I did run mob_init, and it did its thing successfully. It did print a notification...

2023-02-06 10:04:08,458 mob_suite.utils INFO: Removed residual taxdump.tar.gz as ete3 is not doing proper cleaning job. [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_init.py:243]
2023-02-06 10:04:08,470 mob_suite.utils INFO: MOB init completed successfully [in /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_init.py:257]

I ran mob_recon again on a sequence and still got the ETE3DBTAXAFILE error, with no reports generated.

I'll try calling the database directly in mob_recon to see if that fixes it...

Andrew

kbessonov1984 commented 1 year ago

I looked at the code. Please remove the indentation on line L1032: https://github.com/phac-nml/mob-suite/blob/597c0096f8cde1aea960190d697da10a08d53828/mob_suite/mob_recon.py#L1032

This way the ETE3DBTAXAFILE constant will always be initialized. There seems to be an indentation issue. Thank you.

Just edit your /home/AAFC-AAC/scotta/miniconda3/envs/mob-suite/lib/python3.8/site-packages/mob_suite/mob_recon.py file.
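To illustrate why an indentation slip can produce this error, here is a small, self-contained reproduction of the general pattern. It is not the actual mob_recon.py code: the function names, the condition, and the paths are all hypothetical. When an assignment sits inside one branch of an `if`, the variable is never bound on the other branch, and Python raises UnboundLocalError at the first use.

```python
def taxa_path_buggy(use_default_db):
    # Hypothetical stand-in for the bug pattern, not mob-suite's real code.
    if use_default_db:
        database_dir = "/default/databases"
        # Indented too far: only assigned when the branch above is taken.
        taxa_file = database_dir + "/taxa.sqlite"
    else:
        database_dir = "/custom/databases"
    return taxa_file  # raises UnboundLocalError when use_default_db is False


def taxa_path_fixed(use_default_db):
    if use_default_db:
        database_dir = "/default/databases"
    else:
        database_dir = "/custom/databases"
    # Dedented: the assignment now runs on every code path.
    taxa_file = database_dir + "/taxa.sqlite"
    return taxa_file


if __name__ == "__main__":
    print(taxa_path_fixed(False))
    try:
        taxa_path_buggy(False)
    except UnboundLocalError as err:
        print(f"UnboundLocalError: {err}")
```

This would also explain why only some runs fail: the error appears only on the path where the assignment is skipped and the variable is later used.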

andscoaafc commented 1 year ago

Kirill,

That did it. It's working properly now, no issues. Thanks a lot!

Andrew

kbessonov1984 commented 1 year ago

Thank you for the update. I will open this issue so that we do not forget to fix it in the next release.

jrober84 commented 1 year ago

Sorry about that problem. The indentation issue has been resolved, and v3.1.3 includes the fix.