Open jvfe opened 1 year ago
Hello, MOB-Suite tools do not support compressed inputs at the moment. mob_recon fails because it expects a plain-text FASTA file and receives a gzip-compressed file instead. I know that gzip-compressed genomes take significantly less space, and supporting compressed inputs would be a convenient feature, but it is low priority for us. Let's keep this issue open as a reminder for us and as a feature request.
For now, please decompress inputs before running MOB-Suite tools. If space is a limitation, you can temporarily decompress the inputs, run MOB-Suite tools, and then delete the decompressed files. You can write a simple bash or python script, or implement it as a Nextflow pipeline.
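The decompress, run, delete workaround could be sketched in Python roughly as follows. This is a minimal sketch, not part of MOB-Suite: `run_on_decompressed` and `build_command` are hypothetical names, and the `--infile` and `--outdir` options in the usage part are assumed from typical mob_recon invocations, so check them against `mob_recon --help` for your version.

```python
import gzip
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_on_decompressed(gz_path, build_command, runner=subprocess.run):
    """Decompress gz_path to a temporary file, run a command on it, delete the copy.

    build_command is a caller-supplied function (hypothetical helper) that takes
    the path of the decompressed FASTA and returns the argument list to execute.
    """
    gz_path = Path(gz_path)
    # "sample.contigs.fa.gz" -> "sample.contigs.fa"
    name = gz_path.name
    plain_name = name[:-3] if name.endswith(".gz") else name
    # The temporary directory and the decompressed copy inside it are removed
    # when the with-block exits, even if the command fails.
    with tempfile.TemporaryDirectory() as tmp_dir:
        plain_path = Path(tmp_dir) / plain_name
        with gzip.open(gz_path, "rb") as src, open(plain_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        # check=True raises CalledProcessError if the tool exits non-zero.
        return runner(build_command(plain_path), check=True)


if __name__ == "__main__":
    # Usage: python run_mob_recon_gz.py sample1.fa.gz sample2.fa.gz ...
    for gz in sys.argv[1:]:
        sample = Path(gz).name.split(".")[0]
        # Assumed mob_recon options; verify with `mob_recon --help`.
        run_on_decompressed(
            gz,
            lambda fasta: ["mob_recon", "--infile", str(fasta),
                           "--outdir", f"{sample}_mob_recon"],
        )
```

Only one decompressed assembly exists on disk at a time, so the extra space needed stays small even for thousands of inputs.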
Hi,
I'm using mob_recon (v3.1.7) on some assemblies and I've noticed that it fails when using a gzip-compressed file and succeeds when using the same file, but decompressed. It looks to be some error related to utf-8 encoding.
Is this expected and is there any way to circumvent this other than decompressing my assemblies? I have over 8000 assemblies so I'm hoping to avoid having to decompress all of them.
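A UTF-8 error is what one would expect if a gzip file is opened as plain text: gzip files start with the bytes `0x1f 0x8b`, and `0x8b` is not a valid first byte of a UTF-8 character. The traceback below is truncated, so this is a plausible explanation rather than a confirmed one. The sketch below shows how a reader could detect gzip input from those two bytes; `is_gzipped` and `read_fasta_text` are hypothetical helpers, not MOB-Suite functions.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip file


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as fh:
        return fh.read(2) == GZIP_MAGIC


def read_fasta_text(path):
    """Read a FASTA file as text, transparently handling gzip compression."""
    opener = gzip.open if is_gzipped(path) else open
    with opener(path, "rt", encoding="utf-8") as fh:
        return fh.read()
```

Checking the magic bytes rather than the `.gz` extension also handles compressed files that have been renamed.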
Command used
Error log
```
2023-11-09 16:27:35,689 mob_suite.mob_recon INFO: MOB-recon version 3.1.7 [in /home/jvfe/miniconda3/envs/mobsuite/lib/python3.8/site-packages/mob_suite/mob_recon.py:981]
2023-11-09 16:27:35,689 mob_suite.mob_recon DEBUG: Debug log reporting set on successfully [in /home/jvfe/miniconda3/envs/mobsuite/lib/python3.8/site-packages/mob_suite/mob_recon.py:982]
2023-11-09 16:27:35,689 mob_suite.mob_recon INFO: SUCCESS: Found program blastn at /home/jvfe/miniconda3/envs/mobsuite/bin/blastn [in /home/jvfe/miniconda3/envs/mobsuite/lib/python3.8/site-packages/mob_suite/utils.py:592]
2023-11-09 16:27:35,689 mob_suite.mob_recon INFO: SUCCESS: Found program makeblastdb at /home/jvfe/miniconda3/envs/mobsuite/bin/makeblastdb [in /home/jvfe/miniconda3/envs/mobsuite/lib/python3.8/site-packages/mob_suite/utils.py:592]
2023-11-09 16:27:35,689 mob_suite.mob_recon INFO: SUCCESS: Found program tblastn at /home/jvfe/miniconda3/envs/mobsuite/bin/tblastn [in /home/jvfe/miniconda3/envs/mobsuite/lib/python3.8/site-packages/mob_suite/utils.py:592]
2023-11-09 16:27:35,689 mob_suite.mob_recon INFO: Processing fasta file SAMD00000756.contigs.fa.gz [in /home/jvfe/miniconda3/envs/mobsuite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1008]
2023-11-09 16:27:35,689 mob_suite.mob_recon INFO: Analysis directory SAMD00000756_mob_recon [in /home/jvfe/miniconda3/envs/mobsuite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1009]
2023-11-09 16:27:40,596 mob_suite.mob_recon INFO: Writing cleaned header input fasta file from SAMD00000756.contigs.fa.gz to SAMD00000756_mob_recon/__tmp/fixed.input.fasta [in /home/jvfe/miniconda3/envs/mobsuite/lib/python3.8/site-packages/mob_suite/mob_recon.py:1104]
Traceback (most recent call last):
  File "/home/jvfe/miniconda3/envs/mobsuite/bin/mob_recon", line 10, in
```
Running `gunzip SAMD00000756.contigs.fa.gz` and then re-running the command above works as expected.

I've attached the assembly below.
SAMD00000756.contigs.fa.gz