RemiAllio / MitoFinder

MitoFinder: efficient automated large-scale extraction of mitogenomic data from high throughput sequencing data

Absolute path to the scaffold file is not accepted (IOError: [Errno 2] No such file or directory) #34

Closed crcardenas closed 2 years ago

crcardenas commented 2 years ago

I am having an issue testing MitoFinder before running it on all of my samples. I can only run the program if my scaffolds file is in the directory from which I call the program. Copying the scaffolds.fasta file into my working directory, or even the directory above, works, but passing the absolute path to the original file does not. If I use the absolute path I get this error:

#!/bin/bash/
source /local/anaconda3/bin/activate
conda activate singularity
source ~/.bashrc

cd ~/mito/test2

mitofinder_v1.4.1.sif -j Acilius_canaliculatus_SRR12339052/ \
-a /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta \
-r ../reference.gb \
-m 25 -p 4 -o 5 \
--min-contig-size 500

Traceback (most recent call last):
  File "/opt/MitoFinder/mitofinder", line 168, in <module>
    logfile=open(Logfile,"w")
IOError: [Errno 2] No such file or directory: '/data/crcardenas/mito/test2/Acilius_canaliculatus_SRR12339052/_MitoFinder.log'

If I run $ cp /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta ./ and then run the code again with -a ./scaffolds.fasta, it works fine.

It does not seem that the program can accept the absolute path.

crcardenas commented 2 years ago

I have found a different issue; please ignore.

crcardenas commented 2 years ago

I really do apologize; I had described the behavior correctly after all. I had only mislabeled my -j option.

As described before:

Command line: /opt/MitoFinder/mitofinder -j Acilius_canaliculatus_SRR12339052 -a /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta -r ../reference.gb -m 25 -p 4 -o 5 --min-contig-size 500

Now running MitoFinder ...

Start time : 2022-06-03 21:04:29

Job name = Acilius_canaliculatus_SRR12339052

ERROR: /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta does not exist

but changing my -a option solves the issue.

command line: /opt/MitoFinder/mitofinder -j Acilius_canaliculatus_SRR12339052 -a ./scaffolds.fasta -r ../reference.gb -m 25 -p 4 -o 5 --min-contig-size 500

Now running MitoFinder ...

Start time : 2022-06-03 21:06:40

Job name = Acilius_canaliculatus_SRR12339052

Creating Output directory : /home/cody/mito/test2/Acilius_canaliculatus_SRR12339052
All results will be written here

RemiAllio commented 2 years ago

Hi,

Thank you for contacting me. So, everything is fine now?

Sorry if the documentation wasn't clear...

Best, Rémi

crcardenas commented 2 years ago

Not really. I'm not sure why I can't call the scaffolds.fasta file from a different directory. I would expect the absolute path to the scaffolds.fasta file to work, rather than having to copy scaffolds.fasta into the directory where MitoFinder is run. I don't want to bloat my storage space with duplicate files.

Just to reiterate; the following script does not work:

#!/bin/bash/
source /local/anaconda3/bin/activate
conda activate singularity
source ~/.bashrc

cd ~/mito/test2

mitofinder_v1.4.1.sif -j Acilius_canaliculatus_SRR12339052 \
-a /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta \
-r ../reference.gb \
-m 25 -p 4 -o 5 \
--min-contig-size 500

I get the following error

... ERROR: /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta does not exist

but this script does work:

#!/bin/bash/
source /local/anaconda3/bin/activate
conda activate singularity
source ~/.bashrc

cd ~/mito/test2
cp /data/work/Calosoma_phylo/vasil2021/spades/assembled/Acilius_canaliculatus_SRR12339052/scaffolds.fasta ./

mitofinder_v1.4.1.sif -j Acilius_canaliculatus_SRR12339052 \
-a scaffolds.fasta \
-r ../reference.gb \
-m 25 -p 4 -o 5 \
--min-contig-size 500

To try and save space on my shared cluster I have tried creating a symlink pointing to the fasta file. However that still fails.

Thank you for your patience with my multiple posts! Cody

crcardenas commented 2 years ago

Hello again,

I tried another test, closer to the directory where my assemblies live, and found another, possibly related, issue.

#!/bin/bash/
source /local/anaconda3/bin/activate
conda activate singularity
source ~/.bashrc

cd /data/work/Calosoma_phylo/mito

mitofinder_v1.4.1.sif -j Rhantus_suturellis_SRR10334072 \
-a /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta \
-r reference.gb \
-m 25 -p 2 -o 5 \
--min-contig-size 500

however this time the error is this:

Command line: /opt/MitoFinder/mitofinder -j Rhantus_suturellis_SRR10334072 -a /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta -r reference.gb -m 25 -p 2 -o 5 --min-contig-size 500

Now running MitoFinder ...

Start time : 2022-06-03 23:29:00

Job name = Rhantus_suturellis_SRR10334072

ERROR: /home/crcardenas/reference.gb does not exist

This is of course to be expected, since I do not have the file in that directory. As a first attempt at correcting this, I tried passing the absolute path of the reference.gb file, like before.

#!/bin/bash/
source /local/anaconda3/bin/activate
conda activate singularity
source ~/.bashrc

cd /data/work/Calosoma_phylo/mito

mitofinder_v1.4.1.sif -j Rhantus_suturellis_SRR10334072 \
-a /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta \
-r /data/work/Calosoma_phylo/mito/reference.gb \
-m 25 -p 2 -o 5 \
--min-contig-size 500

yet I get the same error in a slightly different way.

Command line: /opt/MitoFinder/mitofinder -j Rhantus_suturellis_SRR10334072 -a /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta -r /data/work/Calosoma_phylo/mito/reference.gb -m 25 -p 2 -o 5 --min-contig-size 500

Now running MitoFinder ...

Start time : 2022-06-03 23:37:00

Job name = Rhantus_suturellis_SRR10334072

ERROR: /data/work/Calosoma_phylo/mito/reference.gb does not exist

But the file is there:

$ pwd && ls

/data/work/Calosoma_phylo/mito
reference.gb test-run.sh

I'm not sure if this is an issue with the bashrc file and/or because the mitofinder_v1.4.1.sif for singularity lives in a different part of my cluster (my user home, which has limited capacity, versus the data directory, which has a lot more). So I copied my *.sif file to this new directory and edited my bashrc to reflect this:

$ tail ~/.bashrc

# Path to MitoFinder

export PATH=$PATH:/data/work/Calosoma_phylo/mito

Just to be sure, I also copied my reference.gb file to /home/crcardenas/

However, I am now back to my original issue:

#!/bin/bash/
source /local/anaconda3/bin/activate
conda activate singularity
source ~/.bashrc

cd /data/work/Calosoma_phylo/mito

mitofinder_v1.4.1.sif -j Rhantus_suturellis_SRR10334072 \
-a /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta \
-r reference.gb \
-m 25 -p 2 -o 5 \
--min-contig-size 500

Command line: /opt/MitoFinder/mitofinder -j Rhantus_suturellis_SRR10334072 -a /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta -r reference.gb -m 25 -p 2 -o 5 --min-contig-size 500

Now running MitoFinder ...

Start time : 2022-06-03 23:56:08

Job name = Rhantus_suturellis_SRR10334072

ERROR: /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta does not exist

But, this file does exist:

$ ls -l /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/

total 58160
-rw-rw-r-- 1 crcardenas crcardenas 7596584 May 29 13:39 assembly_graph_after_simplification.gfa
...
-rw-rw-r-- 1 crcardenas crcardenas 7382530 May 29 13:48 scaffolds.fasta
-rw-rw-r-- 1 crcardenas crcardenas 1903993 May 29 13:39 scaffolds.paths
-rw-rw-r-- 1 crcardenas crcardenas 141239 May 29 13:48 spades.log
drwxrwxr-x 2 crcardenas crcardenas 4096 May 29 13:48 tmp

Any suggestions?

Thanks for your help, Cody

crcardenas commented 2 years ago

One last thing, for good measure I copied the scaffolds.fasta file like I had done before; but MitoFinder is still looking where it shouldn't be.

#!/bin/bash/
source /local/anaconda3/bin/activate
conda activate singularity
source ~/.bashrc

cd /data/work/Calosoma_phylo/mito

mitofinder_v1.4.1.sif -j Rhantus_suturellis_SRR10334072 \
-a ./scaffolds.fasta \
-r ./reference.gb \
-m 25 -p 2 -o 5 \
--min-contig-size 500 

Command line: /opt/MitoFinder/mitofinder -j Rhantus_suturellis_SRR10334072 -a scaffolds.fasta -r reference.gb -m 25 -p 2 -o 5 --min-contig-size 500

Now running MitoFinder ...

Start time : 2022-06-04 00:15:23

Job name = Rhantus_suturellis_SRR10334072

ERROR: /home/crcardenas/./scaffolds.fasta does not exist

RemiAllio commented 2 years ago

Hi!

Sorry for the late reply.

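The issue comes from the way singularity binds directories into its environment. By default, singularity binds only a few directories (as defined by the cluster administrator or, if no rule is set, at least the /home/user/ directory). Files outside those directories are not visible inside the container, which is why MitoFinder reports that they do not exist even though they are present on the host.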

To solve your issue, you need to "bind" your /data/ directory to the singularity environment (using -B option). To do so, you can run MitoFinder like this:

singularity run -B /data/work/Calosoma_phylo/:/data/work/Calosoma_phylo/ mitofinder_v1.4.1.sif -j Rhantus_suturellis_SRR10334072 -a /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta -r ./reference.gb -m 25 -p 2 -o 5 --min-contig-size 500
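To confirm that a bind took effect before launching a long run, it can help to ask the container itself whether it sees the file. The following is a minimal sketch, assuming bash; `visible_in_container` is a hypothetical helper name, not part of MitoFinder or singularity, and it relies only on `singularity exec` with the same -B option used above.

```shell
#!/bin/bash
# Hypothetical helper (not part of MitoFinder or singularity).
# Usage: visible_in_container IMAGE BIND_SPEC PATH_TO_CHECK
# Runs `test -e` inside the container with the given -B bind specification
# and reports whether the path exists from the container's point of view.
visible_in_container() {
    local image=$1 bind_spec=$2 path=$3
    if singularity exec -B "$bind_spec" "$image" test -e "$path"; then
        echo "visible in container: $path"
    else
        echo "NOT visible in container: $path"
        return 1
    fi
}

# Example call with the paths from this thread (commented out because it
# needs singularity and the real files):
# visible_in_container mitofinder_v1.4.1.sif \
#     /data/work/Calosoma_phylo/:/data/work/Calosoma_phylo/ \
#     /data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta
```

If the helper reports that the file is not visible while `ls` on the host shows it, the bind specification is the first thing to check.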

Let me know if it works!

Best, Rémi

N.B: Please see this page for more information about binding directories with singularity.
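When samples are spread over several directories, the -B argument can be derived from the input paths instead of being typed by hand. The sketch below is one possible way to do that, assuming bash; `bind_spec` is a hypothetical helper, not part of MitoFinder or singularity. It binds the parent directory of each file to the same path inside the container. It does not follow symlinks, so a link whose target sits outside the listed directories would still be invisible to the container.

```shell
#!/bin/bash
# Hypothetical helper (not part of MitoFinder or singularity).
# Prints a comma-separated "dir:dir" list covering the parent directory
# of every file given as an argument, without duplicates.
bind_spec() {
    local f dir spec=""
    for f in "$@"; do
        # Turn the parent directory into an absolute path (runs in a subshell,
        # so the caller's working directory is not changed).
        dir=$(cd "$(dirname "$f")" && pwd) || return 1
        case ",$spec," in
            *",$dir:$dir,"*) ;;                        # already listed
            *) spec="${spec:+$spec,}$dir:$dir" ;;      # append new entry
        esac
    done
    printf '%s\n' "$spec"
}

# Example with the files from this thread (commented out because it needs
# singularity and the real files):
# assembly=/data/work/Calosoma_phylo/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta
# reference=/data/work/Calosoma_phylo/mito/reference.gb
# singularity run -B "$(bind_spec "$assembly" "$reference")" mitofinder_v1.4.1.sif \
#     -j Rhantus_suturellis_SRR10334072 -a "$assembly" -r "$reference" \
#     -m 25 -p 2 -o 5 --min-contig-size 500
```

Binding one common parent directory, as in the command above, is simpler when all inputs share one; the helper is only useful when they do not.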

RemiAllio commented 2 years ago

P.S: if you are not working from a directory bound by singularity, you may need to bind it as well. It should be something like this:

singularity run -B ~/mito/:~/mito/,/data/work/Calosoma_phylo/:/data/work/Calosoma_phylo/ mitofinder_v1.4.1.sif
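If the same directories are needed for every run, the binds can also be set once per session rather than repeated on each command line. This is a sketch under the assumption that the installed singularity version reads the SINGULARITY_BINDPATH environment variable, as its documentation describes; the variable names `work_dir` and `data_dir` are chosen here for illustration, and the paths are the ones used in this thread.

```shell
#!/bin/bash
# Assumption: singularity picks up extra bind paths from SINGULARITY_BINDPATH
# (comma-separated, same src:dest syntax as the -B option).
work_dir="$HOME/mito"
data_dir="/data/work/Calosoma_phylo"

# "$HOME" is used instead of "~" because bash does not expand a tilde that
# follows a colon or comma inside an ordinary command argument.
export SINGULARITY_BINDPATH="$work_dir:$work_dir,$data_dir:$data_dir"
echo "SINGULARITY_BINDPATH=$SINGULARITY_BINDPATH"

# Later calls in the same shell would then not need -B, for example:
# mitofinder_v1.4.1.sif -j Rhantus_suturellis_SRR10334072 \
#     -a "$data_dir/baca21_UCE/spades/assembled/Rhantus_suturellis_SRR10334072/scaffolds.fasta" \
#     -r "$data_dir/mito/reference.gb" -m 25 -p 2 -o 5 --min-contig-size 500
```

Placing the export in the job script rather than in ~/.bashrc keeps the bind list next to the commands that depend on it.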

crcardenas commented 2 years ago

Rémi,

No worries about the delay. This was the solution I needed; I should have read through the singularity documentation.

Thanks for your help and time, I really appreciate it! Cody

RemiAllio commented 2 years ago

Hi,

I am happy to know it's working for you now! 👍 Please let me know if you have any other problems.

Best, Rémi