Closed junaruga closed 2 years ago
Note that the generated_asm.bp.r_utg.fa (this file) was created from the asm.bp.r_utg.gfa (output of hifiasm) by the following steps.
Here is the hifiasm version used. It may be the latest version.
$ docker run --rm -t quay.io/biocontainers/hifiasm:0.16.1--h5b5514e_1 hifiasm --version
0.16.1-r375
$ awk '/^S/{print ">"$2; print $3}' tmp/hifiasm/asm.bp.r_utg.gfa > tmp/hifiasm/generated_asm.bp.r_utg.fa
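For readers less familiar with awk, the one-liner above can be sketched in Python. This is only an illustration: the function name `gfa_segments_to_fasta` and the sample unitig names are made up here, and, unlike awk, the sketch skips S lines that have fewer than three fields.

```python
def gfa_segments_to_fasta(gfa_text: str) -> str:
    """Turn GFA segment (S) lines into FASTA records.

    Mirrors: awk '/^S/{print ">"$2; print $3}'
    Field 2 is the segment name and field 3 is the sequence.
    """
    records = []
    for line in gfa_text.splitlines():
        if not line.startswith("S"):
            continue  # header (H), link (L) and other lines are ignored
        fields = line.split()
        if len(fields) < 3:
            continue  # assumption: skip malformed segment lines
        name, sequence = fields[1], fields[2]
        records.append(f">{name}\n{sequence}\n")
    return "".join(records)


if __name__ == "__main__":
    # Hypothetical two-segment GFA, only for demonstration.
    sample = (
        "H\tVN:Z:1.0\n"
        "S\tutg000001l\tACGT\tLN:i:4\n"
        "L\tutg000001l\t+\tutg000002l\t-\t0M\n"
        "S\tutg000002l\tGGCC\n"
    )
    print(gfa_segments_to_fasta(sample), end="")
```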
Hi @junaruga You might try typing the full path to your current folder instead of $(pwd). Let me know if it works for you.
I tried the full path instead of $(pwd), but the result is the same. Here is the log.
$ pwd
/home/jaruga/tmp/mitohifi
$ ls -hl data
total 3.7M
-rw-r--r--. 1 jaruga jaruga 3.7M Oct 10 14:02 generated_asm.bp.r_utg.fa
-rw-rw-rw-. 1 jaruga jaruga 17K Oct 11 18:15 ON980565.1.fasta
-rw-rw-rw-. 1 jaruga jaruga 35K Oct 11 18:15 ON980565.1.gb
$ docker run --rm -w /data/ -v /home/jaruga/tmp/mitohifi/data/:/data/ -t docker.io/biocontainers/mitohifi:2.2_cv1 \
mitohifi.py -r /data/generated_asm.bp.r_utg.fa -f /data/ON980565.1.fasta -g /data/ON980565.1.gb -t 4 -o 2
2022-10-17 13:28:17 [INFO] Welcome to MitoHifi v2. Starting pipeline...
2022-10-17 13:28:17 [INFO] Length of related mitogenome is: 16574 bp
2022-10-17 13:28:17 [INFO] Number of genes on related mitogenome: 37
2022-10-17 13:28:17 [INFO] Running MitoHifi pipeline in reads mode...
2022-10-17 13:28:17 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome
2022-10-17 13:28:17 [INFO] minimap2 -t 4 --secondary=no -ax map-pb /data/ON980565.1.fasta /data/generated_asm.bp.r_utg.fa | samtools view -@ 4 -S -b -F4 -F 0x800 > reads.HiFiMapped.bam
2022-10-17 13:28:17 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS
2022-10-17 13:28:17 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format:
2022-10-17 13:28:17 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta
2022-10-17 13:28:17 [INFO] Total number of mapped reads: 0
2022-10-17 13:28:17 [INFO] 2.2 Then we filter reads that are larger than 16574 bp
2022-10-17 13:28:17 [INFO] Number of filtered reads: 0
2022-10-17 13:28:17 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads!
2022-10-17 13:28:17 [INFO] hifiasm --primary -t 4 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 139, in main
f1 = open("gbk.HiFiMapped.bam.filtered.assembled.p_ctg.gfa")
FileNotFoundError: [Errno 2] No such file or directory: 'gbk.HiFiMapped.bam.filtered.assembled.p_ctg.gfa'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 143, in main
An error may have occurred when assembling reads with HiFiasm.""")
SystemExit: No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s).
An error may have occurred when assembling reads with HiFiasm.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 377, in <module>
main()
File "/bin/MitoHiFi/mitohifi.py", line 145, in main
f1.close()
UnboundLocalError: local variable 'f1' referenced before assignment
I debugged the code a bit. In the container, mitohifi.py has 377 lines. I expected the code to match the v2.2 git tag. However, on the tag v2.2, the code has 377 lines. Anyway, for convenience, I use the mitohifi.py on v2.2 to explain.
$ docker run --rm -it docker.io/biocontainers/mitohifi:2.2_cv1 bash
root@393dbe4cedb7:/# wc -l /bin/MitoHiFi//mitohifi.py
377 /bin/MitoHiFi//mitohifi.py
The error happens on the
The file gbk.HiFiMapped.bam.filtered.assembled.p_ctg.gfa is expected to be created as output by the hifiasm command below, but it is not actually created.
hifiasm --primary -t 4 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta
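The three chained tracebacks in the log (FileNotFoundError, then SystemExit, then UnboundLocalError) are consistent with a try/except/finally block whose finally clause closes a file handle that was never assigned. The sketch below reproduces that shape and shows one way to guard it. It is inferred from the traceback only and is not the actual mitohifi.py source; the function names, the `MISSING` path and the `outcome_of` helper are hypothetical.

```python
import sys

# Hypothetical path that does not exist, used only for the demonstration.
MISSING = "definitely_missing_dir/assembled.p_ctg.gfa"


def read_gfa_unguarded(path):
    """Shape inferred from the traceback: finally touches an unbound name."""
    try:
        f1 = open(path)
        return f1.read()
    except FileNotFoundError:
        sys.exit(f"No {path} file.")
    finally:
        # If open() failed, f1 was never bound, so this line raises
        # UnboundLocalError and hides the SystemExit message above.
        f1.close()


def read_gfa_guarded(path):
    """A with-block closes the handle only when it was actually opened."""
    try:
        with open(path) as handle:
            return handle.read()
    except FileNotFoundError:
        sys.exit(f"No {path} file. The assembly step may have had no input reads.")


def outcome_of(func, path):
    """Return the name of the exception class that func(path) raises."""
    try:
        func(path)
    except BaseException as error:  # SystemExit is not an Exception subclass
        return type(error).__name__
    return "no error"


if __name__ == "__main__":
    print(outcome_of(read_gfa_unguarded, MISSING))
    print(outcome_of(read_gfa_guarded, MISSING))
```

With the guarded form, the user would see only the SystemExit message, which points more directly at the missing assembly output.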
The gbk.HiFiMapped.bam.filtered.fasta file used as input to the hifiasm command exists, but it is zero bytes.
$ ls -hl gbk.HiFiMapped.bam.filtered.fasta
-rw-r--r--. 1 root root 0 Oct 11 18:31 gbk.HiFiMapped.bam.filtered.fasta
The gbk.HiFiMapped.bam.filtered.fasta file is the output of the command below.
And gbk.HiFiMapped.bam.fasta, the input to that command, is also zero bytes.
$ ls -hl gbk.HiFiMapped.bam.fasta
-rw-r--r--. 1 root root 0 Oct 11 18:31 gbk.HiFiMapped.bam.fasta
$ ls -hl reads.HiFiMapped.bam
-rw-r--r--. 1 root root 212 Oct 11 18:31 reads.HiFiMapped.bam
So, the log shows that samtools fasta produced a zero-byte gbk.HiFiMapped.bam.fasta from the non-zero-byte reads.HiFiMapped.bam.
samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta
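A quick way to confirm that a FASTA file holds no records before handing it to an assembler is to count its header lines. This is a minimal sketch and not part of MitoHiFi; the function name `count_fasta_records` is made up for illustration.

```python
def count_fasta_records(lines):
    """Count FASTA records in an iterable of text lines.

    Every record starts with a header line beginning with '>'.
    """
    return sum(1 for line in lines if line.startswith(">"))


if __name__ == "__main__":
    import sys

    # Usage: python count_fasta.py gbk.HiFiMapped.bam.fasta
    with open(sys.argv[1]) as handle:
        total = count_fasta_records(handle)
    print(f"{sys.argv[1]}: {total} record(s)")
    if total == 0:
        print("No sequences found; an assembler would have nothing to work on.")
```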
And this is the mitohifi.py command line option -r.
mutually_exclusive_group.add_argument("-r", help= "-r: Pacbio Hifi Reads from your species", metavar='<reads>.fasta')
My concern is that the generated_asm.bp.r_utg.fa specified above is nuclear DNA, not mitochondrial DNA. Can this cause the error above?
exampleFiles data
I tested the following example command mentioned in this repository's README.md. It succeeded. However, the command doesn't use mitohifi.py's -r option, so it gives no clue for this issue.
$ docker run --rm -w / -v /home/jaruga/tmp/mitohifi/exampleFiles/:/exampleFiles/ -t docker.io/biocontainers/mitohifi:2.2_cv1 \
mitohifi.py -c exampleFiles/test.fa -f exampleFiles/NC_016067.1.fasta -g exampleFiles/NC_016067.1.gb -t 1 -o 5
And when using the -r option with the example data, I saw the same error as with my own data. Could you show me the
$ docker run --rm -w / -v /home/jaruga/tmp/mitohifi/exampleFiles/:/exampleFiles/ -t docker.io/biocontainers/mitohifi:2.2_cv1 \
mitohifi.py -r exampleFiles/test.fa -f exampleFiles/NC_016067.1.fasta -g exampleFiles/NC_016067.1.gb -t 1 -o 5
2022-10-17 14:21:57 [INFO] Welcome to MitoHifi v2. Starting pipeline...
2022-10-17 14:21:57 [INFO] Length of related mitogenome is: 15659 bp
2022-10-17 14:21:57 [INFO] Number of genes on related mitogenome: 37
2022-10-17 14:21:57 [INFO] Running MitoHifi pipeline in reads mode...
2022-10-17 14:21:57 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome
2022-10-17 14:21:57 [INFO] minimap2 -t 1 --secondary=no -ax map-pb exampleFiles/NC_016067.1.fasta exampleFiles/test.fa | samtools view -@ 1 -S -b -F4 -F 0x800 > reads.HiFiMapped.bam
2022-10-17 14:22:05 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS
2022-10-17 14:22:05 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format:
2022-10-17 14:22:05 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta
2022-10-17 14:22:06 [INFO] Total number of mapped reads: 3
2022-10-17 14:22:06 [INFO] 2.2 Then we filter reads that are larger than 15659 bp
2022-10-17 14:22:06 [INFO] Number of filtered reads: 0
2022-10-17 14:22:06 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads!
2022-10-17 14:22:06 [INFO] hifiasm --primary -t 1 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 139, in main
f1 = open("gbk.HiFiMapped.bam.filtered.assembled.p_ctg.gfa")
FileNotFoundError: [Errno 2] No such file or directory: 'gbk.HiFiMapped.bam.filtered.assembled.p_ctg.gfa'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 143, in main
An error may have occurred when assembling reads with HiFiasm.""")
SystemExit: No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s).
An error may have occurred when assembling reads with HiFiasm.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 377, in <module>
main()
File "/bin/MitoHiFi/mitohifi.py", line 145, in main
f1.close()
UnboundLocalError: local variable 'f1' referenced before assignment
$ ls -hl exampleFiles
total 39M
-rw-r--r--. 1 jaruga jaruga 1 Oct 11 17:17 .gitkeep
-rw-r--r--. 1 jaruga jaruga 15M Oct 11 17:17 ilDeiPorc1.reads.fa
-rw-r--r--. 1 jaruga jaruga 16K Oct 11 17:17 MW539688.1.fasta
-rw-r--r--. 1 jaruga jaruga 33K Oct 11 17:17 MW539688.1.gb
-rw-r--r--. 1 jaruga jaruga 16K Oct 11 17:17 NC_016067.1.fasta
-rw-r--r--. 1 jaruga jaruga 38K Oct 11 17:17 NC_016067.1.gb
-rw-r--r--. 1 jaruga jaruga 24M Oct 11 17:17 test.fa
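In the log above, step 2.1 reports 3 mapped sequences, yet step 2.2 reports 0 after filtering against the 15659 bp reference. One possible reading is that every mapped sequence was longer than the reference, which would be expected if the input holds contigs rather than reads. The sketch below illustrates such a length filter. It is an assumption about what step 2.2 does, based only on the log message, and the function name and the toy records are hypothetical.

```python
def keep_reads_up_to(records, max_length):
    """Keep (name, sequence) pairs whose sequence is not longer than max_length."""
    return [(name, seq) for name, seq in records if len(seq) <= max_length]


if __name__ == "__main__":
    reference_length = 15659  # length of the related mitogenome in the log
    # Toy records: one read-sized sequence and one contig-sized sequence.
    records = [
        ("read_like", "A" * 12000),
        ("contig_like", "A" * 250000),
    ]
    kept = keep_reads_up_to(records, reference_length)
    print([name for name, _ in kept])
```

If all inputs are contig-sized, nothing survives the filter and the downstream assembly step receives an empty file.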
I checked the BAM file above with samtools in my local environment.
$ ls -hl reads.HiFiMapped.bam
-rw-r--r--. 1 jaruga jaruga 212 Oct 11 18:31 reads.HiFiMapped.bam
$ samtools --version | head -1
samtools 1.13
$ samtools tview reads.HiFiMapped.bam
[E::idx_find_and_load] Could not retrieve index file for 'reads.HiFiMapped.bam'
samtools tview: cannot read index for "reads.HiFiMapped.bam"
$ samtools idxstats reads.HiFiMapped.bam
[E::idx_find_and_load] Could not retrieve index file for 'reads.HiFiMapped.bam'
samtools idxstats: fail to load index for "reads.HiFiMapped.bam", reverting to slow method
ON980565.1 16574 0 0
* 0 0 0
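The idxstats output above has four columns per line: reference name, sequence length, number of mapped read-segments, and number of unmapped read-segments. Summing the third column gives the total number of mapped reads, which is 0 here. A minimal sketch of that sum, with a made-up function name:

```python
def total_mapped(idxstats_text: str) -> int:
    """Sum the mapped-read column (third field) of samtools idxstats output."""
    total = 0
    for line in idxstats_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or unexpected lines
        total += int(fields[2])
    return total


if __name__ == "__main__":
    # The two lines reported in this thread.
    output = "ON980565.1 16574 0 0\n* 0 0 0\n"
    print(total_mapped(output))
```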
I prepared a reproducing script so that you can check this issue in your local environment. If you have Docker or another container tool, you can run it yourself. The instructions are in the README.md.
https://github.com/junaruga/report-mitohifi-no-such-file
@marcelauliano Do you have any comments? I would appreciate it.
Hi @junaruga
Thanks for checking.
MitoHiFi was designed to look for organelles, in particular the mitochondrion.
-r is the flag used to perform assembly from the raw reads, while -c is used to look for a mitogenome among the input contigs. The name generated_asm.bp.r_utg.fa suggests that it contains contigs, not reads.
Did you try running with the -c flag?
With the example files, the command lines should be as follows (see here):
assemble from contigs:
docker run --rm -w /data/ -v /local/path/to/exampleFiles/:/data/ -t docker.io/biocontainers/mitohifi:2.2_cv1 mitohifi.py -c /data/test.fa -f /data/NC_016067.1.fasta -g /data/NC_016067.1.gb -t 4 -o 2
assemble from raw reads:
docker run --rm -w /data/ -v /local/path/to/exampleFiles/:/data/ -t docker.io/biocontainers/mitohifi:2.2_cv1 mitohifi.py -r /data/ilDeiPorc1.reads.fa -f /data/MW539688.1.fasta -g /data/MW539688.1.gb -t 4 -o 2
Hope this helps.
@ksenia-krasheninnikova Thanks for your info and help.
I was not clear on which kind of FASTA file to use, a reads file or a contigs file. I wanted to run with a reads file, but I was using a contigs file. So, using this repository's example files, I was able to run mitohifi.py -r <reads file> ... successfully. The result is below. The full log is here.
$ docker run --rm -w /data/ -v /home/jaruga/tmp/mitohifi/exampleFiles/:/data/ -t docker.io/biocontainers/mitohifi:2.2_cv1 mitohifi.py -r /data/ilDeiPorc1.reads.fa -f /data/MW539688.1.fasta -g /data/MW539688.1.gb -t 4 -o 2
...
2022-10-18 12:35:21 [INFO] Pipeline finished!
2022-10-18 12:35:21 [INFO] Run time: 4500.39 seconds
$ echo $?
0
$ ls -lh exampleFiles
total 39M
drwxr-xr-x. 1 root root 72 Oct 18 14:35 contigs_circularization/
drwxr-xr-x. 1 root root 130 Oct 18 14:35 contigs_filtering/
-rw-r--r--. 1 root root 17K Oct 18 14:35 contigs_stats.tsv
drwxr-xr-x. 1 root root 352 Oct 18 14:35 final_mitogenome.annotation/
-rw-r--r--. 1 root root 1.4K Oct 18 14:35 final_mitogenome.annotation_MitoFinder.log
drwxr-xr-x. 1 root root 168 Oct 18 14:35 final_mitogenome_choice/
-rw-r--r--. 1 root root 15K Oct 18 14:35 final_mitogenome.fasta
-rw-r--r--. 1 root root 30K Oct 18 14:35 final_mitogenome.gb
-rw-r--r--. 1 jaruga jaruga 1 Oct 11 17:17 .gitkeep
-rw-r--r--. 1 jaruga jaruga 15M Oct 11 17:17 ilDeiPorc1.reads.fa
-rw-r--r--. 1 jaruga jaruga 16K Oct 11 17:17 MW539688.1.fasta
-rw-r--r--. 1 jaruga jaruga 33K Oct 11 17:17 MW539688.1.gb
-rw-r--r--. 1 jaruga jaruga 16K Oct 11 17:17 NC_016067.1.fasta
-rw-r--r--. 1 jaruga jaruga 38K Oct 11 17:17 NC_016067.1.gb
drwxr-xr-x. 1 root root 1.5K Oct 18 14:35 potential_contigs/
drwxr-xr-x. 1 root root 1.9K Oct 18 14:35 reads_mapping_and_assembly/
-rw-r--r--. 1 jaruga jaruga 24M Oct 11 17:17 test.fa
$ ls -lh exampleFiles/final_mitogenome.fasta
-rw-r--r--. 1 root root 15K Oct 18 14:35 exampleFiles/final_mitogenome.fasta
$ ls -lh exampleFiles/final_mitogenome.gb
-rw-r--r--. 1 root root 30K Oct 18 14:35 exampleFiles/final_mitogenome.gb
$ ls -lh exampleFiles/contigs_stats.tsv
-rw-r--r--. 1 root root 17K Oct 18 14:35 exampleFiles/contigs_stats.tsv
Right now I only have the FASTA file created from the nuclear DNA. So, when using the reads file, I still got the error above with the message "Total number of mapped reads: 0". I will look for a mitochondrial DNA FASTA file, and I will close this issue ticket. Thanks for your help.
$ ls -lh data
total 36M
-rw-r--r--. 1 jaruga jaruga 17K Oct 11 18:15 ON980565.1.fasta
-rw-r--r--. 1 jaruga jaruga 35K Oct 11 18:15 ON980565.1.gb
-rw-r--r--. 1 jaruga jaruga 36M Oct 18 12:22 test.hifi.fasta
$ docker run --rm -w /data -v /home/jaruga/tmp/mitohifi/data/:/data/ -t docker.io/biocontainers/mitohifi:2.2_cv1 \
mitohifi.py -r /data/test.hifi.fasta -f /data/ON980565.1.fasta -g /data/ON980565.1.gb -t 4 -o 2
2022-10-18 11:11:14 [INFO] Welcome to MitoHifi v2. Starting pipeline...
2022-10-18 11:11:14 [INFO] Length of related mitogenome is: 16574 bp
2022-10-18 11:11:14 [INFO] Number of genes on related mitogenome: 37
2022-10-18 11:11:14 [INFO] Running MitoHifi pipeline in reads mode...
2022-10-18 11:11:14 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome
2022-10-18 11:11:14 [INFO] minimap2 -t 4 --secondary=no -ax map-pb /data/ON980565.1.fasta /data/test.hifi.fasta | samtools view -@ 4 -S -b -F4 -F 0x800 > reads.HiFiMapped.bam
2022-10-18 11:11:14 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS
2022-10-18 11:11:14 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format:
2022-10-18 11:11:14 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta
2022-10-18 11:11:14 [INFO] Total number of mapped reads: 0
2022-10-18 11:11:14 [INFO] 2.2 Then we filter reads that are larger than 16574 bp
2022-10-18 11:11:14 [INFO] Number of filtered reads: 0
2022-10-18 11:11:14 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads!
2022-10-18 11:11:14 [INFO] hifiasm --primary -t 4 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 139, in main
f1 = open("gbk.HiFiMapped.bam.filtered.assembled.p_ctg.gfa")
FileNotFoundError: [Errno 2] No such file or directory: 'gbk.HiFiMapped.bam.filtered.assembled.p_ctg.gfa'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 143, in main
An error may have occurred when assembling reads with HiFiasm.""")
SystemExit: No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s).
An error may have occurred when assembling reads with HiFiasm.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/bin/MitoHiFi/mitohifi.py", line 377, in <module>
main()
File "/bin/MitoHiFi/mitohifi.py", line 145, in main
f1.close()
UnboundLocalError: local variable 'f1' referenced before assignment
Perhaps the issue is essentially the same as https://github.com/marcelauliano/MitoHiFi/issues/10 .
I am using MitoHiFi's latest docker container image.
First, I created the reference .fasta and .gb files with findMitoReference.py.
Then I copied a FASTA file that was originally created by hifiasm and that I modified. The generated_asm.bp.r_utg.fa file is here.
Here is MitoHiFi's version.
Then I executed mitohifi.py with the FASTA file above and the reference .fasta and .gb files as input files. I got the following error. Do you know what's wrong? You can reproduce it in your environment, as I shared my data. Thank you.
Here is the working directory's status after executing the mitohifi.py command above.