Open athenasyarifa opened 2 years ago
Hi Rifa, Seems like you don't have enough reads for hifiasm. 1-) Can you generate the statistics of your filtered reads? 2-) Can you do the same for the unfiltered? samtools fasta gbk.HiFiMapped.bam.fasta
Depending on these statistics, it's possible you could use the filtered reads as -c
Em qui., 11 de ago. de 2022 às 16:57, Athena Syarifa < @.***> escreveu:
Hi Marcela and fellow contributors, Thank you for developing such useful tool! I have been trying to assemble mitogenome of a bird species using raw reads with this command:
-r $reads/m64077_211118_103836.hifi_reads.fasta.gz $reads/m64204e_211125_011116.hifi_reads.fasta.gz \ -f $ref/KP137624.1.fasta \ -g $ref/KP137624.1.gb \ -t 16 -a animal
But then, I get the following error:
2022-08-11 16:51:06 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2022-08-11 16:51:06 [INFO] Length of related mitogenome is: 16776 bp 2022-08-11 16:51:06 [INFO] Number of genes on related mitogenome: 37 2022-08-11 16:51:06 [INFO] Running MitoHifi pipeline in reads mode... 2022-08-11 16:51:06 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome 2022-08-11 16:51:06 [INFO] minimap2 -t 16 --secondary=no -ax map-pb /dss/dsslegfs01/pr53da/pr53da-dss-0026/projects/2022de_novo_genome/assembly/mitochondria/KP137624.1.fasta /dss/dsslegfs01/pr53da/pr53da-dss-0026/projects/2022de_novo_genome/data/DNA_PacBio/m64204e_211125_011116.hifi_reads.fasta.gz | samtools view -@ 16 -S -b -F4 -F 0x800 > reads.HiFiMapped.bam 2022-08-11 16:54:53 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS 2022-08-11 16:54:53 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format: 2022-08-11 16:54:53 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta 2022-08-11 16:54:54 [INFO] Total number of mapped reads: 90 2022-08-11 16:54:54 [INFO] 2.2 Then we filter reads that are larger than 16776 bp 2022-08-11 16:54:54 [INFO] Number of filtered reads: 5 2022-08-11 16:54:54 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads! 2022-08-11 16:54:54 [INFO] hifiasm --primary -t 16 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta 2022-08-11 16:54:54 [INFO] 4. Let's run the blast of the contigs versus the close-related mitogenome 2022-08-11 16:54:54 [INFO] 4.1. Creating BLAST database: 2022-08-11 16:54:54 [INFO] makeblastdb -in /dss/dsslegfs01/pr53da/pr53da-dss-0026/projects/2022de_novo_genome/assembly/mitochondria/KP137624.1.fasta -dbtype nucl 2022-08-11 16:54:54 [INFO] Makeblastdb done. 2022-08-11 16:54:54 [INFO] 4.2. Running blast of contigs against close-related mitogenome: 2022-08-11 16:54:54 [INFO] blastn -query hifiasm.contigs.fasta -db /dss/dsslegfs01/pr53da/pr53da-dss-0026/projects/2022de_novo_genome/assembly/mitochondria/KP137624.1.fasta -num_threads 16 -out contigs.blastn -outfmt 6 std qlen slen Warning: [blastn] Query is Empty! 2022-08-11 16:54:54 [INFO] Blast done. 2022-08-11 16:54:54 [INFO] 5. Filtering BLAST output to select target sequences 2022-08-11 16:54:54 [INFO] Filtering thresholds applied: 2022-08-11 16:54:54 [INFO] Minimum query percentage = 50 2022-08-11 16:54:54 [INFO] Minimum query length = 80% subject length 2022-08-11 16:54:54 [INFO] Maximum query length = 5 times subject length Attention! 'parsed_blast.txt' and 'parsed_blast_all.txt' files are empty. The pipeline has stopped !! You need to run further scripts to check if you have mito reads pulled to a large NUMT!
I thought it was because of the same problem as issue #10 https://github.com/marcelauliano/MitoHiFi/issues/10 , but the command for hifiasm is correct. I have the filtered reads (although only 5) but then the contigs.blastn file and all of the hifiasm output files are empty (including hifiasm.contigs.fasta). I am not sure why this happened. Can someone help me?
This is how my directory looks:
-rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 contigs.blastn -rw-rw---- 1 ra35naf pr53da-dss-0026 1.8M Aug 11 16:54 gbk.HiFiMapped.bam.fasta -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.a_ctg.fa -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.a_ctg.gfa -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.a_ctg.lowQ.bed -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.a_ctg.noseq.gfa -rw-rw---- 1 ra35naf pr53da-dss-0026 28K Jul 24 13:47 gbk.HiFiMapped.bam.filtered.assembled.ec.bin -rw-rw---- 1 ra35naf pr53da-dss-0026 38 Jul 24 13:47 gbk.HiFiMapped.bam.filtered.assembled.ovlp.reverse.bin -rw-rw---- 1 ra35naf pr53da-dss-0026 122 Jul 24 13:47 gbk.HiFiMapped.bam.filtered.assembled.ovlp.source.bin -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.p_ctg.fa -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.p_ctg.gfa -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.p_ctg.lowQ.bed -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.p_ctg.noseq.gfa -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.p_utg.gfa -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.p_utg.lowQ.bed -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.p_utg.noseq.gfa -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.r_utg.gfa -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.r_utg.lowQ.bed -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 gbk.HiFiMapped.bam.filtered.assembled.r_utg.noseq.gfa -rw-rw---- 1 ra35naf pr53da-dss-0026 80K Aug 11 16:54 gbk.HiFiMapped.bam.filtered.fasta -rw-rw---- 1 ra35naf pr53da-dss-0026 0 Aug 11 16:54 hifiasm.contigs.fasta -rw-rw---- 1 ra35naf pr53da-dss-0026 989 Aug 11 16:54 hifiasm.log -rw-rw---- 1 ra35naf pr53da-dss-0026 906 Jul 24 13:43 mito.sh -rw-rw---- 1 ra35naf pr53da-dss-0026 44 Aug 11 16:54 parsed_blast_all.txt -rw-rw---- 1 ra35naf pr53da-dss-0026 44 Aug 11 16:54 parsed_blast.txt -rw-rw---- 1 ra35naf pr53da-dss-0026 541K Aug 11 16:54 reads.HiFiMapped.bam
Many thanks, and have a good day! Best, Rifa
— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/24, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5RFEIWD2XOHY3SOKYALVYUPFVANCNFSM56IWFE2Q . You are receiving this because you are subscribed to this thread.Message ID: @.***>
-- Marcela Uliano da Silva, PhD
Senior Bioinformatician - Wellcome Sanger Institute Darwin Tree of Life Project
Churchill College Cambridge By-Fellow
Cambridge, UK
Hi Marcela, Thanks for the prompt response. Yes, you are right. I only have 5 reads as an input to hifiasm because it seems that MitoHiFi filtered the reads that are longer than the reference mitogenome. Below is the statistics for the filtered reads:
A C G T N IUPAC Other GC GC_stdev
0.3033 0.2109 0.1986 0.2872 0.0000 0.0000 0.0000 0.4095 0.0427
Main genome scaffold total: 5
Main genome contig total: 5
Main genome scaffold sequence total: 0.080 MB
Main genome contig sequence total: 0.080 MB 0.000% gap
Main genome scaffold N/L50: 3/15.962 KB
Main genome contig N/L50: 3/15.962 KB
Main genome scaffold N/L90: 5/15.214 KB
Main genome contig N/L90: 5/15.214 KB
Max scaffold length: 16.507 KB
Max contig length: 16.507 KB
Number of scaffolds > 50 KB: 0
% main genome in scaffolds > 50 KB: 0.00%
Minimum Number Number Total Total Scaffold
Scaffold of of Scaffold Contig Contig
Length Scaffolds Contigs Length Length Coverage
-------- -------------- -------------- -------------- -------------- --------
All 5 5 79,540 79,540 100.00%
10 KB 5 5 79,540 79,540 100.00%
And below is the statistics for the unfiltered reads:
A C G T N IUPAC Other GC GC_stdev
0.2893 0.2068 0.2098 0.2941 0.0000 0.0000 0.0000 0.4166 0.0440
Main genome scaffold total: 90
Main genome contig total: 90
Main genome scaffold sequence total: 1.883 MB
Main genome contig sequence total: 1.883 MB 0.000% gap
Main genome scaffold N/L50: 41/21.041 KB
Main genome contig N/L50: 41/21.041 KB
Main genome scaffold N/L90: 79/17.708 KB
Main genome contig N/L90: 79/17.708 KB
Max scaffold length: 27.951 KB
Max contig length: 27.951 KB
Number of scaffolds > 50 KB: 0
% main genome in scaffolds > 50 KB: 0.00%
Minimum Number Number Total Total Scaffold
Scaffold of of Scaffold Contig Contig
Length Scaffolds Contigs Length Length Coverage
-------- -------------- -------------- -------------- -------------- --------
All 90 90 1,883,185 1,883,185 100.00%
10 KB 90 90 1,883,185 1,883,185 100.00%
25 KB 5 5 131,208 131,208 100.00%
Do you mean I can use the filtered reads as input for MitoHiFi contigs
mode using the flag -c
? Can you confirm?
Many thanks!
Cheers, Rifa
Is there a reason MitoHiFi filtered the reads that are more than 5 times the subject length? I also quickly tried the contigs
mode using my draft assembly yesterday and received the same error (that the 'parsed_blast.txt' and 'parsed_blast_all.txt' files are empty
). I think it is because MitoHiFi filtered all of the sequences after the blast step due to the length of my contigs/scaffolds which are much longer than the reference mitogenome. I am not sure if it is something I have done wrong or if MitoHiFi is not the appropriate tool for my case.
Contigs that are more than 5x larger will be a numt. I think your problem is your sequenced sample, not the tool =)
You can try using the filtered reads and contigs, yes. (-c) Let;s see if this 15kb read is your mito
Em sex., 12 de ago. de 2022 às 11:49, Athena Syarifa < @.***> escreveu:
Is there a reason MitoHiFi filtered the reads that are more than 5 times the subject length? I also quickly tried the contigs mode using my draft assembly yesterday and received the same error (that the 'parsed_blast.txt' and 'parsed_blast_all.txt' files are empty). I think it is because MitoHiFi filtered all of the sequences after the blast step due to the length of my contigs/scaffolds which are much longer than the reference mitogenome. I am not sure if it is something I have done wrong or if MitoHiFi is not the appropriate tool for my case.
— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/24#issuecomment-1212979157, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5RHRD6R6D32IZ7KG64DVYYTZ5ANCNFSM56IWFE2Q . You are receiving this because you commented.Message ID: @.***>
-- Marcela Uliano da Silva, PhD
Senior Bioinformatician - Wellcome Sanger Institute Darwin Tree of Life Project
Churchill College Cambridge By-Fellow
Cambridge, UK
Ah I see, of course. I tried using the filtered reads in contig mode and encountered another error (I think in the annotation step). I suspect there might be an error when running the MitoFinder?
2022-08-12 14:04:25 [INFO] Welcome to MitoHifi v2. Starting pipeline...
2022-08-12 14:04:26 [INFO] Length of related mitogenome is: 16776 bp
2022-08-12 14:04:26 [INFO] Number of genes on related mitogenome: 37
2022-08-12 14:04:26 [INFO] Running MitoHifi pipeline in contigs mode...
2022-08-12 14:04:26 [INFO] 1. Fixing potentially conflicting FASTA headers
2022-08-12 14:04:26 [INFO] 2. Let's run the blast of the contigs versus the close-related mitogenome
2022-08-12 14:04:26 [INFO] 2.1. Creating BLAST database:
2022-08-12 14:04:26 [INFO] makeblastdb -in /dss/dsslegfs01/pr53da/pr53da-dss-0026/projects/2022__de_novo_genome/assembly/mitochondria/KP137624.1.fasta -dbtype nucl
2022-08-12 14:04:26 [INFO] Makeblastdb done.
2022-08-12 14:04:26 [INFO] 2.2. Running blast of contigs against close-related mitogenome:
2022-08-12 14:04:26 [INFO] blastn -query gbk.HiFiMapped.bam.filtered.fasta -db /dss/dsslegfs01/pr53da/pr53da-dss-0026/projects/2022__de_novo_genome/assembly/mitochondria/KP137624.1.fasta -num_threads 16 -out contigs.blastn -outfmt 6 std qlen slen
2022-08-12 14:04:26 [INFO] Blast done.
2022-08-12 14:04:26 [INFO] 3. Filtering BLAST output to select target sequences
2022-08-12 14:04:26 [INFO] Filtering thresholds applied:
2022-08-12 14:04:26 [INFO] Minimum query percentage = 50
2022-08-12 14:04:26 [INFO] Minimum query length = 80% subject length
2022-08-12 14:04:26 [INFO] Maximum query length = 5 times subject length
2022-08-12 14:04:27 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file
2022-08-12 14:04:27 [INFO] 4. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s).
2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/163972531/ccs
2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/166725359/ccs
2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/24707997/ccs
2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/64488503/ccs
2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/75366945/ccs
Traceback (most recent call last):
File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/mitohifi.py", line 376, in <module>
main()
File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/mitohifi.py", line 262, in main
tRNA_ref = fetch.get_ref_tRNA()
File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/fetch.py", line 40, in get_ref_tRNA
reference_tRNA = max(tRNAs, key=tRNAs.get)
ValueError: max() arg is an empty sequence
Do you have any suggestions on what I should do? Thank you ^^
Ah! Try editing your reads ids not to have any | or / or \
On Fri, 12 Aug 2022 at 13:11, Athena Syarifa @.***> wrote:
Ah I see, of course. I tried using the filtered reads in contig mode and encountered another error (I think in the annotation step). I suspect there might be an error when running the MitoFinder?
2022-08-12 14:04:25 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2022-08-12 14:04:26 [INFO] Length of related mitogenome is: 16776 bp 2022-08-12 14:04:26 [INFO] Number of genes on related mitogenome: 37 2022-08-12 14:04:26 [INFO] Running MitoHifi pipeline in contigs mode... 2022-08-12 14:04:26 [INFO] 1. Fixing potentially conflicting FASTA headers 2022-08-12 14:04:26 [INFO] 2. Let's run the blast of the contigs versus the close-related mitogenome 2022-08-12 14:04:26 [INFO] 2.1. Creating BLAST database: 2022-08-12 14:04:26 [INFO] makeblastdb -in /dss/dsslegfs01/pr53da/pr53da-dss-0026/projects/2022de_novo_genome/assembly/mitochondria/KP137624.1.fasta -dbtype nucl 2022-08-12 14:04:26 [INFO] Makeblastdb done. 2022-08-12 14:04:26 [INFO] 2.2. Running blast of contigs against close-related mitogenome: 2022-08-12 14:04:26 [INFO] blastn -query gbk.HiFiMapped.bam.filtered.fasta -db /dss/dsslegfs01/pr53da/pr53da-dss-0026/projects/2022de_novo_genome/assembly/mitochondria/KP137624.1.fasta -num_threads 16 -out contigs.blastn -outfmt 6 std qlen slen 2022-08-12 14:04:26 [INFO] Blast done. 2022-08-12 14:04:26 [INFO] 3. Filtering BLAST output to select target sequences 2022-08-12 14:04:26 [INFO] Filtering thresholds applied: 2022-08-12 14:04:26 [INFO] Minimum query percentage = 50 2022-08-12 14:04:26 [INFO] Minimum query length = 80% subject length 2022-08-12 14:04:26 [INFO] Maximum query length = 5 times subject length 2022-08-12 14:04:27 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file 2022-08-12 14:04:27 [INFO] 4. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s). 2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/163972531/ccs 2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/166725359/ccs 2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/24707997/ccs 2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/64488503/ccs 2022-08-12 14:04:27 [INFO] Working with contig m64204e_211125_011116/75366945/ccs Traceback (most recent call last): File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/mitohifi.py", line 376, in
main() File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/mitohifi.py", line 262, in main tRNA_ref = fetch.get_ref_tRNA() File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/fetch.py", line 40, in get_ref_tRNA reference_tRNA = max(tRNAs, key=tRNAs.get) ValueError: max() arg is an empty sequence Do you have any suggestions on what I should do? Thank you ^^
— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/24#issuecomment-1213046577, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5RHHFU7DW26UQ37E3LLVYY5QFANCNFSM56IWFE2Q . You are receiving this because you commented.Message ID: @.***>
-- Marcela Uliano da Silva, PhD
Senior Bioinformatician - Wellcome Sanger Institute Darwin Tree of Life Project
Churchill College Cambridge By-Fellow
Cambridge, UK
Ah, it worked! Up until circularization everything worked fine, but the same error appeared again when annotating even though I have deleted / or \ from my seq IDs.
2022-08-12 15:41:53 [INFO] 3. Filtering BLAST output to select target sequences
2022-08-12 15:41:53 [INFO] Filtering thresholds applied:
2022-08-12 15:41:53 [INFO] Minimum query percentage = 50
2022-08-12 15:41:53 [INFO] Minimum query length = 80% subject length
2022-08-12 15:41:53 [INFO] Maximum query length = 5 times subject length
2022-08-12 15:41:53 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file
2022-08-12 15:41:53 [INFO] 4. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s).
2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-75366945-ccs
2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-24707997-ccs
2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-64488503-ccs
2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-163972531-ccs
2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-166725359-ccs
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-75366945-ccs circularization
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-24707997-ccs circularization
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-64488503-ccs circularization
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-163972531-ccs circularization
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-166725359-ccs circularization
2022-08-12 15:41:54 [INFO] m64204e_211125_011116-64488503-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-64488503-ccs/m64204e_211125_011116-64488503-ccs.circularisationCheck.txt
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-64488503-ccs (MitoFinder) annotation
2022-08-12 15:41:54 [INFO] m64204e_211125_011116-163972531-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-163972531-ccs/m64204e_211125_011116-163972531-ccs.circularisationCheck.txt
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-163972531-ccs (MitoFinder) annotation
2022-08-12 15:41:54 [INFO] m64204e_211125_011116-24707997-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-24707997-ccs/m64204e_211125_011116-24707997-ccs.circularisationCheck.txt
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-24707997-ccs (MitoFinder) annotation
2022-08-12 15:41:54 [INFO] m64204e_211125_011116-166725359-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-166725359-ccs/m64204e_211125_011116-166725359-ccs.circularisationCheck.txt
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-166725359-ccs (MitoFinder) annotation
2022-08-12 15:41:54 [INFO] m64204e_211125_011116-75366945-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-75366945-ccs/m64204e_211125_011116-75366945-ccs.circularisationCheck.txt
2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-75366945-ccs (MitoFinder) annotation
Traceback (most recent call last):
File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/mitohifi.py", line 376, in <module>
main()
File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/mitohifi.py", line 262, in main
tRNA_ref = fetch.get_ref_tRNA()
File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/fetch.py", line 40, in get_ref_tRNA
reference_tRNA = max(tRNAs, key=tRNAs.get)
ValueError: max() arg is an empty sequence
Have a ding on the outputs. My guess is that you don’t have tRNA-Phe. So you probably only have a partial mito! 🤔
On Fri, 12 Aug 2022 at 14:56, Athena Syarifa @.***> wrote:
Ah, it worked! Up until circularization everything worked fine, but the same error appeared again when annotating even though I have deleted / or \ from my seq IDs.
2022-08-12 15:41:53 [INFO] 3. Filtering BLAST output to select target sequences 2022-08-12 15:41:53 [INFO] Filtering thresholds applied: 2022-08-12 15:41:53 [INFO] Minimum query percentage = 50 2022-08-12 15:41:53 [INFO] Minimum query length = 80% subject length 2022-08-12 15:41:53 [INFO] Maximum query length = 5 times subject length 2022-08-12 15:41:53 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file 2022-08-12 15:41:53 [INFO] 4. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s). 2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-75366945-ccs 2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-24707997-ccs 2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-64488503-ccs 2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-163972531-ccs 2022-08-12 15:41:54 [INFO] Working with contig m64204e_211125_011116-166725359-ccs 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-75366945-ccs circularization 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-24707997-ccs circularization 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-64488503-ccs circularization 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-163972531-ccs circularization 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-166725359-ccs circularization 2022-08-12 15:41:54 [INFO] m64204e_211125_011116-64488503-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-64488503-ccs/m64204e_211125_011116-64488503-ccs.circularisationCheck.txt 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-64488503-ccs (MitoFinder) annotation 2022-08-12 15:41:54 [INFO] m64204e_211125_011116-163972531-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-163972531-ccs/m64204e_211125_011116-163972531-ccs.circularisationCheck.txt 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-163972531-ccs (MitoFinder) annotation 2022-08-12 15:41:54 [INFO] m64204e_211125_011116-24707997-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-24707997-ccs/m64204e_211125_011116-24707997-ccs.circularisationCheck.txt 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-24707997-ccs (MitoFinder) annotation 2022-08-12 15:41:54 [INFO] m64204e_211125_011116-166725359-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-166725359-ccs/m64204e_211125_011116-166725359-ccs.circularisationCheck.txt 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-166725359-ccs (MitoFinder) annotation 2022-08-12 15:41:54 [INFO] m64204e_211125_011116-75366945-ccs circularization done. Circularization info saved on ./potential_contigs/m64204e_211125_011116-75366945-ccs/m64204e_211125_011116-75366945-ccs.circularisationCheck.txt 2022-08-12 15:41:54 [INFO] Started m64204e_211125_011116-75366945-ccs (MitoFinder) annotation Traceback (most recent call last): File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/mitohifi.py", line 376, in
main() File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/mitohifi.py", line 262, in main tRNA_ref = fetch.get_ref_tRNA() File "/dss/dsshome1/lxc0C/ra35naf/miniconda3/envs/mito/bin/MitoHiFi/fetch.py", line 40, in get_ref_tRNA reference_tRNA = max(tRNAs, key=tRNAs.get) ValueError: max() arg is an empty sequence — Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/24#issuecomment-1213138761, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5RF2JNVTCUCQ7JA6MHLVYZJYRANCNFSM56IWFE2Q . You are receiving this because you commented.Message ID: @.***>
-- Marcela Uliano da Silva, PhD
Senior Bioinformatician - Wellcome Sanger Institute Darwin Tree of Life Project
Churchill College Cambridge By-Fellow
Cambridge, UK
Hmmm that is strange indeed. I will take a closer look into the matter. Many thanks for the help :)
Hi Marcela,
I have succeeded in generating the final mitogenome. I reinstalled MitoHiFi and all its dependencies, and it worked. However, in the MitoFinder log, it says Evidences of circularization could not be found, but everyother step was successful
, and it did not find any genes in the mtDNA aside from 8 tRNAs.
final_mitogenome.annotation_MitoFinder.log
And below is the circularisation check results (all false) all_contigs.circularisationCheck.txt
It seems that MitoFinder cannot find the circularisation, do you have any suggestions about this problem?
Many thanks for your help!
Hi Athena! Your problem now is not the pipeline anymore. It’s possible you don’t have mitoreads in your sample. You need to investigate the intermediate files such as the filtered reads. Best of luck. If we can help with anymore more specific, let us know! On Wed, 14 Sep 2022 at 10:16, Athena Syarifa @.***> wrote:
Hi Marcela, I have succeeded in generating the final mitogenome. I reinstalled MitoHiFi and all its dependencies, and it worked. However, in the MitoFinder log, it says Evidences of circularization could not be found, but everyother step was successful, and it did not find any genes in the mtDNA aside from 8 tRNAs.
final_mitogenome.annotation_MitoFinder.log https://github.com/marcelauliano/MitoHiFi/files/9566571/final_mitogenome.annotation_MitoFinder.log
And below is the circularisation check results (all false) all_contigs.circularisationCheck.txt https://github.com/marcelauliano/MitoHiFi/files/9566589/all_contigs.circularisationCheck.txt
It seems that MitoFinder cannot find the circularisation, do you have any suggestions about this problem?
Many thanks for your help!
— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/24#issuecomment-1246750912, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5REZLCCLMZ4UJXHGM5TV6HF3PANCNFSM56IWFE2Q . You are receiving this because you commented.Message ID: @.***>
-- Marcela Uliano da Silva, PhD
Senior Bioinformatician - Wellcome Sanger Institute Darwin Tree of Life Project
Churchill College Cambridge By-Fellow
Cambridge, UK
Athena, did you input your reads and -r ? This is what you should do. From your intermediate files it seems to me you input them as -c
On Wed, 14 Sep 2022 at 13:37, Marcela Uliano da Silva < @.***> wrote:
Hi Athena! Your problem now is not the pipeline anymore. It’s possible you don’t have mitoreads in your sample. You need to investigate the intermediate files such as the filtered reads. Best of luck. If we can help with anymore more specific, let us know!
On Wed, 14 Sep 2022 at 10:16, Athena Syarifa @.***> wrote:
Hi Marcela, I have succeeded in generating the final mitogenome. I reinstalled MitoHiFi and all its dependencies, and it worked. However, in the MitoFinder log, it says Evidences of circularization could not be found, but everyother step was successful, and it did not find any genes in the mtDNA aside from 8 tRNAs.
final_mitogenome.annotation_MitoFinder.log https://github.com/marcelauliano/MitoHiFi/files/9566571/final_mitogenome.annotation_MitoFinder.log
And below is the circularisation check results (all false) all_contigs.circularisationCheck.txt https://github.com/marcelauliano/MitoHiFi/files/9566589/all_contigs.circularisationCheck.txt
It seems that MitoFinder cannot find the circularisation, do you have any suggestions about this problem?
Many thanks for your help!
— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/24#issuecomment-1246750912, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5REZLCCLMZ4UJXHGM5TV6HF3PANCNFSM56IWFE2Q . You are receiving this because you commented.Message ID: @.***>
-- Marcela Uliano da Silva, PhD
Senior Bioinformatician - Wellcome Sanger Institute Darwin Tree of Life Project
Churchill College Cambridge By-Fellow
Cambridge, UK
-- Marcela Uliano da Silva, PhD
Senior Bioinformatician - Wellcome Sanger Institute Darwin Tree of Life Project
Churchill College Cambridge By-Fellow
Cambridge, UK
Hi Marcela and fellow contributors, Thank you for developing such useful tool! I have been trying to assemble mitogenome of a bird species using raw reads with this command:
But then, I get the following error:
I thought it was because of the same problem as issue #10 , but the command for hifiasm is correct. I have the filtered reads (although only 5) but then the
contigs.blastn
file and all of the hifiasm output files are empty (includinghifiasm.contigs.fasta
). I am not sure why this happened. Can someone help me?This is how my directory looks:
Many thanks, and have a good day! Best, Rifa