I've been battling an issue that I'm sure has already been addressed in other posts on this page. However, I would like to understand why my analyses are failing the way they are. I am trying to assemble a mitogenome using the latest Docker image of MitoHiFi. I have run the tests and they complete with no apparent issues.
I have been trying different mitogenome references to assemble the mitogenome of my species, although there are barely any to choose from. When I use references from the same genus, MitoHiFi is able to assemble a mitogenome, but the annotation is inaccurate, and by performing alignments I confirmed that the assembly itself is the problem: there is a large block that is not well assembled. Hence, I tried a different mitogenome reference, this time from a different but closely related genus. Even though this reference is not from the same genus as my sample, its annotation is more accurate and complete. But then I run into the following error:
../../../opt/MitoHiFi/src/mitohifi.py -r ../../Dxero_readsPB.renamed.fastq.gz -f ../JX398125.1.fasta -g ../JX398125.1.gb -t 4 -o 9
2024-10-03 09:56:19 [INFO] Welcome to MitoHifi v2. Starting pipeline...
2024-10-03 09:56:19 [INFO] Length of related mitogenome is: 27133 bp
2024-10-03 09:56:19 [INFO] Number of genes on related mitogenome: 36
2024-10-03 09:56:19 [INFO] Running MitoHifi pipeline in reads mode...
2024-10-03 09:56:19 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome
2024-10-03 09:56:19 [INFO] minimap2 -t 4 --secondary=no -ax map-hifi ../JX398125.1.fasta ../../Dxero_readsPB.renamed.fastq.gz | samtools view -@ 4 -b -F4 -F 0x800 -o reads.HiFiMapped.bam
2024-10-03 10:13:03 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS
2024-10-03 10:13:03 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format:
2024-10-03 10:13:03 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta
2024-10-03 10:13:04 [INFO] Total number of mapped reads: 7
2024-10-03 10:13:04 [INFO] 2.2 Then we filter reads that are larger than 27133 bp
2024-10-03 10:13:04 [INFO] Number of filtered reads: 7
2024-10-03 10:13:04 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads!
2024-10-03 10:13:04 [INFO] hifiasm --primary -t 4 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta
cat: gbk.HiFiMapped.bam.filtered.assembled.p_ctg.fa: No such file or directory
cat: gbk.HiFiMapped.bam.filtered.assembled.a_ctg.fa: No such file or directory
2024-10-03 10:18:45 [INFO] 4. Let's run the blast of the contigs versus the close-related mitogenome
2024-10-03 10:18:45 [INFO] 4.1. Creating BLAST database:
2024-10-03 10:18:45 [INFO] makeblastdb -in ../JX398125.1.fasta -dbtype nucl
2024-10-03 10:18:48 [INFO] Makeblastdb done.
2024-10-03 10:18:48 [INFO] 4.2. Running blast of contigs against close-related mitogenome:
2024-10-03 10:18:48 [INFO] blastn -query hifiasm.contigs.fasta -db ../JX398125.1.fasta -num_threads 4 -out contigs.blastn -outfmt 6 std qlen slen
Warning: [blastn] Query is Empty!
2024-10-03 10:18:48 [INFO] Blast done.
2024-10-03 10:18:48 [INFO] 5. Filtering BLAST output to select target sequences
2024-10-03 10:18:48 [INFO] Filtering thresholds applied:
2024-10-03 10:18:48 [INFO] Minimum query percentage = 50
2024-10-03 10:18:48 [INFO] Minimum query length = 80% subject length
2024-10-03 10:18:48 [INFO] Maximum query length = 5 times subject length
Attention!
'parsed_blast.txt' and 'parsed_blast_all.txt' files are empty.
The pipeline has stopped !! You need to run further scripts to check if you have mito reads pulled to a large NUMT!
The hifiasm.log and contigs.blastn files are empty, as are hifiasm.contigs.fasta and all the gbk.* files. What baffles me is that this same mitogenome reference was used in a previous run on the cluster we often work with (not using Docker or Singularity), and that run managed to complete the assembly (although only tRNAs were annotated).
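To make the "empty files" observation easy to reproduce, here is a minimal sketch that lists which of the outputs named in the log are missing, empty, or non-empty in the run directory. The helper name `output_status` and the `EXPECTED_OUTPUTS` list are my own (the file names are copied from the log above); this is not part of MitoHiFi.

```python
from pathlib import Path

# File names copied from the log and post above; adjust them if your run differs.
EXPECTED_OUTPUTS = [
    "reads.HiFiMapped.bam",
    "gbk.HiFiMapped.bam.fasta",
    "gbk.HiFiMapped.bam.filtered.fasta",
    "gbk.HiFiMapped.bam.filtered.assembled.p_ctg.fa",
    "gbk.HiFiMapped.bam.filtered.assembled.a_ctg.fa",
    "hifiasm.log",
    "hifiasm.contigs.fasta",
    "contigs.blastn",
    "parsed_blast.txt",
    "parsed_blast_all.txt",
]


def output_status(workdir, names=EXPECTED_OUTPUTS):
    """Map each file name to 'missing', 'empty', or its size in bytes."""
    status = {}
    for name in names:
        path = Path(workdir) / name
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            status[name] = "empty"
        else:
            status[name] = path.stat().st_size
    return status


if __name__ == "__main__":
    import sys

    # Hypothetical usage: pass the MitoHiFi run directory as the first argument.
    workdir = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, state in output_status(workdir).items():
        print(f"{state!s:>12}  {name}")
```

Running it in both the Docker run directory and the earlier cluster run directory should show at which step the two runs start to differ.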
Has anyone run into similar issues?
Many thanks!