marcelauliano / MitoHiFi

Find, circularise and annotate mitogenome from PacBio assemblies
MIT License
168 stars 27 forks source link

No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s). An error may have occurred when assembling reads with HiFiasm #55

Open madhubioinfo opened 1 year ago

madhubioinfo commented 1 year ago

singularity exec --bind /home/eniac/madhu_mito/:/home/eniac/madhu_mito docker://ghcr.io/marcelauliano/mitohifi:master mitohifi.py -r /home/eniac/madhu_mito/A5_mitochondria_genome.fastq -f /home/eniac/madhu_mito/Rhizophagus_irregularis_DAOM197198_mtDNA.fasta -g /home/eniac/madhu_mito/Rhizophagus_irregularis_DAOM197198_mtDNA.gb -t 15 -o 1

INFO: Using cached SIF image 2023-06-26 11:09:43 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2023-06-26 11:09:43 [INFO] Length of related mitogenome is: 70793 bp 2023-06-26 11:09:43 [INFO] Number of genes on related mitogenome: 46 2023-06-26 11:09:43 [INFO] Running MitoHifi pipeline in reads mode... 2023-06-26 11:09:43 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome 2023-06-26 11:09:43 [INFO] minimap2 -t 15 --secondary=no -ax map-hifi /home/eniac/madhu_mito/Rhizophagus_irregularis_DAOM197198_mtDNA.fasta /home/eniac/madhu_mito/A5_mitochondria_genome.fastq | samtools view -@ 15 -b -F4 -F 0x800 -o reads.HiFiMapped.bam 2023-06-26 11:10:05 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS 2023-06-26 11:10:05 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format: 2023-06-26 11:10:05 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta 2023-06-26 11:10:12 [INFO] Total number of mapped reads: 22517 2023-06-26 11:10:12 [INFO] 2.2 Then we filter reads that are larger than 70793 bp 2023-06-26 11:10:17 [INFO] Number of filtered reads: 22517 2023-06-26 11:10:17 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads! 2023-06-26 11:10:17 [INFO] hifiasm --primary -t 15 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s). An error may have occurred when assembling reads with HiFiasm.

pdoris commented 1 year ago

same problem here

apptainer exec --bind /work/06127/pdoris/Mitosif:/work/06127/pdoris/Mitosif docker://ghcr.io/marcelauliano/mitohifi:master mitohifi.py -r /work/06127/pdoris/SHRA3_HiFi_reads/cell3.fa -f /work/06127/pdoris/MitoHiFi/data/NC_001665.2.fasta -g /work/06127/pdoris/MitoHiFi/data/RNmitosequence.gbk -o 2 -t 128 INFO: Using cached SIF image 2023-06-26 11:28:22 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2023-06-26 11:28:22 [INFO] Length of related mitogenome is: 16313 bp 2023-06-26 11:28:22 [INFO] Number of genes on related mitogenome: 37 2023-06-26 11:28:22 [INFO] Running MitoHifi pipeline in reads mode... 2023-06-26 11:28:22 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome 2023-06-26 11:28:22 [INFO] minimap2 -t 128 --secondary=no -ax map-hifi /work/06127/pdoris/MitoHiFi/data/NC_001665.2.fasta /work/06127/pdoris/SHRA3_HiFi_reads/cell3.fa | samtools view -@ 128 -b -F4 -F 0x800 -o reads.HiFiMapped.bam 2023-06-26 11:29:29 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS 2023-06-26 11:29:29 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format: 2023-06-26 11:29:29 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta 2023-06-26 11:29:29 [INFO] Total number of mapped reads: 23 2023-06-26 11:29:29 [INFO] 2.2 Then we filter reads that are larger than 16313 bp 2023-06-26 11:29:29 [INFO] Number of filtered reads: 19 2023-06-26 11:29:29 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads! 2023-06-26 11:29:29 [INFO] hifiasm --primary -t 128 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s). An error may have occurred when assembling reads with HiFiasm.

pdoris commented 1 year ago

Hifiasm is generating some output files -rw------- 1 pdoris G-820831 25061649697 Jun 26 09:18 cell3.fa -rw-r--r-- 1 pdoris G-820831 303353 Jun 26 11:29 gbk.HiFiMapped.bam.fasta -rw-r--r-- 1 pdoris G-820831 8052 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ec.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ovlp.reverse.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ovlp.source.bin -rw-r--r-- 1 pdoris G-820831 234024 Jun 26 11:29 gbk.HiFiMapped.bam.filtered.fasta -rw-r--r-- 1 pdoris G-820831 216 Jun 26 11:29 hifiasm.log -rw-r--r-- 1 pdoris G-820831 83447 Jun 26 11:29 reads.HiFiMapped.bam

marcelauliano commented 1 year ago

Hi all, I just ran an implementation test today with the code that is there at it passed fine. But let me have a look at it tomorrow! Best, M

On Mon, 26 Jun 2023 at 17:38, pdoris @.***> wrote:

Hifiasm is generating some output files -rw------- 1 pdoris G-820831 25061649697 Jun 26 09:18 cell3.fa -rw-r--r-- 1 pdoris G-820831 303353 Jun 26 11:29 gbk.HiFiMapped.bam.fasta -rw-r--r-- 1 pdoris G-820831 8052 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ec.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ovlp.reverse.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ovlp.source.bin -rw-r--r-- 1 pdoris G-820831 234024 Jun 26 11:29 gbk.HiFiMapped.bam.filtered.fasta -rw-r--r-- 1 pdoris G-820831 216 Jun 26 11:29 hifiasm.log -rw-r--r-- 1 pdoris G-820831 83447 Jun 26 11:29 reads.HiFiMapped.bam

— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/55#issuecomment-1607837858, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5RE6IRNK3L6EMFJ7HILXNG3HTANCNFSM6AAAAAAZUJ7WMU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

-- Marcela Uliano da Silva, PhD

Senior Bioinformatician - Wellcome Sanger Institute

Darwin Tree of Life Project (DToL)

Churchill College Postdoctoral By-Fellow, University of Cambridge

Cambridge, UK

pdoris commented 1 year ago

Hi Marcela

I begin to think there are no non-nuclear mitogenome sequences in my reads. I have also run in contig mode. Mitofinder gives the reported error

Traceback (most recent call last): File "/bin/MitoHiFi/mitohifi.py", line 377, in main() File "/bin/MitoHiFi/mitohifi.py", line 263, in main tRNA_ref = fetch.get_ref_tRNA() File "/bin/MitoHiFi/fetch.py", line 40, in get_ref_tRNA reference_tRNA = max(tRNAs, key=tRNAs.get) ValueError: max() arg is an empty sequence

When I run the contigs in singularity, I also get an error that suggests no mitogenome reads were found

apptainer exec --bind /work/06127/pdoris/Mitosif:/work/06127/pdoris/Mitosif docker://ghcr.io/marcelauliano/mitohifi:master mitohifi.py -c /work/06127/pdoris/stampede2/SHRSP_BbbUtx_1.curated.fa -f /work/06127/pdoris/MitoHiFi/data/NC_001665.2.fasta -g /work/06127/pdoris/MitoHiFi/data/RNmitosequence.gb -o 2 -t 16

INFO: Using cached SIF image

2023-06-25 16:27:56 [INFO] Welcome to MitoHiFi! Starting pipeline...

2023-06-25 16:27:56 [INFO] Length of related mitogenome is: 16313 bp

2023-06-25 16:27:56 [INFO] Number of genes on related mitogenome: 37

2023-06-25 16:27:56 [INFO] Running MitoHifi pipeline in contigs mode...

2023-06-25 16:27:56 [INFO] 1. Fixing potentially conflicting FASTA headers

2023-06-25 16:28:59 [INFO] 2. Let's run the blast of the contigs versus the close-related mitogenome

2023-06-25 16:28:59 [INFO] 2.1. Creating BLAST database:

2023-06-25 16:28:59 [INFO] makeblastdb -in /work/06127/pdoris/MitoHiFi/data/NC_001665.2.fasta -dbtype nucl

2023-06-25 16:29:00 [INFO] Makeblastdb done.

2023-06-25 16:29:00 [INFO] 2.2. Running blast of contigs against close-related mitogenome:

2023-06-25 16:29:00 [INFO] blastn -query /work/06127/pdoris/stampede2/SHRSP_BbbUtx_1.curated.fa -db /work/06127/pdoris/MitoHiFi/data/NC_001665.2.fasta -num_threads 16 -out contigs.blastn -outfmt 6 std qlen slen

2023-06-25 16:40:31 [INFO] Blast done.

2023-06-25 16:40:31 [INFO] 3. Filtering BLAST output to select target sequences

2023-06-25 16:40:31 [INFO] Filtering thresholds applied:

2023-06-25 16:40:31 [INFO] Minimum query percentage = 50

2023-06-25 16:40:31 [INFO] Minimum query length = 80% subject length

2023-06-25 16:40:31 [INFO] Maximum query length = 5 times subject length

Attention!

'parsed_blast.txt' and 'parsed_blast_all.txt' files are empty.

The pipeline has stopped !! You need to run further scripts to check if you have mito reads pulled to a large NUMT!

I have three more files of reads that I will check.

Peter

From: Marcela Uliano-Silva @.> Reply-To: marcelauliano/MitoHiFi @.> Date: Monday, June 26, 2023 at 12:52 PM To: marcelauliano/MitoHiFi @.> Cc: "Doris, Peter A" @.>, Comment @.***> Subject: Re: [marcelauliano/MitoHiFi] No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s). An error may have occurred when assembling reads with HiFiasm (Issue #55)

External: Increase caution when handling links and attachments.

Hi all, I just ran an implementation test today with the code that is there at it passed fine. But let me have a look at it tomorrow! Best, M

On Mon, 26 Jun 2023 at 17:38, pdoris @.***> wrote:

Hifiasm is generating some output files -rw------- 1 pdoris G-820831 25061649697 Jun 26 09:18 cell3.fa -rw-r--r-- 1 pdoris G-820831 303353 Jun 26 11:29 gbk.HiFiMapped.bam.fasta -rw-r--r-- 1 pdoris G-820831 8052 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ec.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ovlp.reverse.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:27 gbk.HiFiMapped.bam.filtered.assembled.ovlp.source.bin -rw-r--r-- 1 pdoris G-820831 234024 Jun 26 11:29 gbk.HiFiMapped.bam.filtered.fasta -rw-r--r-- 1 pdoris G-820831 216 Jun 26 11:29 hifiasm.log -rw-r--r-- 1 pdoris G-820831 83447 Jun 26 11:29 reads.HiFiMapped.bam

— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/55#issuecomment-1607837858, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5RE6IRNK3L6EMFJ7HILXNG3HTANCNFSM6AAAAAAZUJ7WMU . You are receiving this because you are subscribed to this thread.Message ID: @.***>

-- Marcela Uliano da Silva, PhD

Senior Bioinformatician - Wellcome Sanger Institute

Darwin Tree of Life Project (DToL)

Churchill College Postdoctoral By-Fellow, University of Cambridge

Cambridge, UK

— Reply to this email directly, view it on GitHubhttps://github.com/marcelauliano/MitoHiFi/issues/55#issuecomment-1607953106, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AUEQOHGKCQE4BRUIXOXMAALXNHD6PANCNFSM6AAAAAAZUJ7WMU. You are receiving this because you commented.Message ID: @.***>

pdoris commented 1 year ago

Being concerned that there might be no non-nuclear mitochondrial genome reads discovered in the previous run, I expanded the read files to combine data from all 4 HiFi cells that were run on this genome (previously I ran only one cell of reads).

apptainer exec --bind /work/06127/pdoris/Mitosif:/work/06127/pdoris/Mitosif docker://ghcr.io/marcelauliano/mitohifi:master mitohifi.py -r /work/06127/pdoris/SHRA3_HiFi_reads/A3HiFireads.fa.gz -f /work/06127/pdoris/MitoHiFi/data/NC_001665.2.fasta -g /work/06127/pdoris/MitoHiFi/data/RNmitosequence.gbk -o 2 -t 128 INFO: Using cached SIF image 2023-06-27 08:37:38 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2023-06-27 08:37:38 [INFO] Length of related mitogenome is: 16313 bp 2023-06-27 08:37:38 [INFO] Number of genes on related mitogenome: 37 2023-06-27 08:37:38 [INFO] Running MitoHifi pipeline in reads mode... 2023-06-27 08:37:38 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome 2023-06-27 08:37:38 [INFO] minimap2 -t 128 --secondary=no -ax map-hifi /work/06127/pdoris/MitoHiFi/data/NC_001665.2.fasta /work/06127/pdoris/SHRA3_HiFi_reads/A3HiFireads.fa.gz | samtools view -@ 128 -b -F4 -F 0x800 -o reads.HiFiMapped.bam 2023-06-27 08:46:01 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS 2023-06-27 08:46:01 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format: 2023-06-27 08:46:01 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta 2023-06-27 08:46:01 [INFO] Total number of mapped reads: 63 2023-06-27 08:46:01 [INFO] 2.2 Then we filter reads that are larger than 16313 bp 2023-06-27 08:46:01 [INFO] Number of filtered reads: 53 2023-06-27 08:46:01 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads! 2023-06-27 08:46:01 [INFO] hifiasm --primary -t 128 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s). An error may have occurred when assembling reads with HiFiasm.

-rw-r--r-- 1 pdoris G-820831 800081 Jun 27 08:46 gbk.HiFiMapped.bam.fasta -rw-r--r-- 1 pdoris G-820831 8052 Jun 26 10:29 gbk.HiFiMapped.bam.filtered.assembled.ec.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:29 gbk.HiFiMapped.bam.filtered.assembled.ovlp.reverse.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:29 gbk.HiFiMapped.bam.filtered.assembled.ovlp.source.bin -rw-r--r-- 1 pdoris G-820831 625708 Jun 27 08:46 gbk.HiFiMapped.bam.filtered.fasta -rw-r--r-- 1 pdoris G-820831 216 Jun 27 08:46 hifiasm.log -rw------- 1 pdoris G-820831 16866 Jun 22 16:08 NC_001665.2.fasta -rw-r--r-- 1 pdoris G-820831 143 Jun 25 17:20 NC_001665.2.fasta.nhr -rw-r--r-- 1 pdoris G-820831 128 Jun 25 17:20 NC_001665.2.fasta.nin -rw-r--r-- 1 pdoris G-820831 4080 Jun 25 17:20 NC_001665.2.fasta.nsq -rw-r--r-- 1 pdoris G-820831 222656 Jun 27 08:46 reads.HiFiMapped.bam -rw------- 1 pdoris G-820831 51100 Jun 26 10:28 RNmitosequence.gbk

(base) c303-006.ls6(1010)$ more hifiasm.log Reads has been loaded. Loading ma_hit_ts from disk... ma_hit_ts has been read. Loading ma_hit_ts from disk... ma_hit_ts has been read. [M::ha_assemble::0.151*0.03] ==> loaded corrected reads and overlaps from disk

pdoris commented 1 year ago

I ran the container using the provided test data:

apptainer exec --bind /work/06127/pdoris/Mitosif:/work/06127/pdoris/Mitosif docker://ghcr.io/marcelauliano/mitohifi:master mitohifi.py -r /work/06127/pdoris/MitoHiFi/tests/ilDeiPorc1.reads.100.fa -f /work/06127/pdoris/MitoHiFi/data/OQ694980.1.fasta -g /work/06127/pdoris/MitoHiFi/data/OQ694980.1.gb -o 2 -t 128 INFO: Using cached SIF image 2023-06-27 09:25:35 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2023-06-27 09:25:35 [INFO] Length of related mitogenome is: 15372 bp 2023-06-27 09:25:35 [INFO] Number of genes on related mitogenome: 37 2023-06-27 09:25:35 [INFO] Running MitoHifi pipeline in reads mode... 2023-06-27 09:25:35 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome 2023-06-27 09:25:35 [INFO] minimap2 -t 128 --secondary=no -ax map-hifi /work/06127/pdoris/MitoHiFi/data/OQ694980.1.fasta /work/06127/pdoris/MitoHiFi/tests/ilDeiPorc1.reads.100.fa | samtools view -@ 128 -b -F4 -F 0x800 -o reads.HiFiMapped.bam 2023-06-27 09:25:35 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS 2023-06-27 09:25:35 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format: 2023-06-27 09:25:35 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta 2023-06-27 09:25:35 [INFO] Total number of mapped reads: 97 2023-06-27 09:25:35 [INFO] 2.2 Then we filter reads that are larger than 15372 bp 2023-06-27 09:25:35 [INFO] Number of filtered reads: 97 2023-06-27 09:25:35 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads! 2023-06-27 09:25:35 [INFO] hifiasm --primary -t 128 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s). An error may have occurred when assembling reads with HiFiasm.

more hifiasm.log Reads has been loaded. Loading ma_hit_ts from disk... ma_hit_ts has been read. Loading ma_hit_ts from disk... ma_hit_ts has been read. [M::ha_assemble::0.008*0.41] ==> loaded corrected reads and overlaps from disk

-rw-r--r-- 1 pdoris G-820831 849022 Jun 27 09:25 gbk.HiFiMapped.bam.fasta -rw-r--r-- 1 pdoris G-820831 863066 Jun 27 09:25 gbk.HiFiMapped.bam.filtered.fasta -rw-r--r-- 1 pdoris G-820831 216 Jun 27 09:25 hifiasm.log -rw-r--r-- 1 pdoris G-820831 193496 Jun 27 09:25 reads.HiFiMapped.bam drwx------ 2 pdoris G-820831 4096 Jun 27 09:20 . -rw------- 1 pdoris G-820831 15877 Jun 27 09:20 OQ694980.1.fasta -rw------- 1 pdoris G-820831 36429 Jun 27 09:20 OQ694980.1.gb -rw-r--r-- 1 pdoris G-820831 8052 Jun 26 10:29 gbk.HiFiMapped.bam.filtered.assembled.ec.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:29 gbk.HiFiMapped.bam.filtered.assembled.ovlp.reverse.bin -rw-r--r-- 1 pdoris G-820831 8 Jun 26 10:29 gbk.HiFiMapped.bam.filtered.assembled.ovlp.source.bin -rw------- 1 pdoris G-820831 51100 Jun 26 10:28 RNmitosequence.gbk -rw-r--r-- 1 pdoris G-820831 143 Jun 25 17:20 NC_001665.2.fasta.nhr -rw-r--r-- 1 pdoris G-820831 128 Jun 25 17:20 NC_001665.2.fasta.nin -rw-r--r-- 1 pdoris G-820831 4080 Jun 25 17:20 NC_001665.2.fasta.nsq drwx------ 10 pdoris G-820831 4096 Jun 22 16:30 .. -rw------- 1 pdoris G-820831 16866 Jun 22 16:08 NC_001665.2.fasta

marcelauliano commented 1 year ago

Hi pdoris, I just pulled the docker again and ran it, and it all worked fine for me. Not sure what is going on? Which singularity version you are using?

marcelauliano commented 1 year ago

Two things I see: your -o should be 5 (see --help message), and you are giving 128 threads to your run? -t 1 for the test should be enough

pdoris commented 1 year ago

Thanks, Marcela

I am using apptainer

I will drop t to 1 and run again

I wonder if apptainer is fully compatible with the docker container

https://apptainer.org/docs/user/main/docker_and_oci.html#differences-and-limitations-vs-docker

Sent from my iPad

On Jun 28, 2023, at 3:18 AM, Marcela Uliano-Silva @.***> wrote:

 External: Increase caution when handling links and attachments.

Two things I see: your -o should be 5 (see --help message), and you are giving 128 threads to your run? -t 1 should be enough

— Reply to this email directly, view it on GitHubhttps://github.com/marcelauliano/MitoHiFi/issues/55#issuecomment-1610973480, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AUEQOHFKGWNMXE7CIDLFO3TXNPSEHANCNFSM6AAAAAAZUJ7WMU. You are receiving this because you commented.Message ID: @.***>

madhubioinfo commented 1 year ago

I still have the same error from my command

INFO: Using cached SIF image 2023-06-28 09:21:05 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2023-06-28 09:21:05 [INFO] Length of related mitogenome is: 70793 bp 2023-06-28 09:21:05 [INFO] Number of genes on related mitogenome: 46 2023-06-28 09:21:05 [INFO] Running MitoHifi pipeline in reads mode... 2023-06-28 09:21:05 [INFO] 1. First we map your Pacbio HiFi reads to the close-r elated mitogenome 2023-06-28 09:21:05 [INFO] minimap2 -t 15 --secondary=no -ax map-hifi /home/enia c/madhu_mito/Rhizophagus_irregularis_DAOM197198_mtDNA.fasta /home/eniac/madhu_mi to/A5_mitochondria_genome.fastq | samtools view -@ 15 -b -F4 -F 0x800 -o reads.H iFiMapped.bam 2023-06-28 09:21:27 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS 2023-06-28 09:21:27 [INFO] 2.1 First we convert the mapped reads from BAM to FAS TA format: 2023-06-28 09:21:27 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped. bam.fasta 2023-06-28 09:21:33 [INFO] Total number of mapped reads: 22517 2023-06-28 09:21:33 [INFO] 2.2 Then we filter reads that are larger than 70793 b p 2023-06-28 09:21:38 [INFO] Number of filtered reads: 22517 2023-06-28 09:21:38 [INFO] 3. Now let's run hifiasm to assemble the mapped and f iltered reads! 2023-06-28 09:21:38 [INFO] hifiasm --primary -t 15 -f 0 -o gbk.HiFiMapped.bam.fi ltered.assembled gbk.HiFiMapped.bam.filtered.fasta No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s). An error may have occurred when assembling reads with HiFiasm.

pdoris commented 1 year ago

My genome is a mammal, so -o 2 is correct unless my eyes are really tricking me

More info on our singularity implementation


tacc-apptainer: tacc-apptainer/1.1.8


Description:

  Application and environment virtualization

This module can be loaded directly: module load tacc-apptainer/1.1.8

Help:

  Apptainer is not installed and should not be run on the login nodes.

  A functional Apptainer module is available on the compute nodes. Submit

  Apptainer job scripts to the queue with 'sbatch'. If you would like to run

  Apptainer interactively, please start an interactive session with 'idev'.

  [login]\$ idev

  [compute]\$ apptainer run container.img

  #############################################################################

  Images and layers are now cached to $STOCKYARD2/apptainer_cache. All images

  created with apptainer pull will be deposited to that location, and can

  only be controlled by changing the cache location. We recommend running with

  the container url

    apptainer exec docker://ubuntu:xenial echo "This works"

  or copying the pulled container to a different location

    apptainer pull docker://ubuntu:xenial

    cp $STOCKYARD2/apptainer_cache/ubuntu-xenial.simg $SCRATCH/

    apptainer exec $SCRATCH/ubuntu-xenial.simg echo "This also works"

  #############################################################################

                 OverlayFS is disabled on tacc-apptainer

  tacc-apptainer utilizes the more secure underlay method automatically mount

  shared filesystems and any user-specified locations. This means you no longer

  need to include directories like

    /work and /scratch

  in your images, but still cannot utilize advanced overlay-specific features.

  #############################################################################

  Additional Documentation

  - Apptainer Main - https://apptainer.org<https://apptainer.org/>

  - ***@***.***  - https://containers-at-tacc.readthedocs.io/en/latest/

  - MPI with Apptainer - https://github.com/TACC/tacc-containers

  Version 1.1.8

From: Marcela Uliano-Silva @.> Reply-To: marcelauliano/MitoHiFi @.> Date: Wednesday, June 28, 2023 at 3:18 AM To: marcelauliano/MitoHiFi @.> Cc: "Doris, Peter A" @.>, Comment @.***> Subject: Re: [marcelauliano/MitoHiFi] No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s). An error may have occurred when assembling reads with HiFiasm (Issue #55)

External: Increase caution when handling links and attachments.

Two things I see: your -o should be 5 (see --help message), and you are giving 128 threads to your run? -t 1 should be enough

— Reply to this email directly, view it on GitHubhttps://github.com/marcelauliano/MitoHiFi/issues/55#issuecomment-1610973480, or unsubscribehttps://github.com/notifications/unsubscribe-auth/AUEQOHFKGWNMXE7CIDLFO3TXNPSEHANCNFSM6AAAAAAZUJ7WMU. You are receiving this because you commented.Message ID: @.***>

madhubioinfo commented 1 year ago

This is the command I am using still persistent with the error message singularity exec --bind /home/eniac/madhu_mito/:/home/eniac/madhu_mito docker://ghcr.io/marcelauliano/mitohifi:master mitohifi.py -r /home/eniac/madhu_mito/A5_mitochondria_genome.fastq -f /home/eniac/madhu_mito/Rhizophagus_irregularis_DAOM197198_mtDNA.fasta -g /home/eniac/madhu_mito/Rhizophagus_irregularis_DAOM197198_mtDNA.gb -t 15 -o 1

SamCT commented 7 months ago

This seems to still be an issue on both test sets (reads and contigs). I have tried with a singularity install and a conda dependency install, same exact error.

(/data/bin/mitohifiENV) bash-4.2$ python3 src/mitohifi.py -r tests/ilPhaBuce1_contig.fa -f test2/NC_016067.1.fasta -g test2/NC_016067.1.gb -t 1 -o 5 2024-03-04 03:33:23 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2024-03-04 03:33:23 [INFO] Length of related mitogenome is: 15659 bp 2024-03-04 03:33:23 [INFO] Number of genes on related mitogenome: 37 2024-03-04 03:33:23 [INFO] Running MitoHifi pipeline in reads mode... 2024-03-04 03:33:23 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome 2024-03-04 03:33:23 [INFO] minimap2 -t 1 --secondary=no -ax map-hifi test2/NC_016067.1.fasta tests/ilPhaBuce1_contig.fa | samtools view -@ 1 -b -F4 -F 0x800 -o reads.HiFiMapped.bam 2024-03-04 03:33:23 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS 2024-03-04 03:33:23 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format: 2024-03-04 03:33:23 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta 2024-03-04 03:33:23 [INFO] Total number of mapped reads: 1 2024-03-04 03:33:23 [INFO] 2.2 Then we filter reads that are larger than 15659 bp 2024-03-04 03:33:23 [INFO] Number of filtered reads: 0 2024-03-04 03:33:23 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads! 2024-03-04 03:33:23 [INFO] hifiasm --primary -t 1 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta cat: gbk.HiFiMapped.bam.filtered.assembled.a_ctg.fa: No such file or directory 2024-03-04 03:33:23 [INFO] 4. Let's run the blast of the contigs versus the close-related mitogenome 2024-03-04 03:33:23 [INFO] 4.1. Creating BLAST database: 2024-03-04 03:33:23 [INFO] makeblastdb -in test2/NC_016067.1.fasta -dbtype nucl 2024-03-04 03:33:24 [INFO] Makeblastdb done. 2024-03-04 03:33:24 [INFO] 4.2. Running blast of contigs against close-related mitogenome: 2024-03-04 03:33:24 [INFO] blastn -query hifiasm.contigs.fasta -db test2/NC_016067.1.fasta -num_threads 1 -out contigs.blastn -outfmt 6 std qlen slen 2024-03-04 03:33:24 [INFO] Blast done. 2024-03-04 03:33:24 [INFO] 5. Filtering BLAST output to select target sequences 2024-03-04 03:33:24 [INFO] Filtering thresholds applied: 2024-03-04 03:33:24 [INFO] Minimum query percentage = 50 2024-03-04 03:33:24 [INFO] Minimum query length = 80% subject length 2024-03-04 03:33:24 [INFO] Maximum query length = 5 times subject length 2024-03-04 03:33:24 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file 2024-03-04 03:33:24 [INFO] 6. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s). 2024-03-04 03:33:24 [INFO] Annotation will be done using MitoFinder (default) 2024-03-04 03:33:25 [INFO] Working with contig ptg000001l 2024-03-04 03:33:25 [INFO] Started ptg000001l circularization 2024-03-04 03:33:26 [INFO] ptg000001l circularization done. Circularization info saved on ./potential_contigs/ptg000001l/ptg000001l.circularisationCheck.txt 2024-03-04 03:33:26 [INFO] Started ptg000001l (MitoFinder) annotation Traceback (most recent call last): File "src/mitohifi.py", line 565, in main() File "src/mitohifi.py", line 302, in main tRNA_ref = fetch.get_ref_tRNA() File "/data/bin/MitoHF2/MitoHiFi/src/fetch.py", line 48, in get_ref_tRNA reference_tRNA = max(tRNAs, key=tRNAs.get) ValueError: max() arg is an empty sequence

Please let me know how I can address this issue, I am very keen to use this tool for our pipelines. Thanks

baileyjc commented 3 months ago

Hello all,

Thank you providing this great package @marcelauliano! I was wondering if the issue is still persisting for the rest of you all. I ran into the same issue today. I am using singularity/3.8.3 on a cluster. My organism is a vertebrate so -o is 2 and all files for the script are located in output_dir. I ran the following code:

#!/bin/bash

# Assuming your files are in the current directory
output_dir="/my/path/4.created_mtDNA/"
mitofinder="/my/path/mitohifi/mitohifi_master.sif"

# Execute your processing script
      singularity run --home $output_dir \
        $mitofinder mitohifi.py \
        -r all_reads.fasta \
        -f AP018927.1.fasta \
        -g AP018927.1.gb \
        -o 2 -t 28

And received the same error as the OP mentioned as well as others:

No gbk.HiFiMapped.bam.filtered.assembled.[a/p]_ctg.gfa file(s).
            An error may have occurred when assembling reads with HiFiasm.

I downloaded MitoHifi using the following command to get the sif image: singularity pull docker://ghcr.io/marcelauliano/mitohifi:latest I did not run into any errors in downloading the package.

Please let me know if there is a solution for this issue and I apologize if it is an unforeseen error on my part.