marcelauliano / MitoHiFi

Find, circularise and annotate mitogenome from PacBio assemblies
MIT License
169 stars 29 forks source link

problem with get_ref_tRNA #75

Closed AlcaArctica closed 9 months ago

AlcaArctica commented 9 months ago

I have a problem with the provided example: I first obtain the reference (which is different then in the READ.me, but that shouldnt matter, I think):

MitoHiFi/src/findMitoReference.py --species "Deilephila porcellus" --outfolder . --min_length 14000
Looking for mitochondrion for Deilephila porcellus
Looking for an appropriate organelle among Deilephila porcellus
Looking for an appropriate organelle among Deilephila
Looking for an appropriate organelle among Macroglossini;
output is written to ./NC_057303.1.[gb,fasta]

and then try to run mitohifi with the example fasta file:

python MitoHiFi/src/mitohifi.py -r ilDeiPorc1.reads.100.fa -f NC_057303.1.fasta -g NC_057303.1.gb -t 4 -o 5 
2024-02-14 10:30:09 [INFO] Welcome to MitoHifi v2. Starting pipeline...
2024-02-14 10:30:09 [INFO] Length of related mitogenome is: 15312 bp
2024-02-14 10:30:09 [INFO] Number of genes on related mitogenome: 37
2024-02-14 10:30:09 [INFO] Running MitoHifi pipeline in reads mode...
2024-02-14 10:30:09 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome
2024-02-14 10:30:09 [INFO] minimap2 -t 4 --secondary=no -ax map-hifi NC_057303.1.fasta ilDeiPorc1.reads.100.fa | samtools view -@ 4 -b -F4 -F 0x800 -o reads.HiFiMapped.bam
2024-02-14 10:30:10 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS
2024-02-14 10:30:10 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format:
2024-02-14 10:30:10 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta
2024-02-14 10:30:10 [INFO] Total number of mapped reads: 97
2024-02-14 10:30:10 [INFO] 2.2 Then we filter reads that are larger than 15312 bp
2024-02-14 10:30:10 [INFO] Number of filtered reads: 87
2024-02-14 10:30:10 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads!
2024-02-14 10:30:10 [INFO] hifiasm --primary -t 4 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta
cat: gbk.HiFiMapped.bam.filtered.assembled.a_ctg.fa: No such file or directory
2024-02-14 10:30:11 [INFO] 4. Let's run the blast of the contigs versus the close-related mitogenome
2024-02-14 10:30:11 [INFO] 4.1. Creating BLAST database:
2024-02-14 10:30:11 [INFO] makeblastdb -in NC_057303.1.fasta -dbtype nucl
2024-02-14 10:30:18 [INFO] Makeblastdb done.
2024-02-14 10:30:18 [INFO] 4.2. Running blast of contigs against close-related mitogenome:
2024-02-14 10:30:18 [INFO] blastn -query hifiasm.contigs.fasta -db NC_057303.1.fasta -num_threads 4 -out contigs.blastn -outfmt 6 std qlen slen
2024-02-14 10:30:25 [INFO] Blast done.
2024-02-14 10:30:25 [INFO] 5. Filtering BLAST output to select target sequences
2024-02-14 10:30:25 [INFO] Filtering thresholds applied:
2024-02-14 10:30:25 [INFO] Minimum query percentage = 50
2024-02-14 10:30:25 [INFO] Minimum query length = 80% subject length
2024-02-14 10:30:25 [INFO] Maximum query length = 5 times subject length
2024-02-14 10:30:25 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file
2024-02-14 10:30:25 [INFO] 6. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s).
2024-02-14 10:30:25 [INFO] Annotation will be done using MitoFinder (default)
2024-02-14 10:30:25 [INFO] Working with contig ptg000001l
2024-02-14 10:30:25 [INFO] Started ptg000001l circularization
2024-02-14 10:30:41 [INFO] ptg000001l circularization done. Circularization info saved on ./potential_contigs/ptg000001l/ptg000001l.circularisationCheck.txt
2024-02-14 10:30:41 [INFO] Started ptg000001l (MitoFinder) annotation
Traceback (most recent call last):
  File "/lustre/projects/dazzler/uelze/sw/MitoHiFi/src/mitohifi.py", line 566, in <module>
    main()
  File "/lustre/projects/dazzler/uelze/sw/MitoHiFi/src/mitohifi.py", line 303, in main
    tRNA_ref = fetch.get_ref_tRNA()
  File "/lustre/projects/dazzler/uelze/sw/MitoHiFi/src/fetch.py", line 48, in get_ref_tRNA
    reference_tRNA = max(tRNAs, key=tRNAs.get)
ValueError: max() arg is an empty sequence

How do I solve this error?

marcelauliano commented 9 months ago

MitoFinder doesn't seem to be properly installed.

Em qua., 14 de fev. de 2024 às 09:36, Laura Uelze @.***> escreveu:

I have a problem with the provided example: I first obtain the reference (which is different then in the READ.me, but that shouldnt matter, I think):

MitoHiFi/src/findMitoReference.py --species "Deilephila porcellus" --outfolder . --min_length 14000 Looking for mitochondrion for Deilephila porcellus Looking for an appropriate organelle among Deilephila porcellus Looking for an appropriate organelle among Deilephila Looking for an appropriate organelle among Macroglossini; output is written to ./NC_057303.1.[gb,fasta]

and then try to run mitohifi with the example fasta file:

python MitoHiFi/src/mitohifi.py -r ilDeiPorc1.reads.100.fa -f NC_057303.1.fasta -g NC_057303.1.gb -t 4 -o 5 2024-02-14 10:30:09 [INFO] Welcome to MitoHifi v2. Starting pipeline... 2024-02-14 10:30:09 [INFO] Length of related mitogenome is: 15312 bp 2024-02-14 10:30:09 [INFO] Number of genes on related mitogenome: 37 2024-02-14 10:30:09 [INFO] Running MitoHifi pipeline in reads mode... 2024-02-14 10:30:09 [INFO] 1. First we map your Pacbio HiFi reads to the close-related mitogenome 2024-02-14 10:30:09 [INFO] minimap2 -t 4 --secondary=no -ax map-hifi NC_057303.1.fasta ilDeiPorc1.reads.100.fa | samtools view -@ 4 -b -F4 -F 0x800 -o reads.HiFiMapped.bam 2024-02-14 10:30:10 [INFO] 2. Now we filter out any mapped reads that are larger than the reference mitogenome to avoid NUMTS 2024-02-14 10:30:10 [INFO] 2.1 First we convert the mapped reads from BAM to FASTA format: 2024-02-14 10:30:10 [INFO] samtools fasta reads.HiFiMapped.bam > gbk.HiFiMapped.bam.fasta 2024-02-14 10:30:10 [INFO] Total number of mapped reads: 97 2024-02-14 10:30:10 [INFO] 2.2 Then we filter reads that are larger than 15312 bp 2024-02-14 10:30:10 [INFO] Number of filtered reads: 87 2024-02-14 10:30:10 [INFO] 3. Now let's run hifiasm to assemble the mapped and filtered reads! 2024-02-14 10:30:10 [INFO] hifiasm --primary -t 4 -f 0 -o gbk.HiFiMapped.bam.filtered.assembled gbk.HiFiMapped.bam.filtered.fasta cat: gbk.HiFiMapped.bam.filtered.assembled.a_ctg.fa: No such file or directory 2024-02-14 10:30:11 [INFO] 4. Let's run the blast of the contigs versus the close-related mitogenome 2024-02-14 10:30:11 [INFO] 4.1. Creating BLAST database: 2024-02-14 10:30:11 [INFO] makeblastdb -in NC_057303.1.fasta -dbtype nucl 2024-02-14 10:30:18 [INFO] Makeblastdb done. 2024-02-14 10:30:18 [INFO] 4.2. Running blast of contigs against close-related mitogenome: 2024-02-14 10:30:18 [INFO] blastn -query hifiasm.contigs.fasta -db NC_057303.1.fasta -num_threads 4 -out contigs.blastn -outfmt 6 std qlen slen 2024-02-14 10:30:25 [INFO] Blast done. 2024-02-14 10:30:25 [INFO] 5. Filtering BLAST output to select target sequences 2024-02-14 10:30:25 [INFO] Filtering thresholds applied: 2024-02-14 10:30:25 [INFO] Minimum query percentage = 50 2024-02-14 10:30:25 [INFO] Minimum query length = 80% subject length 2024-02-14 10:30:25 [INFO] Maximum query length = 5 times subject length 2024-02-14 10:30:25 [INFO] Filtering BLAST finished. A list of the filtered contigs was saved on ./contigs_filtering/contigs_ids.txt file 2024-02-14 10:30:25 [INFO] 6. Now we are going to circularize, annotate and rotate each filtered contig. Those are potential mitogenome(s). 2024-02-14 10:30:25 [INFO] Annotation will be done using MitoFinder (default) 2024-02-14 10:30:25 [INFO] Working with contig ptg000001l 2024-02-14 10:30:25 [INFO] Started ptg000001l circularization 2024-02-14 10:30:41 [INFO] ptg000001l circularization done. Circularization info saved on ./potential_contigs/ptg000001l/ptg000001l.circularisationCheck.txt 2024-02-14 10:30:41 [INFO] Started ptg000001l (MitoFinder) annotation Traceback (most recent call last): File "/lustre/projects/dazzler/uelze/sw/MitoHiFi/src/mitohifi.py", line 566, in main() File "/lustre/projects/dazzler/uelze/sw/MitoHiFi/src/mitohifi.py", line 303, in main tRNA_ref = fetch.get_ref_tRNA() File "/lustre/projects/dazzler/uelze/sw/MitoHiFi/src/fetch.py", line 48, in get_ref_tRNA reference_tRNA = max(tRNAs, key=tRNAs.get) ValueError: max() arg is an empty sequence

How do I solve this error?

— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/75, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5RHIGN7AVZI4FSUAMOTYTSAQRAVCNFSM6AAAAABDH6KXE6VHI2DSMVQWIX3LMV43ASLTON2WKOZSGEZTGOJRGY2DENQ . You are receiving this because you are subscribed to this thread.Message ID: @.***>

-- Marcela Uliano da Silva, PhD

Senior Bioinformatician - Wellcome Sanger Institute

Darwin Tree of Life Project (DToL)

Churchill College Postdoctoral By-Fellow, University of Cambridge

Cambridge, UK

AlcaArctica commented 9 months ago

Hmm, it is a fresh pull of mitohifi (git clone https://github.com/marcelauliano/MitoHiFi.git), and I am running it in the recommended conda env. Is there anything else I am overlooking? I am running MitoHiFi 3.2.1

marcelauliano commented 9 months ago

Please refer to Readme 2.2 You need to install mitofinder separately:)

Marcela Uliano da Silva, PhD

Senior Bioinformatician - Wellcome Sanger Institute

Darwin Tree of Life Project (DToL)

Churchill College Postdoctoral By-Fellow, University of Cambridge

Cambridge, UK

On Wed, 14 Feb 2024 at 10:29, Laura Uelze @.***> wrote:

Hmm, it is a fresh pull of mitohifi (git clone https://github.com/marcelauliano/MitoHiFi.git), and I am running it in the recommended conda env. Is there anything else I am overlooking?

— Reply to this email directly, view it on GitHub https://github.com/marcelauliano/MitoHiFi/issues/75#issuecomment-1943477515, or unsubscribe https://github.com/notifications/unsubscribe-auth/AA7M5RHJXJ4YCTSZ763ECA3YTSGZJAVCNFSM6AAAAABDH6KXE6VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSNBTGQ3TONJRGU . You are receiving this because you commented.Message ID: @.***>

AlcaArctica commented 9 months ago

Yes, indeed. I am sorry for overlooking this. Now I installed mitofinder and added it to path the error is gone.