RemiAllio / MitoFinder

MitoFinder: efficient automated large-scale extraction of mitogenomic data from high throughput sequencing data

Not finding nucleotide sequence in reference file #48

Closed: teaganme closed this issue 1 year ago

teaganme commented 1 year ago

Hello! I am running MitoFinder on Linux, using the Bioconda install. All the program files load fine, but then I get this error in the log: "ERROR: MitoFinder didn't found any nucleotide sequence in the reference(s) file(s)." I downloaded the reference from GenBank (https://www.ncbi.nlm.nih.gov/nuccore/OX608057.1), and the file looks to be in GenBank format. I checked for any odd whitespace characters, on the off chance that my upload to the super changed something, but couldn't find anything. I'm sure I am missing something simple, but I am fairly new to all of this, so any help would be much appreciated! Thank you~ Teagan

RemiAllio commented 1 year ago

Hi Teagan,

Thank you for contacting me!

First of all, a small warning: I know that there is a conda environment available, but I didn't create it, so I don't know whether it works properly. I would recommend that you use the Singularity container of MitoFinder. This is by far the easiest way to avoid dependency errors.

Nevertheless, in your case, the reference is indeed in the correct format, but the mitochondrion is not annotated. MitoFinder looks for annotated genes and cannot find them. I would recommend using more than one reference, as described here, using "Asilinae" as the keyword. I quickly checked, and it seems that only two Asilinae mitogenomes are annotated:
- Clephydroneura sp.
- Satanas sp.

I've created the reference.gb file you can use in your case.

Let me know if it helps! Cheers, Rémi
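
For anyone hitting the same "didn't found any nucleotide sequence" error, it may help to check whether a downloaded GenBank file carries any gene annotations before passing it to MitoFinder. The following is a minimal sketch using only the Python standard library; it is not part of MitoFinder. The function names `count_features` and `looks_annotated` are hypothetical, and the set of feature keys treated as "annotation" (gene, CDS, tRNA, rRNA) is an assumption about what a usable reference contains, not a statement of what MitoFinder reads internally.

```python
import re

# In a GenBank flat file, feature keys start at column 6 (five leading
# spaces) and are followed by a location; qualifier lines such as
# /gene="..." are indented further, so they do not match this pattern.
FEATURE_KEY = re.compile(r"^ {5}(\S+)\s+\S")


def count_features(genbank_text):
    """Return a dict mapping each feature key (gene, CDS, ...) to its count."""
    counts = {}
    in_features = False
    for line in genbank_text.splitlines():
        if line.startswith("FEATURES"):
            in_features = True
        elif line.startswith(("ORIGIN", "CONTIG", "//")):
            # The feature table ends where the sequence (or record) ends.
            in_features = False
        elif in_features:
            match = FEATURE_KEY.match(line)
            if match:
                key = match.group(1)
                counts[key] = counts.get(key, 0) + 1
    return counts


def looks_annotated(genbank_text):
    """True if at least one gene-like feature is present (assumed key set)."""
    counts = count_features(genbank_text)
    return any(counts.get(key, 0) > 0 for key in ("gene", "CDS", "tRNA", "rRNA"))
```

To use it, read the file with `open("reference.gb").read()` and pass the text to `looks_annotated`. A record that only has a `source` feature, as appears to be the case for the unannotated reference discussed above, would return `False`.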

teaganme commented 1 year ago

Thank you, I don't know how I missed the annotation requirement, but I'll be sure to move forward with these. With that change, I'm now getting the "ERROR: Gene named "C0X1" in the reference file(s) is not recognized by MitoFinder" which I see someone else had already brought up, and was solved by the Singularity container. My institution seems hesitant on Singularity, but I'll bother them some more and see if we can come to a solution. Thank you!

teaganme commented 1 year ago

Hi again! So, I was able to do a workaround for Singularity, since my institution won't allow it on the system. Essentially, we followed the Linux instructions (installing with autoconf and automake, then adding to PATH), then used mamba to install MitoFinder as an environment. I am still getting the "ERROR: Gene named "X" in the reference file(s) is not recognized" error, and I'm using the reference you provided. Any insights? I attached my log as well. 2005_MitoFinder.log

RemiAllio commented 1 year ago

Hi,

Before digging into it, could you please try replacing the first line of mitofinder (#!/usr/bin/python) with "#!" followed by the path to python2 or python2.7 on your system? You can find the path using which python2.7.

Sorry for the inconvenience, Rémi
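
The shebang change suggested above can also be scripted. This is a minimal sketch under stated assumptions: the helper names `find_python2` and `set_shebang` are hypothetical, and the script path you pass in is whatever location the `mitofinder` executable has on your system. It only touches the first line of the file and refuses to edit a file that does not start with a shebang.

```python
import shutil


def find_python2():
    """Return the path to python2.7 or python2 on PATH, or None if absent."""
    return shutil.which("python2.7") or shutil.which("python2")


def set_shebang(script_path, interpreter_path):
    """Replace the first line of a script with '#!' + interpreter_path."""
    with open(script_path, "r") as handle:
        lines = handle.readlines()
    if not lines or not lines[0].startswith("#!"):
        raise ValueError("no shebang on the first line of %s" % script_path)
    lines[0] = "#!" + interpreter_path + "\n"
    with open(script_path, "w") as handle:
        handle.writelines(lines)
```

A typical call would be `set_shebang(path_to_mitofinder, find_python2())`, after checking that `find_python2()` did not return `None`. Keeping a backup copy of the script first is a sensible precaution.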

teaganme commented 1 year ago

Just edited that, and alas no change. I got the same error: "ERROR: Gene named "C0X1" in the reference file(s) is not recognized by MitoFinder."
No worries, thank you so much for your help!

RemiAllio commented 1 year ago

Ok, this one makes sense! The problem comes from the name of the gene. In the reference file, cytochrome c oxidase subunit 1 is named C0X1 (with a zero) instead of COX1. I've never seen that before. MitoFinder doesn't know this synonym for COX1, which is why you got the error. You should be able to work around this issue by just renaming C0X1 to COX1 in the reference file!

Let me know if it works that way, Best, Rémi
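
The rename described above can be done in any text editor. For larger reference files, a short script may be more convenient. The following is a minimal sketch, not part of MitoFinder: the function name `rename_gene` is hypothetical, and it assumes the misspelled name appears as a whole word (for example inside a `/gene="..."` qualifier), so that other names such as COX2 are left alone.

```python
import re


def rename_gene(genbank_text, wrong="C0X1", right="COX1"):
    """Replace whole-word occurrences of `wrong` with `right`.

    Returns a (new_text, number_of_replacements) tuple so the caller can
    confirm that something was actually changed.
    """
    pattern = re.compile(r"\b" + re.escape(wrong) + r"\b")
    return pattern.subn(right, genbank_text)
```

Read the reference with `open("reference.gb").read()`, pass the text through `rename_gene`, and write the result to a new file so the original stays available for comparison. A replacement count of zero would suggest the misspelling is not where the error comes from.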

teaganme commented 1 year ago

I got my first real output, and it all seems to be working fine now! Thank you so much!! :)