Kinggerm / GetOrganelle

Organelle Genome Assembly Toolkit (Chloroplast/Mitochondrial/ITS)
GNU General Public License v3.0

animal_nr #136

Open dleopold opened 2 years ago

dleopold commented 2 years ago

I am wondering if there is a reason there is no animal_nr seed database. If I want to assemble rRNA gene regions for animals, would I need to build my own? Or is GetOrganelle not a good tool for this?

Kinggerm commented 2 years ago

Thanks for reaching out! This is a good question because no one has requested it yet, which is also a quick answer to your question. I would not expect animal_nr to be very different. We can put this on the schedule if you want.

We originally designed GetOrganelle, as the name says, for organelle genome assembly. We added plant_nr simply because we are plant guys. Some of my colleagues who study fungi then requested fungus_nr, and they verified that the fungus_nr database I made, for a group I was unfamiliar with, is helpful. GetOrganelle can potentially be efficient for calling any high-copy regions in WGS data, so I further added an anonym mode for more diverse purposes. That mode can serve as your temporary solution if, as usual, I cannot make the update in a timely manner.

Another excuse for not having explored nuclear ribosomal RNA further is the issue of incomplete concerted evolution, which can be painful for many taxa. We did not even include it in our GetOrganelle paper. We could be motivated if more people are interested, though.

dleopold commented 2 years ago

It would be great to add animal_nr to GetOrganelle. I will probably give the anonym approach a try sometime soon, but would be happy to contribute to the overall project. If I develop a database that works, is there a mechanism to compile the results into a shareable database that could be used by others?

Kinggerm commented 2 years ago

It would be great if you want to contribute an animal_nr database to the community.

GetOrganelleDB is where GetOrganelle pulls the default database and where you may want to fork and share the database. For this, I just added a section https://github.com/Kinggerm/GetOrganelleDB#how-to-contribute to the GetOrganelleDB repository.

gacsinger commented 2 years ago

I can confirm that using -F anonym with custom -s and --genes reference files has worked pretty well in my particular case. It would be terrific if this were added as a standard feature.
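
For anyone wanting to try the same approach, here is a minimal sketch of how such a run might be assembled. It only builds and prints the command line; it does not run GetOrganelle. The flags -F anonym, -s and --genes are the ones mentioned above, and -1, -2 and -o are the usual paired-read and output options of get_organelle_from_reads.py. All file names and the helper function name are hypothetical placeholders, not files or code from this thread.

```python
import shlex


def build_anonym_command(fq1, fq2, seed_fasta, genes_fasta, out_dir):
    """Return the argument list for a GetOrganelle run in anonym mode.

    seed_fasta  -> passed to -s (custom seed sequences)
    genes_fasta -> passed to --genes (custom label database)
    All paths are supplied by the caller; nothing is checked or executed here.
    """
    return [
        "get_organelle_from_reads.py",
        "-1", fq1,              # forward reads
        "-2", fq2,              # reverse reads
        "-F", "anonym",         # no built-in database; use the custom files below
        "-s", seed_fasta,       # custom seed
        "--genes", genes_fasta, # custom label database
        "-o", out_dir,          # output directory
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_anonym_command(
        "sample_1.fq.gz",
        "sample_2.fq.gz",
        "animal_nr_seed.fasta",
        "animal_nr_label.fasta",
        "animal_nr_output",
    )
    # Print a shell-quoted command line that can be inspected before running.
    print(shlex.join(cmd))
```

The returned list can be passed to subprocess.run(cmd, check=True) once the paths point at real files. Which sequences make a good seed and label database for animal nuclear ribosomal regions is not settled in this thread, so that choice is left open here.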

timz0605 commented 2 years ago

Hello, I was wondering about the same issue and, as an animal person, I would love to see updates with animal_nr in the future!

timz0605 commented 2 years ago

@gacsinger Hello, I am wondering what fasta file you are using as the seed and what file you are using as the label database?

max-baer commented 1 year ago

Hi people, I wanted to ask whether there has been any update since @gacsinger's last comment. Would you still use the anonym mode to assemble ribosomal gene regions in animal WGS data, or has some sort of standard feature been implemented?