Arcadia-Science / prehgt

A pipeline for lightweight screening of Eukaryotic genomes and transcriptomes for recent HGT
MIT License

Genomes from other than genbank/ncbi #52

Open Nimshika opened 1 year ago

Nimshika commented 1 year ago

Is there a way to incorporate genomes into the pipeline that aren’t from genbank/refseq?

taylorreiter commented 1 year ago

not at the moment, but that's definitely an emerging need! I can make a note to prioritize this type of integration.

Can you give an example of how you would want this to work? for example, do you have a genome with gene models already? do you only want to run the pipeline on that genome, or would you also like to search refseq/genbank for other genomes of the same genus to include as well? Anything else that might give me pointers on the best way to integrate this would be much appreciated!

Nimshika commented 1 year ago

Thanks for getting back so soon - it would be great to have that feature! We have genomes that we’ve sequenced and assembled, and would like to include those (maybe with an optional flag that specifies the location of a local directory containing the assemblies, or something along those lines?) along with other genomes from refseq/genbank within the same genus, and also to pull in other relevant genera.

taylorreiter commented 1 year ago

ok great, this is very helpful context. Do you already have predicted ORFs for your genomes? Are they bacteria/archaea or eukaryotes?

Nimshika commented 1 year ago

We do not have predicted ORFs for the genomes, which include bacteria and yeasts.

taylorreiter commented 1 year ago

ok thank you! I'll think more about how to include this and post any updates here!

Nimshika commented 1 year ago

Any updates on this?

taylorreiter commented 1 year ago

Hi @Nimshika! While this is on our list, it isn't a priority for us at the moment, so it will probably be some time until we can make the adjustments to allow for this. I'm sorry!