Closed: erinyoung closed this issue 3 months ago
Hmmm, that's an interesting one. It half fits the scope, but I'm a little wary of the assembly bit. @nf-core/taxprofiler what do you think? (and @maxibor ?)
I am not sure what category this tool would fall into. It seems a bit specific to me, and I agree regarding the assembly part.
So to me basically it:
I think conceptually this would actually fit. Just rather than short-read alignment or kmer-comparison, it does 'long-read' comparison to a database (the main difference is that it generates the 'long reads' itself).
Given that this is the first time we see this request, maybe it'd make sense for @erinyoung to adapt the taxprofiler pipeline for their purposes as a proof-of-concept, and then we decide if/how to adopt it?
What do you mean by PoC - as in make a fork, add it, and see if it makes sense?
I think conceptually it does what we want (I just need to check the output); it's just outside our typical concept of direct kmer comparison or alignment of reads.
I just had a quick look: @erinyoung does the tool produce an OTU/taxon-like table as output at all? I tried to look through and couldn't find anything like that. The closest thing to a table was a listing of gene loci rather than species.
By PoC, I meant adding the modules and making the data-flow adjustments needed to get the pipeline working for that purpose, yes.
I created an nf-core module for GetOrganelle (https://github.com/nf-core/modules/pull/4484). The output is a FASTA file with either complete or partial organelle/plasmidome sequences.
Thanks @erinyoung !
So if the output of the module is simply FASTA files, I don't consider that in scope for taxprofiler, as that means it is just an assembler.
However I saw there is this utility function: https://github.com/Kinggerm/GetOrganelle/wiki/Usage#summary_get_organelle_outputpy
Depending on what the output of that looks like, this may sort of make it fit.
My apologies, but I've encountered other priorities. I may get back to this issue at a later date, but am closing it for now.
Description of feature
GetOrganelle is a great tool for identifying organelles (like mitochondria).
My use case is sequencing from a mosquito pool, and mitochondria can be more effective for identifying the blood source.
The command for my use-case is something like
Although GetOrganelle has more features and use cases (https://github.com/Kinggerm/GetOrganelle#recipes).
There are some nuances with this tool, though (for example, this should ideally be after host removal, but the MT sequences can't be filtered out with the host removal).
Right now, there's not an nf-core module for GetOrganelle, but I can put one together.