EI-CoreBioinformatics / mikado

Mikado is a lightweight Python3 pipeline whose purpose is to facilitate the identification of expressed loci from RNA-Seq data and to select the best models in each locus.
https://mikado.readthedocs.io/en/stable/
GNU Lesser General Public License v3.0

[Daijin] Allow for Diamond instead of BLASTX #93

Closed lucventurini closed 8 years ago

lucventurini commented 8 years ago

I just had another look at DIAMOND. Some time ago they added the option of producing XML output instead of only tabular. I have to say I am hugely impressed: it aligned ~20,000 sequences in less than three minutes for my Mikado run.

Given the purpose for which we use BLAST (i.e. finding chimeras), we do not need super-accurate results. Giving users the option of running DIAMOND instead of BLASTX might cut the processing time massively, leaving TransDecoder as the only choke point of the pipeline.

lucventurini commented 8 years ago

Update - now Daijin supports DIAMOND. Caveats:

maplesond commented 8 years ago

Cool. I found results to be very different between BLAST and diamond last year but agree the runtime is attractive. If diamond is behaving sufficiently well for mikado then this is great. I think dynamic selection might be tricky due to controlling all the additional options. It's best to leave this in the users' hands.

lucventurini commented 8 years ago

I have to verify that last bit about the results (yesterday I was just trying to make it work, period). The problem I have at the moment is that I have no way of choosing at runtime which tool to use. There is the ruleorder: key in the Snakemake file, but it is unclear how to modify it dynamically.

lucventurini commented 8 years ago

This mail exchange seems to suggest that it should be possible to obtain the desired behaviour with an if-else statement and a boolean variable in the configuration dictionary. This would be great!

https://groups.google.com/forum/#!searchin/snakemake/ambiguous%7Csort:relevance/snakemake/qX7RfXDTDe4/rfsF3UrQAAAJ
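
For illustration only, here is a minimal sketch of the if-else idea in plain Python, not the code that ended up in Daijin. The configuration key `use_diamond` and the function `alignment_command` are hypothetical names invented for this example. The command-line flags are the standard ones for `blastx` and `diamond blastx`, with output format 5 (XML) in both cases.

```python
def alignment_command(config, query, database, output, threads=1):
    """Build the argument list for the protein alignment step.

    `config` is a plain dictionary; the boolean key "use_diamond" is a
    hypothetical name used only for this sketch.
    """
    if config.get("use_diamond", False):
        # DIAMOND: long options, database built with `diamond makedb`.
        return [
            "diamond", "blastx",
            "--query", query,
            "--db", database,
            "--outfmt", "5",          # 5 = BLAST XML
            "--out", output,
            "--threads", str(threads),
        ]
    # Default: NCBI BLAST+ blastx, also writing XML (outfmt 5).
    return [
        "blastx",
        "-query", query,
        "-db", database,
        "-outfmt", "5",
        "-out", output,
        "-num_threads", str(threads),
    ]


if __name__ == "__main__":
    cmd = alignment_command({"use_diamond": True}, "chunk_1.fasta", "uniprot", "chunk_1.xml", threads=4)
    print(" ".join(cmd))
```

In a Snakefile the same boolean could guard two alternative rule definitions in an if/else block, so that only one of them is ever registered and no ruleorder: is needed.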

lucventurini commented 8 years ago

If-else behaviour implemented, and accessible through an undocumented option in the configuration file. Closing for now, at least until Diamond solves its own output problems.