Dear @mourisl
Thank you for your interest in our tool, and for taking the time to thoroughly evaluate it against other approaches. We will try, in kind, to be as helpful as we can.
In order, for your questions:
Regarding transcript abundance: our approach relies on either calculating the abundance after the prepare step (e.g. with kallisto) or gathering the abundance data manually. The data can then be fed into mikado serialise as an "external scores" samplesheet. Please see our documentation for details on the process.
Please note that, as explained in that section of the documentation, Mikado will only use the abundances provided this way if the scoring file is amended accordingly.
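As a rough illustration of the "gathering the abundance data" step, here is a minimal Python sketch that copies transcript IDs and TPMs from a kallisto `abundance.tsv` (whose `target_id` and `tpm` columns are standard kallisto output) into a two-column tab-separated table. The output layout, with a `tid` column followed by one score column, is an assumption on my part and should be checked against the Mikado documentation on external scores; the function name `kallisto_to_scores` is hypothetical.

```python
import csv
import io


def kallisto_to_scores(abundance_tsv, out_handle, id_column="tid"):
    """Copy transcript IDs and TPMs from a kallisto abundance table
    into a two-column TSV. Returns the number of transcripts written.

    ASSUMPTION: the "tid" header and the two-column layout are a guess at
    what the external scores table should look like, not a confirmed spec.
    """
    reader = csv.DictReader(abundance_tsv, delimiter="\t")
    writer = csv.writer(out_handle, delimiter="\t", lineterminator="\n")
    writer.writerow([id_column, "tpm"])
    count = 0
    for row in reader:
        # kallisto reports one row per transcript, keyed by target_id
        writer.writerow([row["target_id"], float(row["tpm"])])
        count += 1
    return count


# Tiny in-memory example standing in for a real abundance.tsv
example = (
    "target_id\tlength\teff_length\test_counts\ttpm\n"
    "t1\t1000\t800\t50\t12.5\n"
    "t2\t500\t300\t0\t0\n"
)
out = io.StringIO()
n_written = kallisto_to_scores(io.StringIO(example), out)
print(out.getvalue())
```

With real data, the two `io.StringIO` objects would be replaced by open file handles for the kallisto output and the table to be listed in the samplesheet.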
I also found a typo in https://mikado.readthedocs.io/en/latest/Tutorial/#mikado-pick: Should it be "--subloci-out" instead of "--subloci_out"?
Yes, absolutely, thank you for pointing it out!
I would just add that, while you can run Mikado without the additional Portcullis, BLAST or ORF files, it is not really advised and not really the point of the tool. At the very least, the ORF files should be used.
@lucventurini Thank you for providing the details and documentation of the scoring file. I will modify the file accordingly and will let you know the results. @swarbred Thank you for the explanation. I actually tried a BLAST file, but it did not affect the result. I will try the ORF file.
Dear @mourisl
We should have solved it in #354. We will update the documentation accordingly but, in a nutshell, you can now add scoring parameters of the form "attributes.{MY_PARAMETER}".
So, for example, for TPMs stored in the attribute tpm (case sensitive):
attributes.tpm:
- default: 0
- rescaling: max
- rtype: float
- use_raw: false
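To make the meaning of these keys a little more concrete, here is a small Python sketch of what such an entry could amount to: read a case-sensitive `tpm` attribute from a GTF attribute string, fall back to the default of 0 when it is missing, convert it to a float, and rescale the values so that the highest TPM scores best. This is not Mikado's code. The function names `get_attribute` and `rescale_max` are hypothetical, and treating `rescaling: max` as min-max scaling is an assumption about the behaviour.

```python
import re


def get_attribute(attributes, name, default=0.0):
    """Return a numeric GTF attribute (e.g. tpm "12.5") as a float.

    The lookup is case sensitive; if the attribute is absent, the
    default is returned instead (mirroring `default: 0`, `rtype: float`).
    """
    pattern = r'(?:^|;\s*)' + re.escape(name) + r'\s+"?([^";]+)"?'
    match = re.search(pattern, attributes)
    return float(match.group(1)) if match else float(default)


def rescale_max(values):
    """Min-max rescale so the largest value gets 1.0 and the smallest 0.0.

    ASSUMPTION: this is only a guess at what `rescaling: max` means.
    When all values are equal, this sketch simply returns 1.0 for each.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


gtf_attributes = [
    'gene_id "g1"; transcript_id "t1"; tpm "10";',
    'gene_id "g1"; transcript_id "t2"; tpm "5";',
    'gene_id "g1"; transcript_id "t3"; TPM "7";',  # wrong case: falls back to default
]
tpms = [get_attribute(a, "tpm") for a in gtf_attributes]
scores = rescale_max(tpms)
print(tpms, scores)
```

The third transcript shows why case matters: its attribute is spelled `TPM`, so the lookup for `tpm` fails and the default is used.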
I hope this helps.
Thank you for developing Mikado. After developing CLASS2, we also developed PsiCLASS (https://github.com/splicebox/PsiCLASS) to assemble transcripts for a single RNA-seq sample or for multiple RNA-seq samples simultaneously. When processing multiple RNA-seq samples, PsiCLASS also reports a consensus meta GTF file from all the samples; the current strategy is simply to vote based on the abundances. This voting strategy is powerful and outperformed other available mergers, but it is pretty naive, and we hope a more sophisticated approach could give better results. Though Mikado was not designed to merge multiple-sample GTFs, it can take multiple GTF files as input. I just gave it a try on simulated data from 25 samples. Mikado reported decent results with minimal input (just the sample-wise GTF files, no BLAST, no ORF), and the results were already slightly better than TACO. Therefore I plan to explore the potential of Mikado a bit more, and have several questions:
I also found a typo in https://mikado.readthedocs.io/en/latest/Tutorial/#mikado-pick: Should it be "--subloci-out" instead of "--subloci_out"?
Thanks, Li