Open rob-p opened 7 years ago
Hi Rob,
I don't have any specific suggestions, but we usually assume that most people are going to run the analysis with the default parameters, so we try to lean towards setting the parameters that are most likely to give the most accurate results, even if that incurs some cost in speed and whatnot. I'd bet the most common analysis people are doing is salmon -> tximport -> DESeq2, so defaults that attempt to maximize the accuracy of that pipeline would be awesome.
I like the opt-in structure; I think keeping models simple by default is generally a good way of doing things.
I know people just run programs without reading the documentation and expect them to work perfectly. But I think somewhere, maybe even in the default quantification help, there should be a table or decision tree with information about how to choose options. E.g.
"Did you do random hexamer priming? -> Use the --seqBias option."
Hi all,
I just wanted to provide this space to start a discussion and get feedback about what people believe to be the most sensible default settings for Salmon (in different modes). We're happy to discuss any suggestions, but can start with some specific questions. Here is the most basic one. Right now, Salmon has an "opt-in" philosophy. That is, a default run starts with the most basic features, and users opt in to anything that has non-trivial cost (e.g. Gibbs sampling, bias correction, and even things that have close to trivial cost but may not always be useful, like dumping the equivalence classes to disk). Perhaps some of these defaults should be reconsidered, or perhaps this philosophy makes sense as long as the "opt-in" behavior is made clear? It's worth noting that one current benefit of this "opt-in" mentality is that defaults are more consistent among data types. For example, GC-bias modeling for single-end libraries is still a feature in testing (on the develop branch), and so could not reasonably be made default at the current time.
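
To make the decision-table suggestion from this thread concrete, here is a minimal Python sketch of how a few yes/no questions could map onto opt-in flags. This is only an illustration and not part of Salmon. The rule keys, question wording, and the `suggest_flags` helper are hypothetical. The flags `--seqBias`, `--gcBias`, and `--dumpEq` are existing Salmon options, but which question should trigger which flag is an assumption here, apart from the hexamer-priming example given in the thread.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    key: str        # hypothetical short identifier for the question
    question: str   # wording is illustrative, not taken from Salmon's help
    flag: str       # opt-in flag suggested when the answer is "yes"


# Hypothetical rule table; only the first pairing comes from the thread itself.
RULES = [
    Rule("hexamer_priming", "Did you do random hexamer priming?", "--seqBias"),
    # Assumption: only offered for paired-end data, since single-end GC-bias
    # modeling is described in the thread as still in testing.
    Rule("paired_end_gc_bias", "Paired-end library with suspected GC bias?", "--gcBias"),
    Rule("want_eq_classes", "Do you need the equivalence classes on disk?", "--dumpEq"),
]


def suggest_flags(answers: dict[str, bool]) -> list[str]:
    """Return the opt-in flags for every question answered 'yes'.

    Unanswered questions count as 'no', which mirrors the opt-in
    philosophy: nothing is enabled unless the user asks for it.
    """
    return [rule.flag for rule in RULES if answers.get(rule.key, False)]


if __name__ == "__main__":
    for rule in RULES:
        print(f"{rule.question} -> Use the {rule.flag} option.")
    print(suggest_flags({"hexamer_priming": True, "want_eq_classes": True}))
```

Keeping the rules in a plain table means the same data could be printed as help text or used to assemble a command line, so the documentation and the behavior would be less likely to drift apart.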