Open rohank63 opened 3 years ago
Handle with or after #72.
@balajtimate: In this issue, please create a short list of all the strategies we discussed to improve the library source inference.
As both the library type and the orientation inference rely on the inferred library source, it's extremely important to improve the inference. The key points from #108 and other discussions:
hsapiens, mmusculus, athaliana, drerio, rnorvegicus, zmays, mmulatta, scerevisiae, osativa, btaurus, sscrofa, celegans, ggallus
ecoli, so add the RP genes from Ensembl Bacteria

Thanks! To clarify: What exactly do you mean by "This should focus" in 2. What is "This" and how to make "this" focus on just the listed organisms?
I meant that adding more genes (other than the RP genes) should focus on the 15 most common organisms, to get greater precision in the lib source inference for those organisms (at least).
Thanks. Any concrete ideas what such a strategy could look like? I mean, how to find genes that are broadly conserved while at the same time maximizing the difference between the most common orgs? I don't really see how to start with such an exercise. Or were you suggesting to not care about the conservation beyond the most common organisms at all? And then maybe have a 2-stage process: look first at the broadly conserved (current) genes and then, based on the results for that, pick another subset of genes for better resolution?
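
To make the 2-stage idea a bit more concrete, here is a rough sketch of what the control flow could look like. It is only an illustration and not how the tool currently works: the function names, the `min_ratio` and `keep` thresholds, and the `stage2_counter` callback (which would count reads against a hypothetical set of discriminating genes for the shortlisted organisms) are all assumptions.

```python
from typing import Callable, Dict, List, Optional


def top_candidates(
    counts: Dict[str, int], min_ratio: float = 2.0, keep: int = 3
) -> List[str]:
    """Return a single organism if it clearly wins, else a shortlist.

    `min_ratio` and `keep` are arbitrary illustrative thresholds.
    """
    ranked = sorted(counts, key=counts.get, reverse=True)
    if not ranked or counts[ranked[0]] == 0:
        return []  # no evidence at all
    if len(ranked) == 1 or counts[ranked[1]] == 0:
        return ranked[:1]
    if counts[ranked[0]] / counts[ranked[1]] >= min_ratio:
        return ranked[:1]  # clear winner
    return ranked[:keep]  # ambiguous: keep the best few


def infer_source(
    stage1_counts: Dict[str, int],
    stage2_counter: Callable[[List[str]], Dict[str, int]],
    min_ratio: float = 2.0,
) -> Optional[str]:
    """Two-stage inference: conserved genes first, then a refining subset.

    `stage1_counts` holds per-organism read counts for the broadly
    conserved (current) genes. `stage2_counter` is a hypothetical callback
    that counts reads against extra genes chosen for the shortlist only.
    """
    candidates = top_candidates(stage1_counts, min_ratio)
    if len(candidates) <= 1:
        return candidates[0] if candidates else None
    refined = top_candidates(stage2_counter(candidates), min_ratio)
    return refined[0] if len(refined) == 1 else None
```

With this shape, the second gene set only needs to separate organisms that the first stage confuses, so it would not have to be conserved beyond the shortlisted organisms.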