Can smoothed expression, potentially of just the desired marker genes, be used to define cohorts and produce better results?
As an alternative to (1), can we weight the rows of the profiles matrix to favor our preferred markers?
What about a workflow that runs semi-supervised clustering with 0 new clusters, i.e. uses anchor cells to update the profiles and then runs fully supervised? Is this possible with the current code, e.g. by running insitutype with n_clusts = 0? Does it work well?
Can we estimate platform effects and adjust the reference profiles accordingly?
Can we make anchor selection easier (supporting plots, summary stats, better rules)?
Can we automatically identify cell types that deserve to be sub-clustered?
Give better guidance on choosing n_clusts. BIC doesn't always work: sometimes it chooses far too many clusters. Can we use another statistic? (One suggestion: use the BIC elbow instead of the minimum.)
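
The BIC-elbow suggestion in the last item could be prototyped roughly as below. This is a minimal sketch in Python, not part of insitutype: the function name bic_elbow is hypothetical, the BIC values are made-up toy numbers, and the elbow rule used here (the point farthest from the chord joining the first and last points of the rescaled curve) is only one of several possible definitions.

```python
def bic_elbow(n_clusts, bic):
    """Return the candidate cluster count at the elbow of a BIC curve.

    Assumption: the elbow is the point farthest from the straight line
    joining the first and last points, after rescaling both axes to [0, 1].
    """
    if len(n_clusts) != len(bic) or len(bic) < 3:
        raise ValueError("need at least three (n_clusts, bic) pairs")

    def rescale(values):
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0  # avoid division by zero on a flat curve
        return [(v - lo) / span for v in values]

    xs, ys = rescale(n_clusts), rescale(bic)
    x0, y0, x1, y1 = xs[0], ys[0], xs[-1], ys[-1]

    # Distance of each point from the chord, up to a constant factor
    # (the chord length), which does not change which point is farthest.
    dists = [
        abs((y1 - y0) * x - (x1 - x0) * y + x1 * y0 - y1 * x0)
        for x, y in zip(xs, ys)
    ]
    best = max(range(len(dists)), key=dists.__getitem__)
    return n_clusts[best]


# Toy values for illustration only (not real results).
candidates = list(range(1, 9))
toy_bic = [1000, 700, 520, 450, 430, 420, 415, 412]

k_min = candidates[toy_bic.index(min(toy_bic))]  # plain BIC minimum
k_elbow = bic_elbow(candidates, toy_bic)         # elbow of the same curve
print(k_min, k_elbow)
```

On this toy curve the BIC keeps decreasing slowly, so the minimum picks the largest candidate, while the elbow rule stops where the improvement flattens out. Whether that behaviour carries over to real data is exactly the open question above.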