Comparisons between datasets are potentially misleading. Differences may be due to differences in:
sequencing depth - easy to report; easy to use.
quality of annotation/clustering - we have no real way to judge this at present. In future we could potentially look at comparisons to split bulk and some kind of Blast between cell types in different datasets (=> some measure of consistency). In the absence of these, it is enough to report the paper.
origin of cells - sex, tissue etc. Report by dataset or by cell set?
number of cells (can we give a ballpark for what would be a low number that might skew stats?)
Users need a quick way to visualise gene expression comparisons when filtered down to a small number of genes by function --> aggregate and sort
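A rough sketch of what "aggregate and sort" could look like, assuming expression is available as per-cell records of (cell type, gene, level). The record layout, the names `type_A`/`geneA`, the values, and the choice of the mean as the aggregate are all illustrative assumptions, not a description of any existing schema or agreed method.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-cell records: (cell_type, gene, expression level).
# Names and values are made up for illustration only.
records = [
    ("type_A", "geneA", 4.0),
    ("type_A", "geneA", 6.0),
    ("type_A", "geneB", 1.0),
    ("type_B", "geneA", 1.0),
    ("type_B", "geneB", 3.0),
    ("type_B", "geneB", 5.0),
]


def aggregate_and_sort(records, genes_of_interest):
    """Mean expression per (cell type, gene), restricted to a gene set,
    sorted from highest to lowest mean. Also reports the number of cells
    behind each mean so that small groups are visible."""
    levels = defaultdict(list)
    for cell_type, gene, level in records:
        if gene in genes_of_interest:
            levels[(cell_type, gene)].append(level)
    rows = [
        (cell_type, gene, mean(vals), len(vals))
        for (cell_type, gene), vals in levels.items()
    ]
    return sorted(rows, key=lambda row: row[2], reverse=True)


summary = aggregate_and_sort(records, {"geneA", "geneB"})
for cell_type, gene, mean_level, n_cells in summary:
    print(f"{cell_type}\t{gene}\tmean={mean_level:.2f}\tn={n_cells}")
```

Carrying the cell count alongside each mean keeps the "number of cells" caveat visible in the same table the user sorts.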
User starting points:
some cell type names
individual cells from a connectomics query
Given a set of cell types, which datasets are available that have data on all cell types, at what granularity and how many cells for each type?
If multiple datasets have data on the cell types of interest, report: sequencing depth, tissue inputs, number of cells, publication
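A minimal sketch of the dataset-availability query and report, assuming per-dataset metadata is held as simple records. Every field name, dataset name and number here is a placeholder; in particular, using median reads per cell as the sequencing depth measure is an assumption, and annotation granularity is not modelled.

```python
# Hypothetical dataset metadata; all names and numbers are placeholders.
datasets = [
    {
        "name": "dataset_1",
        "publication": "paper_1",
        "tissue": "tissue_X",
        "median_reads_per_cell": 50000,  # assumed measure of sequencing depth
        "cells_per_type": {"type_A": 120, "type_B": 8, "type_C": 40},
    },
    {
        "name": "dataset_2",
        "publication": "paper_2",
        "tissue": "tissue_Y",
        "median_reads_per_cell": 20000,
        "cells_per_type": {"type_A": 300, "type_C": 15},
    },
]


def datasets_covering(datasets, cell_types):
    """Return one report row per dataset that has cells for every requested type."""
    report = []
    for ds in datasets:
        counts = ds["cells_per_type"]
        if all(counts.get(ct, 0) > 0 for ct in cell_types):
            report.append(
                {
                    "dataset": ds["name"],
                    "publication": ds["publication"],
                    "tissue": ds["tissue"],
                    "median_reads_per_cell": ds["median_reads_per_cell"],
                    "cells_per_type": {ct: counts[ct] for ct in cell_types},
                }
            )
    return report


report = datasets_covering(datasets, ["type_A", "type_C"])
for row in report:
    print(row)
```

With these placeholder records, asking for `type_A` and `type_B` instead would return only `dataset_1`, and its count of 8 cells for `type_B` is the kind of low number the report should make obvious.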
Query to compare expression between specified cell types on a chosen dataset - option to aggregate on
The problems: