raymondlouie / MiniMarS


Claire's comment: how many markers to choose? #29

Open raymondlouie opened 1 year ago

raymondlouie commented 1 year ago

I’m a bit confused by the “minimum marker” aspect of the package. It seems like we arbitrarily select a number of markers that we want, e.g., 15 markers, but it’s unclear how this translates to the performance in identifying or classifying cells. How do we choose the correct number of markers to use? Because there isn’t a performance metric, how do we determine whether we should use 5, 15, or 50 markers? I think this is especially important for the protein data sets, as users might be trying to derive a gating strategy from the panel that could be used to identify these cells for downstream experiments, e.g., sorting.

raymondlouie commented 1 year ago

I could implement a separate function that sweeps through the number of markers and outputs the performance for each number?
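A rough sketch of what such a sweep could look like, written in Python purely for illustration. Everything here is an assumption rather than part of the package: the function names (`per_cluster_accuracy`, `sweep_marker_counts`), the nearest-centroid classifier used as a stand-in for whatever classifier the package would use, and the toy data. It assumes the markers have already been ranked by some selection method.

```python
from collections import defaultdict
from statistics import mean


def per_cluster_accuracy(train, test, markers):
    """Per-cluster accuracy of a nearest-centroid classifier using only `markers`.

    `train` and `test` are lists of (expression_dict, cluster_label) pairs.
    The classifier is a placeholder chosen to keep the sketch self-contained.
    """
    # Build one centroid per cluster from the training cells.
    rows_by_cluster = defaultdict(list)
    for expr, label in train:
        rows_by_cluster[label].append([expr[m] for m in markers])
    centroids = {
        label: [mean(col) for col in zip(*rows)]
        for label, rows in rows_by_cluster.items()
    }

    # Assign each test cell to the closest centroid and count correct calls.
    hits, totals = defaultdict(int), defaultdict(int)
    for expr, label in test:
        vec = [expr[m] for m in markers]
        predicted = min(
            centroids,
            key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])),
        )
        totals[label] += 1
        hits[label] += int(predicted == label)
    return {label: hits[label] / totals[label] for label in totals}


def sweep_marker_counts(ranked_markers, train, test, counts):
    """Evaluate the top-k ranked markers for every k in `counts`."""
    return {
        k: per_cluster_accuracy(train, test, ranked_markers[:k]) for k in counts
    }


def cell(cd3, cd19, cd14):
    """Helper for the made-up toy cells below."""
    return {"CD3": cd3, "CD19": cd19, "CD14": cd14}


# Toy data, invented for illustration only (three clusters, three markers).
train_cells = [
    (cell(5, 0, 0), "T"), (cell(6, 1, 0), "T"),
    (cell(0, 5, 0), "B"), (cell(1, 6, 0), "B"),
    (cell(0, 0, 5), "Mono"), (cell(2, 1, 6), "Mono"),
]
test_cells = [
    (cell(5, 1, 0), "T"),
    (cell(0.4, 5, 1), "B"),
    (cell(0.6, 0, 5), "Mono"),
]
ranked = ["CD3", "CD19", "CD14"]  # assumed output of a marker-ranking step

results = sweep_marker_counts(ranked, train_cells, test_cells, counts=[1, 2, 3])
for k, accuracy in results.items():
    print(k, accuracy)
```

In this toy case, a single marker cannot separate the B and Mono clusters, while two markers classify every test cell correctly. The per-cluster output is what would let a user judge how many markers are enough for the clusters they care about.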

anglixue commented 1 year ago

My understanding is that the minimum depends on (1) the number of clusters, (2) the prediction accuracy for each cluster (or for a specific cluster of interest to the user), and (3) the technical details of the gating strategy.

From a statistical perspective, we could report something like, "to achieve 90% median prediction accuracy across 14 cell types, at least 15 markers are needed".
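To make that kind of statement concrete, here is a minimal sketch in Python of how the minimum could be read off a table of per-cluster accuracies. The function name `min_markers_for_target`, its parameters, and the accuracy values are all hypothetical; the numbers are invented for illustration and are not results from the package.

```python
from statistics import median


def min_markers_for_target(accuracy_by_count, target=0.90, clusters=None):
    """Smallest marker count whose median per-cluster accuracy reaches `target`.

    `accuracy_by_count` maps a marker count to {cluster: accuracy}.
    `clusters` optionally restricts the median to clusters of interest.
    Returns None if no tested count reaches the target.
    """
    for k in sorted(accuracy_by_count):
        per_cluster = accuracy_by_count[k]
        selected = clusters if clusters is not None else list(per_cluster)
        if median(per_cluster[c] for c in selected) >= target:
            return k
    return None


# Invented accuracies for three clusters at three candidate panel sizes.
toy_accuracy = {
    5: {"A": 0.60, "B": 0.80, "C": 0.95},
    15: {"A": 0.88, "B": 0.92, "C": 0.97},
    50: {"A": 0.95, "B": 0.96, "C": 0.99},
}

print(min_markers_for_target(toy_accuracy))                  # all clusters
print(min_markers_for_target(toy_accuracy, clusters=["A"]))  # one cluster of interest
```

With these made-up numbers the answer is 15 markers when all clusters are considered, but 50 when only cluster A matters, which illustrates why the minimum depends on which clusters the user cares about.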