Open SamGG opened 7 months ago
Thanks for this! I agree that this is a weak point. I think the most suitable place for this would actually be in `.check_k` here, independent of `plotDiffHeatmap`. Specifically, all functions accepting a `k` argument typically call `k <- .check_k(sce, k)` and then use `cluster_ids(sce, k)` to retrieve the corresponding assignments. So, `.check_k` could include a warning when multiple clusterings match `k`, as you suggested - agreed? I'll try to implement this soon, and thank you kindly again.
Thanks for your feedback. Here are my thoughts, probably not crystal clear.
`y` diffcyt object with the levels of the `x` CATALYST object. This goal could not be addressed by `.check_k` because `.check_k` works only on `x`.
`.check_k` and `cluster_ids`
functions, it's now clear to me that it is not possible to add a column to `cluster_codes`. `cluster_codes` and `cluster_ids` are intimately linked. The aim of `cluster_codes` is to store the hierarchical merging of a high-resolution clustering into lower-resolution clusterings. There can be only one high-resolution clustering in `cluster_codes`. If I want to add a clustering (a high-resolution clustering, e.g. the result of k-means), I need to replace `cluster_ids` AND `cluster_codes` together. That's what you wrote in section 8.2 of the vignette, but I read it too quickly. So, I think that:
`.check_k`.
We can discuss this offline if you want. Sorry for my misunderstanding.
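
To make the link between `cluster_codes` and `cluster_ids` described above concrete, here is a minimal toy sketch in base R. It does not use CATALYST itself: the data frame `codes`, the vector `cell_ids`, the column names and the helper `lookup_ids()` are all hypothetical stand-ins, assumed only for illustration.

```r
# Toy stand-in for cluster_codes: one row per high-resolution cluster,
# one column per clustering (column names are hypothetical).
codes <- data.frame(
  som9  = factor(1:9),
  meta3 = factor(c(1, 1, 1, 2, 2, 2, 3, 3, 3)),
  meta2 = factor(c(1, 1, 1, 1, 1, 1, 2, 2, 2))
)

# Toy stand-in for the per-cell IDs: only the high-resolution
# assignment is stored for each cell.
cell_ids <- factor(c(1, 4, 4, 7, 9, 2), levels = levels(codes$som9))

# Hypothetical helper: retrieve a lower-resolution clustering by mapping
# each cell's high-resolution ID through the chosen column of the codes.
lookup_ids <- function(cell_ids, codes, k) {
  stopifnot(k %in% names(codes))
  codes[[k]][match(cell_ids, codes[[1]])]
}

merged <- lookup_ids(cell_ids, codes, "meta3")
```

In this toy layout every column is indexed by the rows of the first (high-resolution) column, which is why an unrelated clustering such as a k-means result cannot simply be appended as a new column: it would have to replace both the per-cell IDs and the codes table.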
IIUC, when multiple columns of `cluster_codes` could match the levels of `cluster_id`, the first one is selected silently. https://github.com/HelenaLC/CATALYST/blob/f3e294ed9a4d3f300feb994bb381df6a6b2c8309/R/plotDiffHeatmap.R#L140-L151
IMO, if `sum(same) > 1`, a warning must be raised to show the selected column/clustering, or an error must be raised to force the user to select the clustering column (`k` argument).
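
A rough sketch of the behaviour proposed above, in base R. This is not the code from `plotDiffHeatmap`: the helper `match_clustering()`, its `on_ambiguity` argument and the toy `codes` table are hypothetical, and it only assumes that candidate clusterings are factor columns compared by their levels.

```r
# Hypothetical helper: find the clustering column(s) whose levels are
# identical to the cluster IDs found in a differential-testing result.
match_clustering <- function(codes, result_levels,
                             on_ambiguity = c("warn", "stop")) {
  on_ambiguity <- match.arg(on_ambiguity)
  same <- vapply(codes, function(u) identical(levels(u), result_levels),
                 logical(1))
  if (!any(same))
    stop("No clustering matches the cluster IDs of the result.")
  hits <- names(codes)[same]
  if (length(hits) > 1) {
    msg <- sprintf("Multiple clusterings match (%s)",
                   paste(hits, collapse = ", "))
    # Either force the user to choose, or report which column is used.
    if (on_ambiguity == "stop")
      stop(msg, "; please specify 'k'.")
    warning(msg, "; using '", hits[1], "'.")
  }
  hits[1]
}

# Toy codes table (hypothetical): two columns share the levels "1", "2".
codes <- data.frame(
  som4  = factor(1:4),
  meta2 = factor(c(1, 1, 2, 2)),
  alt2  = factor(c(1, 2, 2, 1))
)

# Warns about the ambiguity and falls back to the first match.
k <- match_clustering(codes, c("1", "2"))
```

With `on_ambiguity = "stop"` the same call fails instead, which corresponds to the stricter option of forcing the user to pass `k` explicitly.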