Open PeteHaitch opened 11 months ago
About to sleep, but the `rbind()` approach seems reasonable if you want the ranks to be comparable. It does come with some performance loss, because the current scran falls back to block processing (though this would not be a problem if it were refactored to use libscran). Otherwise, 2 is also fine, but it also requires some recomputation of the ranks.
Somewhat thinking out loud here, but I'm interested in your ideas.
For multimodal data (e.g., GEX and ADT), we might be interested in using both modalities simultaneously to define markers. I've been doing this by `rbind()`-ing the `logcounts()` of each modality (along with some tidying up of the rownames by prefixing the ADT feature names with `ADT`), and then running `scoreMarkers()` on that, but this requires allocating another (potentially large) matrix.

I guess I've got a few questions:

1. `rbind()` could be a delayed op, but I'm not sure when this would get realised by the scran machinery, so I'm unsure if this is worthwhile?
2. `applySCE(sce, scoreMarkers())` gets very close, but the `rank.*` statistics are then computed separately for each modality and so won't be the same as if they were computed jointly on all modalities (the other statistics yield identical results whether computed separately or jointly). Perhaps running `scoreMarkers(full.stats = TRUE)` and then re-computing the `rank.*` statistics with `computeMinRank()` applied to the `full.*` columns would work?
3. What would a `scoreMarkers()`/`findMarkers()` interface for multimodal data look like?
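
To make the `rank.*` point concrete, here is a minimal base-R sketch of why min-ranks computed per modality differ from min-ranks computed on the stacked per-comparison effect sizes. Everything in it is an assumption for illustration: the effect sizes are made up, the feature and comparison names are placeholders, and `min_rank()` is a hypothetical stand-in for what I understand `computeMinRank()` to do (rank each comparison in decreasing order of effect size, then take the smallest rank per feature). In practice the two matrices would come from the `full.*` columns of each modality's `scoreMarkers(full.stats = TRUE)` output.

```r
# Toy per-comparison effect sizes (rows = features, columns = comparisons).
# These stand in for one modality's full.* column each; values are made up.
gex <- matrix(c(0.9, 0.2, 0.5,    # comparison "vs.B"
                0.1, 0.8, 0.4),   # comparison "vs.C"
              nrow = 3,
              dimnames = list(c("GeneA", "GeneB", "GeneC"), c("vs.B", "vs.C")))
adt <- matrix(c(0.7, 0.3,         # comparison "vs.B"
                0.95, 0.6),       # comparison "vs.C"
              nrow = 2,
              dimnames = list(c("ADT_CD3", "ADT_CD4"), c("vs.B", "vs.C")))

# Hypothetical stand-in for computeMinRank(): rank features within each
# comparison in decreasing order of effect size, then take the minimum
# rank across comparisons for each feature.
min_rank <- function(effects) {
  ranks <- apply(-effects, 2, rank, ties.method = "min")
  apply(ranks, 1, min)
}

# Ranks computed separately per modality (what a per-modality run gives).
separate <- c(min_rank(gex), min_rank(adt))

# Ranks re-computed jointly after stacking the per-comparison effects.
joint <- min_rank(rbind(gex, adt))

print(data.frame(separate = separate, joint = joint))
```

Only the small per-comparison effect matrices are stacked here, not the expression matrices, so the joint ranks are recovered without allocating the combined `logcounts()` matrix. Whether this reproduces the `rank.*` values from a joint `scoreMarkers()` run exactly depends on scran using the same ranking and tie-handling rules, which I haven't verified.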