Closed sjspielman closed 2 months ago
Just realized that I never weighed in on this when asked: I think having calculate_clusters()
work on matrices and objects is just fine. The extract function is already going to work with Seurat
and SingleCellExperiment
objects, so I don't see much cost to applying the same kinds of logic to look for matrix objects as well!
I also expect that the "conversion" cost will be quite small compared to the clustering itself, so while I would also write the sweep function to extract first, I would not worry that somebody else might not!
Closed by #758 and #759
If you are filing this issue based on a specific GitHub Discussion, please link to the relevant Discussion.
https://github.com/AlexsLemonade/OpenScPCA-analysis/discussions/727 & https://github.com/AlexsLemonade/OpenScPCA-analysis/discussions/731
Describe the goals of the changes to the analysis module.
This issue tracks adding a function to extract and return a PCA matrix, with barcode row names, from a given object, which could be either an SCE or Seurat object. I plan to have the function determine directly if the object is SCE or Seurat rather than using an additional argument. This function should be flexible enough for users to be able to specify the name of the PCA matrix in the given object, but we can still have a default.
As I alluded to in this comment, I am currently thinking this should be a helper function (that may not be exported? though i suppose can't hurt?) that the
calculate_clusters()
function (currently being written in #749) can use. For that, we'd need to updatecalculate_clusters()
to take an object instead of a matrix and call the helper function to convert it.A benefit to this alternative approach is users only would need to use
calculate_clusters()
rather than having to extract the matrix themselves, which may be more user-friendly. But, thinking towards the future, we'll also want to be wrapping thecalculate_clusters()
function to be able to sweep parameters (noting that we want to sweep ourselves, not withbluster::clusterSweep()
, since we want to ensure the seed is set fresh for each parameter iteration), and we'd only like to convert once for that. So, perhaps we allow for either a matrix or object to be passed intocalculate_clusters()
(I don't love the lack of consistency there but I am not strictly opposed to it)? Or, just keep this function separate and users can extract themselves.Tagging @jashapiro for any thoughts here!
What will your pull request contain?
Function and tests.
Will you require additional software beyond what is already in the analysis module?
N/A
Will you require different computational resources beyond what the analysis module already uses?
N/A
If known, when do you expect to file the pull request?
This sprint