Flu09 opened this issue 6 months ago
Hi @Flu09, either approach should be fine and all downstream operations should work regardless of the choice you make.
saveRDS() is probably the easiest approach; just remember that the BPCells object will store the absolute path to its input files, so if those files are moved or deleted the RDS object won't be able to find the data.
Alternatively, you can merge multiple matrices into a single file by calling write_matrix_dir(), which will write a new matrix directory to disk. This can be preferable if you have a large number of samples (to improve performance), or if you want to be able to more easily copy the matrix files to a new computer.
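For illustration, a rough sketch of that second option (the paths and object names below are hypothetical, and it assumes the per-sample matrices share the same genes in the same order):

library(BPCells)

# Open two per-sample matrices previously written with write_matrix_dir()
mat1 <- open_matrix_dir(dir = "/tmp/object1")
mat2 <- open_matrix_dir(dir = "/tmp/object2")

# Concatenate the cells and write the result to a single matrix directory;
# write_matrix_dir() returns a matrix object backed by the new directory
combined <- write_matrix_dir(cbind(mat1, mat2), dir = "/tmp/combined")

You could then point the merged Seurat object's counts layer at combined (assuming the cell order matches) and saveRDS() the object as usual, so the saved RDS only depends on the single /tmp/combined directory.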
Hope that helps -Ben
Thanks. May I ask about another issue? I feel it might be related more to Seurat.
I ran into this error when integrating. Is something wrong with the structure of the object?
merged <- NormalizeData(merged)
merged[["RNA"]] <- split(merged[["RNA"]], f = merged$dataset)
merged <- JoinLayers(merged)
merged <- FindVariableFeatures(merged)
merged <- SketchData(object = merged, ncells = 10000, method = "LeverageScore", sketched.assay = "sketch")
DefaultAssay(merged) <- "sketch"
merged <- FindVariableFeatures(merged, verbose = F)
merged <- ScaleData(merged, verbose = F)
merged <- RunPCA(merged, verbose = F)
merged <- IntegrateLayers(
  object = merged, method = FastMNNIntegration,
  new.reduction = "integrated.mnn",
  verbose = TRUE
)

Converting layers to SingleCellExperiment
Running fastMNN
Error in validObject(.Object) : invalid class "ScaledMatrix" object: the supplied seed must support extract_array()
This error looks like it's more on the Seurat side. From the error message, I think Seurat must be calling a function that expects to receive a DelayedArray object but is instead getting a BPCells object passed to it.
I think the underlying Bioconductor package batchelor, with its function fastMNN, is probably the limiting factor here -- the algorithm would probably work if passed the PCA dimensions directly, but it doesn't seem to support that kind of input, which is why it gets passed a BPCells matrix in the first place (which it then doesn't know how to deal with). It's possible the Seurat folks could come up with a workaround, but there's no good way I'm aware of to fix this from the BPCells side.
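If you want to experiment, one untested idea (just a sketch, not something I can promise works) is to pull the small sketched assay into memory as standard dgCMatrix layers before integration, so the downstream code never sees a BPCells-backed seed:

# Untested sketch: materialize the sketched assay in memory
# (BPCells matrices can be coerced with as(); this trades memory for compatibility)
merged[["sketch"]]$counts <- as(merged[["sketch"]]$counts, "dgCMatrix")
merged[["sketch"]]$data <- as(merged[["sketch"]]$data, "dgCMatrix")

merged <- IntegrateLayers(
  object = merged, method = FastMNNIntegration,
  new.reduction = "integrated.mnn",
  verbose = TRUE
)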
Thank you. I want to ask again about saving the RDS object.
What I understood is that we could move the BPCells folder to a new location, so I moved the contents from /tmp/obj2 to /tmp/tmp/obj2:
counts.mat.obj2 <- open_matrix_dir(dir = "/tmp/tmp/obj2")
counts.mat.obj2
obj2[["RNA"]]$counts <- counts.mat.obj2
markers <- FindMarkers(obj2, ident.1 = 23, ident.2 = 3)
Error: Missing directory: /tmp/obj2
Another question I have is how to save the metadata of a BPCells object and then read it back if needed.
From your example, I think the problem is that you had already normalized your Seurat object prior to moving the BPCells folder. Therefore, the data layer (obj2[["RNA"]]$data) will still be a BPCells object that points to the old directory. It is possible to manually adjust that object as well, using all_matrix_inputs(), e.g.:
all_matrix_inputs(obj2[["RNA"]]$data) <- list(open_matrix_dir(dir="/tmp/tmp/obj2"))
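To double-check which files each layer references before and after the change, the getter form of the same function can be used (a sketch based on the paths above):

# List the on-disk inputs each layer currently points to;
# after the re-assignment both should report /tmp/tmp/obj2
all_matrix_inputs(obj2[["RNA"]]$counts)
all_matrix_inputs(obj2[["RNA"]]$data)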
This can be somewhat error-prone if you are merging together several different data sources. In general I would recommend not moving the data for a BPCells-based project more than you absolutely have to. (Note that BPCells doesn't modify files on disk unless you explicitly call write_matrix_dir() with overwrite=TRUE, so multiple objects can read from the same data source without interfering with each other.)
As for saving metadata, BPCells itself doesn't handle very much metadata, just row names and column names for matrices. Most of Seurat's metadata is handled by Seurat itself and doesn't get put into BPCells, so in that case a normal saveRDS() should suffice to store and load your Seurat object.
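For example (a trivial sketch, with a hypothetical output path):

# Cell metadata, idents, reductions, etc. travel inside the RDS file;
# only the raw matrix data stays in the BPCells directory on disk
saveRDS(obj2, "/tmp/obj2_seurat.rds")
obj2 <- readRDS("/tmp/obj2_seurat.rds")
head(obj2@meta.data)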
I had multiple Seurat objects; I wrote each one to disk and then merged them. How do I save this merged object? Do I just saveRDS() it, or do I write it to a matrix directory first?
write_matrix_dir(mat = object1[["RNA"]]$counts, dir = '/tmp/object1')
counts.mat.object1 <- open_matrix_dir(dir = "/tmp/object1")
counts.mat.object1
object1[["RNA"]]$counts <- counts.mat.object1

write_matrix_dir(mat = object2[["RNA"]]$counts, dir = '/tmp/object2')
counts.mat.object2 <- open_matrix_dir(dir = "/tmp/object2")
counts.mat.object2
object2[["RNA"]]$counts <- counts.mat.object2

merged <- merge(object1, c(object2, object3, object4))