Closed lazappi closed 1 month ago
I would like to do the test/example stuff, but it requires a working Seurat converter. I think maybe it's easier to merge this first? The vignette would be nice but can be done later.
- HDF5AnnData? The read/write functions handle going to/from files and I can't think of a reason to interact with one of these objects directly
- AnnData() function I think is the same as in the R {anndata} package. I'm not sure how much of an issue that is but maybe we should avoid clashing just in case?
@lazappi Just to confirm, after what we discussed, do you agree with the following?
- AnnData() will return an InMemoryAnnData (default) or an HDF5AnnData if the user really wants to. It should be noted that this is only for users who know what they're doing.
- adata$to_SingleCellExperiment() or adata$to_InMemoryAnnData(), but it would be nice if this is also possible with as(adata, "SingleCellExperiment").
- (InMemoryAnnData and HDF5AnnData) and internal functions (from_* and to_*) are not exported.
- read_h5ad() and write_h5ad() to write their data from/to .h5ad files
I'd like to merge this PR because I agree that it'd be nice to clean up our list of exports. Would you be able to make the changes to AnnData()?
Yes, I think so. I'll try to work on it, assuming I can work out how to implement as() properly.
Ok, so using as() probably won't work for us because there is no way to provide additional arguments (which I think we need for things like setting which assay should be X).
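To make the limitation concrete, here is a minimal sketch using toy S4 classes. ToyExperiment, ToyAnnData, toy_to_anndata() and the x_assay argument are all hypothetical stand-ins, not anything from this package. A method registered with setAs() only receives the object being converted, so the choice of assay has to be hard-coded, whereas an ordinary function can take it as an argument.

```r
library(methods)

# Toy stand-ins for the real classes (hypothetical, for illustration only)
setClass("ToyExperiment", representation(assays = "list"))
setClass("ToyAnnData", representation(X = "matrix"))

# A coercion method registered with setAs() only receives `from`,
# so it has to hard-code which assay becomes X.
setAs("ToyExperiment", "ToyAnnData", function(from) {
  new("ToyAnnData", X = from@assays[[1]])
})

# An ordinary function can expose the choice as an argument.
toy_to_anndata <- function(from, x_assay = 1) {
  new("ToyAnnData", X = from@assays[[x_assay]])
}

toy <- new("ToyExperiment", assays = list(
  counts = matrix(1:4, nrow = 2),
  logcounts = matrix(log1p(1:4), nrow = 2)
))

via_as <- as(toy, "ToyAnnData")                        # always the first assay
via_fun <- toy_to_anndata(toy, x_assay = "logcounts")  # caller picks the assay
```

The second form is roughly what the to_*/from_* style functions give us, at the cost of not plugging into as().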
Possible alternatives:
- AnnData() for going SCE/Seurat -> AnnData but only has the adata$to_* interface for the reverse
- to_*/from_* functions directly (which is what I was trying to avoid)
- S3 type interface (not sure on the exact design)
> There is no way to provide additional arguments
Good point, I hadn't considered that.
> Something else
Would you be ok with splitting it up into:
AnnData <- function(
obs_names = NULL,
var_names = NULL,
X = NULL,
obs = NULL,
var = NULL,
obsm = NULL,
varm = NULL,
obsp = NULL,
varp = NULL,
uns = NULL,
output_class = c("InMemoryAnnData", "HDF5AnnData"),
...
)
And
as_AnnData <- function(
obj,
output_class = c("InMemoryAnnData", "HDF5AnnData"),
...
)
?
This way, as_AnnData will return the in-memory one by default, which can be used to write to disk using write_h5ad. I do like having the conversion separate from the regular constructor, because otherwise we would need to include code to make sure that obj and the [obs_names, var_names, X, varm, obsm, ...] arguments are mutually exclusive, when they might as well just be split into two different functions.
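A rough sketch of how the split could behave, using plain lists and S3 class attributes instead of the real classes. toy_AnnData() and toy_as_AnnData() are hypothetical names, and the assumption that the input is a list holding a cells x genes counts matrix is only for illustration. The point is that match.arg() picks the first choice when output_class is left at its default, so both functions return the in-memory flavour unless asked otherwise.

```r
# Hypothetical sketch: toy constructor and converter, not the package's code.
toy_AnnData <- function(X = NULL, obs_names = NULL, var_names = NULL,
                        output_class = c("InMemoryAnnData", "HDF5AnnData")) {
  # match.arg() returns the first choice when the argument is left at its default
  output_class <- match.arg(output_class)
  structure(
    list(X = X, obs_names = obs_names, var_names = var_names),
    class = c(output_class, "ToyAnnData")
  )
}

toy_as_AnnData <- function(obj,
                           output_class = c("InMemoryAnnData", "HDF5AnnData"),
                           ...) {
  output_class <- match.arg(output_class)
  # `obj` is assumed to be a list holding a cells x genes `counts` matrix;
  # a real converter would inspect an SCE or Seurat object instead.
  toy_AnnData(
    X = obj$counts,
    obs_names = rownames(obj$counts),
    var_names = colnames(obj$counts),
    output_class = output_class
  )
}

counts <- matrix(
  1:6, nrow = 2,
  dimnames = list(c("cell1", "cell2"), c("g1", "g2", "g3"))
)
adata <- toy_as_AnnData(list(counts = counts))
hdf5 <- toy_as_AnnData(list(counts = counts), output_class = "HDF5AnnData")
```

Because the converter only takes obj, there is nothing to check for mutual exclusivity with the constructor arguments.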
Yeah, that could work. What about the reverse direction?
I'm not sure what you mean by the reverse direction.
Oh, you mean when we want to convert an AnnData to SCE or Seurat? We can call adata$to_SingleCellExperiment() and adata$to_Seurat(). Is this what you mean?
I can't remember exactly but I think so, yes. I think we discussed this and I wanted to avoid exposing these directly, but I can't remember all the details.
I'm going to port these changes to a different branch because there are too many conflicts by now (sorry about that!)
In summary, I will:
- AnnData() to create InMemoryAnnData's (and convert them to something else)
- read_h5ad() to open an h5ad file as an HDF5AnnData
- ~as_AnnData() to convert a Seurat object or a SingleCellExperiment.~
- from_Seurat() to convert from a Seurat object, and use from_SingleCellExperiment() to convert from a SingleCellExperiment.
I'm now thinking that instead of as_AnnData(), the names from_Seurat and from_SingleCellExperiment would better explain what the purpose of those functions is, given that adata$to_Seurat() and adata$to_SingleCellExperiment() exist. In addition, it would allow parameters that are Seurat-specific and parameters that are SingleCellExperiment-specific.
Objects that will be removed from the NAMESPACE:
At some point in the future, we should create a separate roxygen doc or a vignette to explain what the different possible conversions are.
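As a starting point for that doc, a minimal sketch of what source-specific parameters could look like. Everything here is hypothetical: the toy_ functions, the assay and x_assay parameter names, and the plain lists standing in for Seurat and SingleCellExperiment objects. Both source formats store genes x cells matrices, so each converter transposes to the cells x genes layout AnnData uses.

```r
# Hypothetical sketch: plain lists stand in for Seurat / SCE objects.
toy_from_Seurat <- function(seurat_obj, assay = "RNA") {
  # `assay` is a Seurat-specific choice (assumed parameter name)
  list(X = t(seurat_obj$assays[[assay]]), source = "Seurat")
}

toy_from_SingleCellExperiment <- function(sce, x_assay = "counts") {
  # `x_assay` selects which SCE assay becomes X (assumed parameter name)
  list(X = t(sce$assays[[x_assay]]), source = "SingleCellExperiment")
}

# 3 genes x 2 cells, the orientation used by both source formats
genes_by_cells <- matrix(
  1:6, nrow = 3,
  dimnames = list(c("g1", "g2", "g3"), c("cell1", "cell2"))
)

from_seurat <- toy_from_Seurat(list(assays = list(RNA = genes_by_cells)))
from_sce <- toy_from_SingleCellExperiment(
  list(assays = list(counts = genes_by_cells, logcounts = log1p(genes_by_cells))),
  x_assay = "logcounts"
)
```

Keeping the two functions separate means each signature only lists arguments that make sense for its input type.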
Tidy the user interface to reduce the exported functions to ones we think users should see.
Changes:
- AnnData() to accept SCE/Seurat (replacing from_* functions)
- generate_dataset()
Todo:
- AnnData() with SCE/Seurat input
- read_h5ad()/write_h5ad() tests