beyondpie closed this 6 months ago
Currently the only backend is hdf5, so setting this parameter has no effect. subset will subset the AnnData in place. If your AnnData is in read-only mode, setting out=None will cause an error. Do you want to keep the AnnData subset in memory?
Hi Kai,
Thanks for the reply! In my usage, the raw data is quite large, so I typically load it with backed set to 'r'. Then I can take a subset of it and load that subset into memory for downstream analysis. I will use a pipeline tool to process different parts of the AnnData in parallel.
- I think subset could support both in-place and not in-place operation. People may want to use part of the data without affecting the raw data.
- I mainly use the subset of the data in memory. I may also save that part to another file, but this is less common. Currently, I have to save the subset somewhere and load it into memory later.
- I often confuse 'backed' and 'backend'. Sorry about that.
Thanks! Songpeng
That makes sense! I'll modify subset to add this functionality.
Implemented: https://kzhang.org/SnapATAC2/version/dev/api/_autosummary/snapatac2.AnnData.subset.html#snapatac2.AnnData.subset
A nightly release will be automatically built and released tomorrow.
@kaizhang Hi Kai,
I updated SnapATAC2 to 2.6.
When I use the subset function, I get the error below.
thread 'Result::unwrap() on an Err value: H5Ldelete(): unable to delete link: no write intent on file
note: run with RUST_BACKTRACE=1 environment variable to display a backtrace
Here is how I run a typical subset command:
sub_ann = ann_fm.subset(
    obs_indices=cellmeta.exp.isin([ee]).to_list(),
    out=None,
)
Here ann_fm is loaded using snapatac2.read with backed set to 'r'.
Do you have any suggestions? Thanks! Songpeng
You need to add inplace=False. This is necessary as "out" is now used to indicate whether the new AnnData should be backed or not.
@kaizhang https://kzhang.org/SnapATAC2/api/_autosummary/snapatac2.AnnData.subset.html#snapatac2.AnnData.subset It seems there is no inplace parameter? Also, I notice that some API pages link to the source code (though the link may be broken) and some do not. Songpeng
This feature exists only in the nightly version: https://kzhang.org/SnapATAC2/version/dev/api/_autosummary/snapatac2.AnnData.subset.html
@kaizhang Oh, I see. I thought it was already in the latest stable version. Thanks, Kai. Songpeng
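For readers landing here later, the fix discussed above can be sketched as a small wrapper. This is only an illustration, not the library's own code: the helper name subset_into_memory is hypothetical, and it assumes a backed AnnData whose subset method accepts obs_indices, out and inplace as described in this thread for the nightly build.

```python
def subset_into_memory(adata, obs_mask):
    """Return a new subset of `adata` without touching its backing file.

    Hypothetical helper. `adata` is assumed to expose
    `subset(obs_indices=..., out=..., inplace=...)` as described in this
    thread for the nightly build.
    """
    # inplace=False requests a new object instead of rewriting the source,
    # so a file opened with backed='r' is never written to.
    # out=None gives no output file; per the thread, `out` indicates
    # whether the new AnnData should be backed or not.
    return adata.subset(obs_indices=list(obs_mask), out=None, inplace=False)


# Intended usage, following the snippet earlier in the thread
# (the file name is a placeholder):
#   ann_fm = snapatac2.read("data.h5ad", backed="r")
#   sub_ann = subset_into_memory(ann_fm, cellmeta.exp.isin([ee]))
```

On a stable release without the inplace parameter, this call would presumably fail with an unexpected-keyword error, so it only applies once the nightly change is installed.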
Hi!
I passed the parameter out="src/myfolder/myfile.h5ad", but what I observe is that a new file "myfile.h5ad.h5ad" (".h5ad" twice) was created in src instead. Has anyone got the out argument working properly?
Thanks a lot
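While this is unresolved, a quick way to see where the output actually landed is to scan the directory tree. The sketch below uses only the Python standard library and makes no assumption about why the suffix is doubled; the helper names find_outputs and has_doubled_suffix are hypothetical, and the paths in the usage comment are the ones from the post above.

```python
from pathlib import Path


def find_outputs(root, stem, suffix=".h5ad"):
    """List files under `root` whose name starts with `stem` and contains `suffix`."""
    root = Path(root)
    return sorted(
        p for p in root.rglob(f"{stem}*")
        if p.is_file() and suffix in p.name
    )


def has_doubled_suffix(path, suffix=".h5ad"):
    """Report whether the file name ends with the suffix repeated twice."""
    return Path(path).name.endswith(suffix * 2)


# Usage after calling subset(..., out="src/myfolder/myfile.h5ad"):
#   for p in find_outputs("src", "myfile"):
#       print(p, has_doubled_suffix(p))
```

This only reports what was written; it does not change how the out argument is interpreted.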
Hi Kai,
If I set the out parameter of AnnData subset to None, I get an error. Also, what is the meaning of ~backend~ in that function? Can I use "r" to save memory and "r+" to read the subset into memory?
Thanks! Songpeng