dynverse / anndata

Annotated multivariate observation data in R
https://anndata.dynverse.org

Removing metadata columns #31

Open dennishamrick opened 1 month ago

dennishamrick commented 1 month ago

Hello!

I have a large h5ad file named asbrain2.h5ad, which I downloaded from CellxGene. It can be found here: https://datasets.cellxgene.cziscience.com/427090ae-5e21-421f-b178-e66a930ca63c.h5ad

I load it into R using the anndata package (full list of metadata columns truncated by me):

> brain2
AnnData object with n_obs × n_vars = 1915592 × 1122
    obs: 'organism_ontology_term_id', 'donor_id', 'development_stage_ontology_term_id', 'sex_ontology_term_id', 'major_brain_region', 'cell_type'... (29 obs_keys in total)
    var: 'gene_name', 'feature_is_filtered', 'feature_name'... (6 var_keys in total)
    uns: 'citation', 'schema_reference'... (4 uns_keys in total)
    obsm: 'X_CCF', 'X_spatial_coords', 'X_umap'

Now, my question: is there a way in R, using the anndata package, to delete the metadata columns I don't need all at once?

brain2$obs$organism_ontology_term_id <- NULL

will delete that column, or whichever one I specify. I could probably figure out something with a for loop, but I feel like there must be a way to specify just the column names I want to keep and retain only those, while preserving the cell counts. Several methods I have tried have resulted in Python errors.

This may be due to me missing something w/r/t syntax for subsetting in R, of course.

Thanks for any assistance.

rcannood commented 1 week ago

Hi @dennishamrick !

You could use:

ad$obs <- ad$obs[,c("columns", "you", "want", "to", "keep")]

Does that work for you?
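
For anyone who wants to try the column selection before touching a large file, here is a minimal sketch in base R. It uses a small stand-in data.frame in place of ad$obs; the column names come from the printout above, but the values and row names are made up for illustration. The drop = FALSE argument is an addition not in the suggestion above: without it, selecting a single column returns a plain vector instead of a data.frame, which would likely not be accepted as obs.

```r
# Stand-in for ad$obs: one row per cell, cell IDs as row names.
# Values and row names are hypothetical.
obs <- data.frame(
  organism_ontology_term_id = c("id_a", "id_a", "id_a"),
  donor_id                  = c("d1", "d1", "d2"),
  major_brain_region        = c("r1", "r2", "r2"),
  cell_type                 = c("t1", "t2", "t1"),
  row.names                 = c("cell1", "cell2", "cell3")
)

# Variant 1: name the columns to keep.
keep <- c("donor_id", "cell_type")

# Fail early with a readable message if a requested column is absent.
missing_cols <- setdiff(keep, colnames(obs))
if (length(missing_cols) > 0) {
  stop("Columns not found in obs: ", paste(missing_cols, collapse = ", "))
}

# drop = FALSE keeps the result a data.frame even when only one column is kept.
obs_kept <- obs[, keep, drop = FALSE]

# Variant 2: name the columns to remove and keep everything else.
drop_cols <- c("organism_ontology_term_id", "major_brain_region")
obs_dropped <- obs[, setdiff(colnames(obs), drop_cols), drop = FALSE]

# With a real AnnData object (assumed here to be called `ad`), the same
# indexing would be assigned back, as in the suggestion above:
# ad$obs <- ad$obs[, keep, drop = FALSE]
```

Only columns are selected, so the rows (cells) and their row names are left as they were.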

dennishamrick commented 3 days ago

Hi!

This works swimmingly. Thank you so much.