support change the selected ids to new category and add a new cluster to anndata.obs

lilab-bcb / cirrocumulus

Bring your single-cell data to life

https://cirrocumulus.readthedocs.io/

BSD 3-Clause "New" or "Revised" License

support change the selected ids to new category and add a new cluster to anndata.obs #99

Open ditannan opened 2 years ago

ditannan commented 2 years ago

When I select IDs, I can use Download Selected IDs. Would it be possible to change the category name of the selected IDs and save the result under a new cluster name? The anndata object should be updated and available for download.

ditannan commented 2 years ago

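Something like this:

def change_cluster(adata, old_cluster, new_category_ids, new_category_name, new_cluster_name):
    # Start from the existing cluster labels, as a categorical column.
    labels = adata.obs[old_cluster].astype('category')
    # Register the new category unless it already exists.
    if new_category_name not in labels.cat.categories:
        labels = labels.cat.add_categories(new_category_name)
    adata.obs[new_cluster_name] = labels
    # Reassign only the selected cells to the new category.
    adata.obs.loc[new_category_ids, new_cluster_name] = new_category_name
    return adata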

joshua-gould commented 2 years ago

Please see the attached gif for annotating clusters. When using cirro launch, the renamed clusters are available in a JSON file stored next to the input h5ad. Does this work?

[GIF: annotate]

ditannan commented 2 years ago

Thank you for your quick reply. Annotating a cluster seems to rename the category for the whole cluster. What I want is to move some samples (the selected IDs) from one or two categories into a new category.

joshua-gould commented 2 years ago

You can download the selected ids for each group and then add a new field in anndata.obs. Does this work?

import pandas as pd
ids1 = pd.read_csv('selection.txt', header=None, index_col=0).index
ids2 = pd.read_csv('selection (1).txt', header=None, index_col=0).index
obs_field = 'my_group'
adata.obs.loc[ids1, obs_field] = 'A'
adata.obs.loc[ids2, obs_field] = 'B'
adata.obs[obs_field] = adata.obs[obs_field].astype('category')
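
If the new field should start from an existing cluster column rather than from an empty one, the same idea might look like the sketch below. It is only an illustration: a small DataFrame stands in for adata.obs, and the column name 'leiden', the field name 'leiden_edited', the label 'doublets', and the cell ids are all made up for the example.

```python
import pandas as pd

# Stand-in for adata.obs: a hypothetical 'leiden' cluster column indexed by cell id.
obs = pd.DataFrame(
    {"leiden": pd.Categorical(["0", "0", "1", "1", "2"])},
    index=["cell1", "cell2", "cell3", "cell4", "cell5"],
)

# In practice these ids would come from the downloaded selection file, e.g.
# pd.read_csv('selection.txt', header=None, index_col=0).index
selected_ids = pd.Index(["cell2", "cell3"])

new_field = "leiden_edited"  # hypothetical name for the new obs column
new_label = "doublets"       # hypothetical name for the new category

# Copy the labels as plain strings so a previously unseen label can be assigned,
# overwrite the selected cells, then convert back to a categorical column.
obs[new_field] = obs["leiden"].astype(str)
obs.loc[selected_ids, new_field] = new_label
obs[new_field] = obs[new_field].astype("category")
```

With a real AnnData object, the same three assignments would be applied to adata.obs, and adata.write_h5ad('edited.h5ad') would save the updated object (the file name is a placeholder). The original cluster column is left untouched.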

ditannan commented 2 years ago

Thank you, it works. Could this feature be added to cirrocumulus?