simonwm / tacco

TACCO: Transfer of Annotations to Cells and their COmbinations
BSD 3-Clause "New" or "Revised" License

How to split objects #1

Closed kxxxjo closed 1 year ago

kxxxjo commented 2 years ago

Hi all,

First of all, thanks for developing useful tools for analysis.

Thanks to TACCO, I have obtained a compositional matrix by cell type, using the command below:

tc.tl.annotate(adata, reference, annotation_key='Annotation', result_key='my_compositional_annotation')

But, I also want to obtain compositionally annotated count data that is amenable to standard downstream single cell analysis workflows.

Could you please let me know how to split objects by cell types or what command to use?

Thanks!

Best, KJ

simonwm commented 2 years ago

Hi @kxxxjo,

thanks for using Tacco!

Did you try tc.tl.split_observations? This might be just what you are looking for. Basic usage in your case would look like this:

sdata = tc.tl.split_observations(adata, annotation_key='my_compositional_annotation', result_key='my_categorical_annotation')

The resulting sdata then holds the split expression profiles, each corresponding to a "pure" profile of a single cell type.

We are planning to add more examples to the documentation in the near future, and the split functionality will definitely be part of it.

Hope this helps!

kxxxjo commented 2 years ago

Thanks for the quick and friendly reply.

I generated a count table by cell type using tc.tl.split_observations, as you recommended.

I wonder which algorithm is used to compute this.

According to your paper, TACCO uses a matrix-scaling algorithm to obtain categorically annotated expression data.

Is that right?

Thanks a lot!

Best, KJ

simonwm commented 1 year ago

Basically, yes. The algorithm is known under many names, such as the Sinkhorn or RAS algorithm, and the mathematical problem it solves is the matrix scaling problem.
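
For readers unfamiliar with matrix scaling, here is a minimal NumPy sketch of the generic Sinkhorn/RAS iteration. It is only a toy illustration of the mathematical problem, not TACCO's actual implementation: the function name matrix_scaling, its parameters, and the example numbers are all hypothetical. Given a positive matrix K and target row and column sums with equal totals, the iteration alternately rescales rows and columns until both marginals match.

```python
import numpy as np

def matrix_scaling(K, row_sums, col_sums, n_iter=1000, tol=1e-9):
    """Toy Sinkhorn/RAS iteration (hypothetical helper, not part of TACCO).

    Finds A = diag(u) @ K @ diag(v) whose row and column sums match the
    given targets. Assumes K is strictly positive and that row_sums and
    col_sums add up to the same total.
    """
    K = np.asarray(K, dtype=float)
    row_sums = np.asarray(row_sums, dtype=float)
    col_sums = np.asarray(col_sums, dtype=float)
    u = np.ones(K.shape[0])
    v = np.ones(K.shape[1])
    for _ in range(n_iter):
        u = row_sums / (K @ v)    # rescale rows to match row_sums
        v = col_sums / (K.T @ u)  # rescale columns to match col_sums
        A = u[:, None] * K * v[None, :]
        # Column sums are exact after the v update, so check the rows.
        if np.abs(A.sum(axis=1) - row_sums).max() < tol:
            break
    return A

# Made-up example: a 2x2 positive matrix scaled to new marginals.
K = np.array([[1.0, 2.0], [3.0, 4.0]])
A = matrix_scaling(K, row_sums=[5.0, 5.0], col_sums=[4.0, 6.0])
print(A.sum(axis=1), A.sum(axis=0))
```

In the splitting context, one can think of the rows and columns as genes and cell types, with the two marginals given by the observed counts and the compositional annotation, but the exact setup TACCO uses is described in the paper and the source code rather than in this sketch.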