digitalcytometry / ecotyper

EcoTyper is a machine learning framework for large-scale identification of cell states and cellular ecosystems from gene expression data.

the stat_assignment table is not equal with state color in the heatmap and is not the most abundant state #16

Closed wanhui5867 closed 2 years ago

wanhui5867 commented 2 years ago

Hi,

Thanks for the great tools!

I ran my samples through the ecotype_discovery_bulk function. With the default settings it ran successfully, but when I checked the results, I found two problems.

  1. The state assigned to each sample in the stat_assignment table does not match the colour shown for that sample in the heatmap. Take MC104R as an example: it was assigned to S02, but in the heatmap all S02 samples are coloured red (S01). I think the problem is that the state numbering in the right heatmap does not match the left one. Could you modify the script?

[image]

[image]

  2. The assigned state is not the most abundant one in the state_abundances table. Using MC104R as an example again: the most abundant state in the state_abundances table is S04 (please see the screenshot below), but the sample was assigned to S02 in the stat_assignment table.

[image]

In addition, the same problem (the assigned cell state or ecotype not being the most abundant one) also appears in the De novo Discovery function (please check your DiscoverOutput result), although the ecotype assignment in the Recovery function seems correct. Could you look into it?

Thanks in advance!

Best, Hui

BALuca commented 2 years ago

Hi Hui,

I double-checked, and the heatmap annotation colors are displayed correctly, even when some cell states are missing. For me this works in both R v3.5.1 and v4.0.2. Are you sure that the heatmap and the Excel tables are from the same cell type? If everything seems correct to you, would it be possible to share the data you tried with us at ecotyper.stanford@gmail.com, and indicate the cell type where you observe this behavior?

Regarding the second point, you shouldn't have values >1 in the state abundance table. EcoTyper specifically normalizes the state abundances so that they fall in the range 0-1. I would check that Excel isn't auto-formatting the values in an unexpected way. If you are positive that EcoTyper outputs values >1, please send us the data for debugging and indicate the cell type where you observe this behavior.

Best, The EcoTyper team

wanhui5867 commented 2 years ago

Hi,

Thanks for checking and for your reply!

I am sure the Excel table and the heatmap are from the same cell type. Actually, you can see the problem from the heatmap alone: for example, the white border in the right heatmap lines up with the S02 border in the left heatmap, but the state colour on the right is that of S01. I will share my raw data and results with you via email.

For point 2, you are right. It was a problem with the Excel formatting: I eventually noticed that the values were in scientific e-notation. The assigned state is the most abundant one.

Thanks again, Hui

wanhui5867 commented 2 years ago

The issue was solved by the authors.
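
For readers who hit the same symptom as the second point, one way to rule out spreadsheet formatting is to parse the abundance table programmatically and compare it against the assignments. The following is only a minimal Python sketch under stated assumptions: the table layout (states as rows, samples as columns), the values, and the helper names `load_abundances` and `check` are all hypothetical and are not EcoTyper output or EcoTyper API. It also assumes that a sample is assigned to its single most abundant state, which is what the thread reports but may not capture any additional filtering EcoTyper applies.

```python
import csv
import io

# Hypothetical excerpt of a state abundance table (rows = states, columns = samples).
# The layout and the numbers are illustrative only, not real EcoTyper output.
ABUNDANCES_TSV = """State\tMC104R\tMC105R
S01\t0.05\t0.70
S02\t0.61\t0.10
S03\t0.34\t0.15
S04\t8.2e-05\t0.05
"""

# Hypothetical assignment table: sample -> assigned state.
ASSIGNMENTS = {"MC104R": "S02", "MC105R": "S01"}


def load_abundances(text):
    """Parse the tab-separated table into {sample: {state: abundance}}."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    table = {}
    for row in reader:
        state = row.pop("State")
        for sample, value in row.items():
            # float() reads scientific notation such as "8.2e-05" correctly,
            # so a tiny value is not mistaken for 8.2.
            table.setdefault(sample, {})[state] = float(value)
    return table


def check(table, assignments):
    """Return a list of human-readable problems; an empty list means consistent."""
    problems = []
    for sample, states in table.items():
        for state, value in states.items():
            if not 0.0 <= value <= 1.0:
                problems.append(f"{sample}/{state}: abundance {value} outside 0-1")
        top_state = max(states, key=states.get)
        if assignments.get(sample) != top_state:
            problems.append(
                f"{sample}: assigned {assignments.get(sample)}, most abundant {top_state}"
            )
    return problems


abundances = load_abundances(ABUNDANCES_TSV)
problems = check(abundances, ASSIGNMENTS)
print(problems or "assignments match the most abundant state")
```

In this made-up table the S04 value for MC104R is `8.2e-05`; read without its exponent it would look like the largest entry, which is the kind of misreading described above.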