pinellolab / STREAM

STREAM: Single-cell Trajectories Reconstruction, Exploration And Mapping of single-cell data
http://stream.pinellolab.org
GNU Affero General Public License v3.0

down sampling in st.map_new_data #70

Open ccshao opened 4 years ago

ccshao commented 4 years ago

Hi stream team,

When running stream.map_new_data, I assume the cells from the two STREAM objects are not downsampled to equal numbers? This might lead to misunderstanding when comparing the areas in the mapping plots. What do you think?

Is there a way to downsample within STREAM? Or does downsampling have to be done at the very beginning of the STREAM pipeline?

Thanks!

huidongchen commented 4 years ago

Currently we don't perform any downsampling in mapping new data. And I'm afraid you have to do that yourself if needed.
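For anyone who wants to do this by hand, here is a minimal sketch of random downsampling to a common cell count. It is not part of STREAM; the helper name `downsample_ids`, the cell IDs, and the dataset sizes are all hypothetical and only illustrate the idea.

```python
import random


def downsample_ids(cell_ids, n_target, seed=0):
    """Return n_target cell IDs drawn at random without replacement.

    Hypothetical helper, not a STREAM function. If the dataset already
    has n_target cells or fewer, all IDs are returned unchanged.
    """
    cell_ids = list(cell_ids)
    if n_target >= len(cell_ids):
        return cell_ids
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return rng.sample(cell_ids, n_target)


# Made-up cell IDs standing in for the reference and the new dataset.
ref_ids = [f"ref_{i}" for i in range(600)]
new_ids = [f"new_{i}" for i in range(1000)]

# Bring the larger dataset down to the size of the smaller one.
n_target = min(len(ref_ids), len(new_ids))
new_ids_ds = downsample_ids(new_ids, n_target)
print(len(new_ids_ds))  # 600
```

If the data live in an AnnData object, the selected IDs could presumably be used to subset it (for example `adata_new[new_ids_ds, :].copy()`) before running it through STREAM. Note that uniform sampling like this keeps the subtype proportions roughly, not exactly, the same.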

It's not clear to me though how down-sampling will help or what you mean by 'This might lead to misunderstanding'. Can you give a bit more detail?

ccshao commented 4 years ago

Sorry for the late update, here is a hypothetical example.

There are three cell subtypes (A, B, C) in both cond1 and cond2. In cond1 there are 200 cells in each subtype, while in cond2 the cell numbers are 400, 200, and 400, respectively.

Then we map cond2 onto cond1 with STREAM. If we ignore the shapes of the areas for the three subtypes, the area of B is the same in cond1 and cond2, since there are 200 B cells in both, while the areas of A and C are larger in cond2.

However, there are more cells in total in cond2 (1000 vs. 600). We would like the plot to show that there are relatively fewer B cells in cond2: scaled to the same total population as cond1, the cond2 counts would be 240, 120, 240. That is why we think downsampling might be helpful.
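The rescaling in this example can be checked with a short sketch. The function name `rescale_counts` is hypothetical, and the counts are the made-up numbers from the scenario above.

```python
def rescale_counts(counts, target_total):
    """Scale per-subtype counts so they sum to target_total,
    keeping the subtype proportions unchanged."""
    total = sum(counts.values())
    return {name: n * target_total / total for name, n in counts.items()}


# Hypothetical per-subtype cell counts from the example.
cond1 = {"A": 200, "B": 200, "C": 200}
cond2 = {"A": 400, "B": 200, "C": 400}

# Express cond2 at the same total population as cond1 (600 cells).
scaled = rescale_counts(cond2, sum(cond1.values()))
print(scaled)  # {'A': 240.0, 'B': 120.0, 'C': 240.0}
```

At equal totals, B drops from 200 cells in cond1 to 120 in cond2, which is the difference we would like the mapping plot to reflect.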

Hope this simplified scenario makes sense to you.