Closed slowkow closed 5 years ago
I made a change at some point to allow it. If a cell is not present in the other coordinate file, it will be hidden.
Yes, my plan was exactly this: supply multiple coord files. Then, making this clear in the UI may not be trivial, especially if we have multiple dimensionality reductions.
I guess my problem right now is that I don't have a concrete example dataset where this is the case and it makes the dataset easier to understand. It's always easier to work with a concrete example.
For tabula muris, I wanted to prep one coord file per tissue.
On Wed, Sep 19, 2018 at 11:33 AM, Kamil Slowikowski <notifications@github.com> wrote:
When working with large single-cell datasets, it is often useful to look at two or more levels. Several papers have done analysis at two or more levels:
- Level 1: PCA and tSNE on the full set of all cells.
- Level 2: PCA and tSNE on each major subset of cells (e.g. only T cells, or only B cells, or only fibroblasts).
When browsing the data at Level 2, it's necessary to hide all of the cells except the cells in the chosen subset. Then the user can browse just the different clusters of T cells, for example, without worrying about all the other cell types in an experiment.
It would be nice to support this type of subset-level analysis in cellBrowser.
Thinking about how this might be implemented...
I think you might already be most of the way there, since you support multiple files with cell coordinates:
https://github.com/maximilianh/cellBrowser/blob/03ca5f32610c571d6ada8b0db7062d5e77287b5f/sampleData/sample1/cellbrowser.conf#L41-L45
What happens if one of the coordinate files only lists coordinates for a subset of cells instead of all cells?
(I haven't tried, so I apologize in advance if this is already supported and I'm not aware.) It would be cool if cellBrowser automatically figures out that it should hide the cells that are not listed in a given coordinate file.
I wonder if you have thoughts about how to organize and navigate these types of subset-level results?
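To make the "hide cells that are not listed" idea concrete, here is a minimal Python sketch. It is not cellBrowser's actual code: the function name `split_visible`, the three-column TSV layout with a header row, and the cell IDs are all assumptions made for illustration. It partitions the cells from the meta file into those that have coordinates in a given subset file and those that should be hidden.

```python
import csv
import io


def split_visible(meta_ids, coord_tsv):
    """Partition meta cell IDs by whether a coordinate file lists them.

    meta_ids:  cell IDs in meta-file order.
    coord_tsv: text of a tab-separated file with a header row and the
               columns cell ID, x, y (an assumed layout).
    Returns (shown, hidden): a dict of ID -> (x, y) and a list of IDs.
    """
    reader = csv.reader(io.StringIO(coord_tsv), delimiter="\t")
    next(reader)  # skip the header row
    coords = {row[0]: (float(row[1]), float(row[2])) for row in reader if row}

    # Cells with coordinates are drawn; all others are hidden for this layout.
    shown = {cell: coords[cell] for cell in meta_ids if cell in coords}
    hidden = [cell for cell in meta_ids if cell not in coords]
    return shown, hidden
```

With a B-cell-only coordinate file, `hidden` would then contain every T cell, monocyte, and fibroblast from the meta file.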
Here is an example from the AMP Phase 1 datasets. The overall tsne on PCs plot appears like this when you open the dataset:
Then, when you select the tsne on PCs layout for just B cells:
As you can see, all of the T cells, Monocytes and Fibroblasts have been assigned the coordinates 0,0 (~4,000 cells underneath the cell I selected):
The output when you run cbBuild:
WARNING:root:sample name S037_L3Q2_O12 is in meta file but not in coordinate file t-SNE on PCs: B Cells, setting to (0,0)
I think it's an easy fix, but it would be great if they were dropped rather than assigned (0,0).
The data arrays that I use (important for speed) cannot hold special values. I think it's easiest if I use a special and rare value, like (12345,12345), to indicate that a coordinate is missing.
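A rough sketch of the sentinel approach, under stated assumptions: coordinates are taken to be already scaled to unsigned 16-bit integers, the helper names `pack_coords` and `is_missing` are hypothetical, and the real cbBuild code may store things differently. An integer array has no NaN, so one reserved value has to stand in for "no coordinate".

```python
from array import array

# Sentinel value from this thread. Assumes real (scaled) coordinates
# never take the value 12345 on both axes at once.
MISSING = 12345


def pack_coords(meta_ids, coords):
    """Flatten coordinates to [x0, y0, x1, y1, ...] in meta-file order.

    coords maps cell ID -> (x, y) as integers in 0..65535.
    Cells absent from coords get (MISSING, MISSING).
    """
    flat = array("H")  # unsigned 16-bit integers, no room for NaN
    for cell_id in meta_ids:
        x, y = coords.get(cell_id, (MISSING, MISSING))
        flat.extend((x, y))
    return flat


def is_missing(flat, i):
    """True if cell number i carries the sentinel instead of a real position."""
    return flat[2 * i] == MISSING and flat[2 * i + 1] == MISSING
```

The array keeps one slot per cell in the meta file, so cell indices stay aligned across layouts, and the renderer can skip any cell for which `is_missing` is true.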
Also, your example shows that I should not calculate the label coordinates based on missing values. That's a clear bug. Thanks!
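One way to keep missing cells out of the label positions is sketched below. This is an illustration only: the function name `label_positions` is hypothetical, the (12345,12345) marker is the one proposed above, and using the per-cluster median as the label position is an assumption, not necessarily what cellBrowser does.

```python
from statistics import median

# Marker for "no coordinate in this layout", as proposed in this thread.
MISSING = (12345, 12345)


def label_positions(coords, clusters):
    """Compute one label position per cluster, ignoring missing cells.

    coords:   list of (x, y) per cell, MISSING where the cell has none.
    clusters: list of cluster names, same order and length as coords.
    Returns a dict of cluster name -> (median x, median y).
    """
    points = {}
    for xy, cluster in zip(coords, clusters):
        if tuple(xy) == MISSING:
            continue  # a placeholder must not pull the label toward it
        points.setdefault(cluster, []).append(xy)

    return {
        cluster: (median(x for x, _ in pts), median(y for _, y in pts))
        for cluster, pts in points.items()
    }
```

A cluster whose cells are all missing in the current layout gets no entry, so no label is drawn for it.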
This was implemented a few weeks ago. 12345 is the special value for missing cells and the 100% ignores these. Let me know if you need something else, otherwise we can close this ticket.
Hey, while you can have subsets easily now, you'll still have the old cluster assignments. Is this what you want? If you want to auto-switch it to another field as the new cluster, I'd still have to add that, though that should be easy.
Also, there is no documentation about this right now...
Added docs to the tab-sep docs page.
I think it's a great idea to add an "auto-switch", and I would take advantage of that capability for the datasets I'm working with, but it's not hugely pressing. Thank you!
Sorry it took so long, you can now add colorOnMeta= to any coordinate system to color by some meta field automatically when this coordinate system is loaded.
ms.cells.ucsc.edu uses this to activate pseudotime coloring automatically when you show the pseudotime layout.
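For illustration, a `coords` entry using this option might look like the fragment below (cellbrowser.conf uses Python syntax). The file names, labels, and the meta field name `bcell_cluster` are made up, and the placement of `colorOnMeta` as a key inside each coordinate entry is an assumption based on the description above, so check the docs for the exact form.

```python
# Hypothetical cellbrowser.conf fragment; file and field names are invented.
coords = [
    # Level 1: layout computed on all cells, default coloring.
    {"file": "tsne.coords.tsv", "shortLabel": "t-SNE on PCs"},
    # Level 2: layout computed on B cells only. Cells not listed in this
    # file are hidden, and the coloring switches to the given meta field.
    {
        "file": "tsne_bcells.coords.tsv",
        "shortLabel": "t-SNE on PCs: B Cells",
        "colorOnMeta": "bcell_cluster",
    },
]
```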
pip release 0.4.51