CyTOF data

hammer commented 5 years ago

Possible data:

Bystander CD8+ T cells are abundant and phenotypically distinct in human tumour infiltrates: https://flowrepository.org/id/FR-FCM-ZYWM
Epigenomic-Guided Mass Cytometry Profiling Reveals Disease-Specific Features of Exhausted CD8 T Cells: https://premium.cytobank.org/cytobank/experiments/154556/illustrations/210224 and https://premium.cytobank.org/cytobank/experiments/154563/illustrations/210223; emailed first author to add our emails to the experiments
Identity and Diversity of Human Peripheral Th and T Regulatory Cells Defined by Single-Cell Mass Cytometry looks great and the analysis was done on Cytobank but I do not see a Cytobank ID in the paper so have emailed the study author.
Innate Immune Landscape in Early Lung Adenocarcinoma by Paired Single-Cell Analyses: title is misleading; https://flowrepository.org/id/FR-FCM-ZY9M has a bunch of T cell data from Adeeb Rahman at Sinai
Clonal analysis of Salmonella-specific effector T cells reveals serovar-specific and cross-reactive T cell responses: https://flowrepository.org/id/FR-FCM-ZYW2 data from Evan Newell
Human Bone Marrow Assessment by Single Cell RNA Sequencing, Mass Cytometry and Flow Cytometry: https://flowrepository.org/id/FR-FCM-ZYQB has bone marrow, not blood, from healthy subjects, but looks pretty interesting. Also https://flowrepository.org/id/FR-FCM-ZYQ9 seems to have more data?

hammer commented 5 years ago

From Bertram, looks like the winner: An interactive reference framework for modeling a dynamic immune system (2015). Data on Cytobank (blah) at https://community.cytobank.org/cytobank/projects/733

eric-czech commented 5 years ago

Hey @hammer ,

FYI I've been exploring the data and software from that paper and found a few potentially interesting things. For one, it looks like the guy that originally built the software is now maintaining it as a set of separate repos that he seems to have put some serious time into. Conveniently, there's also this repo for reproducing data from the paper: https://github.com/ParkerICI/flow-analysis-tutorial.

After getting some of those examples to work and getting pretty close reproductions of some figures, I tried doing something similar with that other paper you mentioned above by Bertram (Epigenomic-Guided Mass Cytometry Profiling Reveals Disease-Specific Features of Exhausted CD8 T Cells). I went that route because they didn't share the human data in the first paper -- only the FCS files for the different mouse strains are available.

After running the clustering (via grappolo), scaffold mapping (via vite), and generating my own landmarks for the CD8+ human data Bertram shared, I was able to get things like this out of the shiny app in panorama where I annotated and combined some screenshots:

[Image: tcell_exhaustion_landscape]

Is a CD8+ specific analysis like this too specific for the book in your opinion given that exhausted phenotypes are still pretty controversial? I'd be happy to do something like this again with a less focused study, but all the same I thought it was interesting to see how much those maps shift across the different conditions.

hammer commented 5 years ago

@eric-czech pretty pictures! Let's go over in person? Dysfunctional states are definitely interesting to explore but it would be nice to have a broad, healthy subset as well to compare with other chapters.

eric-czech commented 5 years ago

Hey @hammer , I tried a few things with a couple more of those datasets you originally listed but wasn't able to get a result that I thought would make sense so I started branching out to look for some others. I bumped into this (a "most popular dataset" on FlowRepository), which looks pretty great if I'm not missing something: https://flowrepository.org/id/FR-FCM-ZYAJ

There is no associated publication that I can find but it's PBMC samples run for 21 healthy adults ages 23 to 64 and each FCS file has ~250k events that aren't already filtered in any way (just bead-normalized). The panel used had 30 markers and there was a flowjo workspace attached as well so it's easy to work with. Here's an example clustering for CD4+ T cells from a 50 yr old hispanic male where all the markers shown here are relevant to CD4 cell subtypes and about half of them are not available in OMIP-30 (to get a sense of the extra resolution here):

CD27, CD28, CD38, CD94, CD161, ICOS, HLADR, CD57, CD85j, and PD-1 are the T cell specific markers in the panel that seem most likely to add any kind of extra separation or enable a deeper classification.

Do you have any suggestions on how to best use them? And do you think they provide enough extra resolution or are there some crucial TFs or other surface receptors you think it would be good to look for on top of what's necessary to do the usual lineage/differentiation gating (that aren't in that list above)?

hammer commented 5 years ago

Hmm wonder if that data is from https://www.ncbi.nlm.nih.gov/pubmed/29174717/?

Sounds like a nice dataset! Wish they had done a more detailed T cell panel but ah well.

I won't have time this week or next to dig in but will check it out early February.

eric-czech commented 5 years ago

Mm could be related but all the data on FlowRepository for that paper has ~20 markers instead of ~30 (w/o many of the more interesting ones).

A final nugget then before leaving this alone for a bit -- here's some data I clustered from the typhoid fever paper (by Evan Newell) with some rough eyeballing on the cluster labels and colorings of the graph using what I thought were the markers that showed useful variation amongst naive cells. The star plot is impossible to read with this many markers but at least the legend shows those that are more unique to the (~43 marker) panel:

[Image: typhoid-fever-pbmc-flowsom]

The downside of this data, which includes PBMC for 6 people before being infected, is that it's already filtered to CD4+ and there's no gating information. I can make do though if you think some of those markers are compelling. I also can't seem to find info in the paper or supplement on how old and/or healthy these subjects were at day 0. "Healthy" volunteers willing to let a bunch of scientists give them typhoid fever sounds like an oxymoron to me 😬

eric-czech commented 5 years ago

Alright one more update before moving on for a bit. After collecting what I mentioned in https://github.com/hammerlab/t-cell-data/issues/5#issuecomment-458201371 and looking for the datasets with the biggest relevant panels, I found two papers that seem to check all the boxes. Both are from Evan Newell in 2016 as well and include healthy blood samples with data that is not pre-gated to CD4/CD8 cells:

Optimization of mass cytometry sample cryopreservation after staining

Links: PubMed, FlowRepository

Panel: CD45, CD14, TCRγδ, CD3, HLA‐DR, TNF‐α, IFN‐γ, MIP‐1β, IL‐8, CD8, CD45RA, CD19, CD4, CD103, IL‐2, CD25, CD107a, CCR7, CXCR4, Vδ2, CD38, CD56, Integrin β7, CD279 (PD‐1), CCR9, CTLA‐4, CD40L, Vδ1, Vα 7.2, CXCR5, CD161, CCR2, IL‐4, IL‐10, CCR6, GM‐CSF, CCR4, IL‐22, CX3CR1, IL‐17A, CD16

A High-Dimensional Atlas of Human T Cell Diversity Reveals Tissue-Specific Trafficking and Cytokine Signatures

Links: PubMed, FlowRepository

Trafficking Panel (no stimulation): CD45, CD14, CD57, TCRγδ, CD3, HLA-DR, CD29, CD38, CD69, CD62L, CD8α, CD45RO, CLA, CD4, CD103, CCR4, CD25, CD49a, CCR10, CXCR6, CD19, CD27, CD56, ICOS, PD-1, CD161, CCR9, CXCR3, CD95, CD31, CXCR5, CD49d, CCR2, Integrinβ7, CCR5, CCR6, CD45RA, CCR7, CX3CR1, CXCR4, CD127

Function Panel (PMA stimulation): CD45, CD14, CD57, TCRγδ, CD3, IFN-γ, TNF-α, IL-8, Granzyme B, IL-17F, CD8α, CD45RA, CLA, CD4, CTLA-4, IL-2, CD25, CD103, CCR10, CXCR6, IL-5, CD19, CD56, Integrinβ7, PD-1, IL-9, CCR9, CXCR3, CD127, Mip-1β, CXCR5, CD161, CCR2, IL-4, IL-10, CCR6, GM-CSF, CCR4, IL-22, CCR5, IL-17A
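For a rough check on how these two panels line up with the T cell specific markers listed earlier in the thread (CD27, CD28, CD38, CD94, CD161, ICOS, HLADR, CD57, CD85j, PD-1), here is a minimal Python sketch that compares them as sets. The helper names `normalize` and `parse_panel` are hypothetical, the marker strings are copied from the listings above, and "D14" in the trafficking panel is assumed to be a typo for CD14. Names are uppercased with spaces and hyphens removed purely for matching, so "HLA-DR" and "HLADR" count as the same marker.

```python
import re


def normalize(marker):
    """Uppercase and drop whitespace and hyphens (ASCII '-' and U+2010) for matching only."""
    return re.sub(r"[\s\-\u2010]", "", marker).upper()


def parse_panel(text):
    """Split a comma-separated marker listing into a set of normalized names."""
    return {normalize(m) for m in text.split(",") if m.strip()}


# Copied from the Trafficking Panel listing; "D14" is assumed to mean CD14.
TRAFFICKING = parse_panel(
    "CD45, CD14, CD57, TCRγδ, CD3, HLA-DR, CD29, CD38, CD69, CD62L, CD8α, "
    "CD45RO, CLA, CD4, CD103, CCR4, CD25, CD49a, CCR10, CXCR6, CD19, CD27, "
    "CD56, ICOS, PD-1, CD161, CCR9, CXCR3, CD95, CD31, CXCR5, CD49d, CCR2, "
    "Integrinβ7, CCR5, CCR6, CD45RA, CCR7, CX3CR1, CXCR4, CD127"
)

# Copied from the Function Panel listing.
FUNCTION = parse_panel(
    "CD45, CD14, CD57, TCRγδ, CD3, IFN-γ, TNF-α, IL-8, Granzyme B, IL-17F, "
    "CD8α, CD45RA, CLA, CD4, CTLA-4, IL-2, CD25, CD103, CCR10, CXCR6, IL-5, "
    "CD19, CD56, Integrinβ7, PD-1, IL-9, CCR9, CXCR3, CD127, Mip-1β, CXCR5, "
    "CD161, CCR2, IL-4, IL-10, CCR6, GM-CSF, CCR4, IL-22, CCR5, IL-17A"
)

# The T cell specific markers called out for the FR-FCM-ZYAJ panel earlier in the thread.
WISHLIST = parse_panel(
    "CD27, CD28, CD38, CD94, CD161, ICOS, HLADR, CD57, CD85j, PD-1"
)

shared_panels = TRAFFICKING & FUNCTION          # markers present in both panels
covered = WISHLIST & TRAFFICKING                # wishlist markers in the trafficking panel
missing = WISHLIST - TRAFFICKING                # wishlist markers the trafficking panel lacks

if __name__ == "__main__":
    print("Shared between panels:", sorted(shared_panels))
    print("Wishlist covered by trafficking panel:", sorted(covered))
    print("Wishlist missing from trafficking panel:", sorted(missing))
```

This only compares marker names as written here; it says nothing about clone, metal tag, or staining quality, so it is a first-pass filter rather than a judgment on which dataset is better.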

Let me know if you don't think those are on point @hammer.

hammerlab / t-cell-data

CyTOF data #29
