RGLab / CytoML

A GatingML Interface for Cross Platform Cytometry Data Sharing
GNU Affero General Public License v3.0

Gating issue from Cytobank ACS file #73

Closed JimboMahoney closed 4 years ago

JimboMahoney commented 4 years ago

Hi there!

I just imported the gates from a Cytobank ACS file and I think I'm seeing something similar to #28.

Please feel free to close this issue if you believe it's the same thing (or perhaps I misunderstand that issue).

In short, all gates seem to import OK, except those where the gate is positioned at the top left of the plot and overlaps the Y axis.

In the below screenshot, you can see the hierarchy for the "Naive CD4" population.

image

The stats in CytoML are good until it hits the penultimate gate (CD4+). Both that gate and any subsequent gates contain only a fraction of the cells they should.

The previous gate (CD3+) is fine, despite being "off-axis".

In addition, in the following screenshots:

image

The CD3- population (final gate), despite overlapping both X and Y axes, is fine.

image

Here, the Naive CD8 (final gate) population is also fine, despite being off the chart.

This leads me to think there is something particular happening with gates that are positioned in the upper left corner (see CD4+ in first screenshot).

I hope this is clear (ish?)

I've uploaded the ACS file below.

experiment_265121_Oct-30-2019_11-28-AM.zip

mikejiang commented 4 years ago

This is due to incorrect compensation during cytobank_to_gatingset. In the gatingML file, compensation-ref points to FCS:

image

As a result, the parser ignores the transforms:spectrumMatrix node in the XML and instead tries to extract the spillover matrix stored in the FCS files. Typically we apply the same matrix across all samples, so the first FCS file is currently used as the source for that matrix. Here it gets the wrong matrix, because the compensation control files are also included in this experiment and the first file happens to be a control file.
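To illustrate why a wrong spillover matrix can empty a gate in the upper left corner of a plot, here is a minimal base-R sketch. It is a toy model under stated assumptions, not CytoML's actual code: the channel names `A` and `B`, the 10% spillover value, the gate boundaries, and the helpers `compensate_events` and `in_gate` are all made up for the example. The identity matrix stands in for "whatever matrix the first (control) file happened to carry".

```r
# Toy two-channel example; every number here is hypothetical.
set.seed(1)
n <- 1000

# True (unmixed) signal: a population that is dim on A (x axis)
# and bright on B (y axis), i.e. it sits in the upper left of an A-vs-B plot.
true_signal <- cbind(A = rnorm(n, mean = 100,  sd = 20),
                     B = rnorm(n, mean = 5000, sd = 500))

# Spillover matrix of the sample: rows = source, columns = detector.
# 10% of B's signal leaks into detector A.
spill_sample <- matrix(c(1,    0,
                         0.10, 1),
                       nrow = 2, byrow = TRUE,
                       dimnames = list(c("A", "B"), c("A", "B")))

# A different matrix taken from the "wrong" file (assumed: no spillover at all).
spill_wrong <- diag(2)
dimnames(spill_wrong) <- dimnames(spill_sample)

# What the cytometer records: true signal mixed by the sample's spillover.
observed <- true_signal %*% spill_sample

# Linear compensation: multiply by the inverse of the spillover matrix.
compensate_events <- function(events, spill) {
  out <- events %*% solve(spill)
  colnames(out) <- colnames(events)
  out
}

# Hypothetical rectangle gate in the upper left corner: low A, high B.
in_gate <- function(events) events[, "A"] < 300 & events[, "B"] > 3000

good <- compensate_events(observed, spill_sample)  # matrix matches the sample
bad  <- compensate_events(observed, spill_wrong)   # matrix from the wrong file

cat("events in gate, correct matrix:", sum(in_gate(good)), "\n")
cat("events in gate, wrong matrix:  ", sum(in_gate(bad)), "\n")
```

With the matching matrix the population stays inside the gate. With the wrong one, the uncorrected spillover from B pushes events to the right along A and out of the gate, so the count should drop sharply. Gates drawn far from the affected axis are much less sensitive to the same shift.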

There are 4 options:

We don't know how Cytobank works behind the scenes, and I am leaning toward the 3rd option because it doesn't break the existing logic of the gatingML-based parser. Let me know if it is OK with you to skip control files for the final import.

mikejiang commented 4 years ago

Here is the gate after the patch, let me know if it looks good to you.

> ce <- open_cytobank_experiment("~/Downloads/experiment_265121_Oct-30-2019_11-28-AM.acs")
Unpacking ACS file...
> ce
cytobank Experiment:  Aria Training Sort 
gatingML File:  /tmp/RtmpNm6ewZ/file2b6230655679/experiments/265121/cytobank_gate_ml2_v2.xml 
    panel samples
1 Panel 1       5
> gs <- cytobank_to_gatingset(ce, panel_id = 1)
> library(ggcyto)
> autoplot(gs[1], "CD4+")

image

JimboMahoney commented 4 years ago

Hi Mike,

Ah, I see. So the reason the other gates I mentioned aren't affected is that they either don't depend on compensation (e.g. SSA vs CD3) or sit so far out on the log scale (e.g. CD4 vs CD8, CD45RA vs. CD8) that they are not badly affected.

I can't suggest which option to go for (although the first one, using each FCS file, sounds best to me as a novice), but your gate screenshot looks good.

Thanks again for super-fast responses!

jacobpwagner commented 4 years ago

Minor fix in 1d9751979abc3f25ee0d8b01f27bc3ee32e5635d, so make sure to grab that too @JimboMahoney