whitews closed this issue 5 years ago
CytoML officially supports only two output XML formats, FlowJo and Cytobank, and strictly speaking neither is standard GatingML. The APIs you are using are private and were designed for those two formats only.
Before we devote resources to standard, generic GatingML support, can you please explain why you need it and what the context of your use case is?
Hi Mike,
Yes, I realize I am using private functions, which is not ideal but is the only way I can get valid GatingML from a FlowJo workspace. I'm also aware that neither a FlowJo workspace nor the Cytobank XML is a standard GatingML document.
I develop and maintain Python libraries for working with flow cytometry data, and our group plans on creating analysis pipelines starting with a base set of gates that are manually created. Our main library supports GatingML 2.0 since FlowJo's XML is not open and there is no XSD available (that I'm aware of) to validate those documents. Plus, as you are probably well aware, reverse engineering their format would require a significant amount of effort. The RGLab suite of libraries seems to be the only tool available that reads their workspace files well.
I am considering support for the Cytobank XML format since it is very close to GatingML, but it seems there is no XSD for that format either. Also, the Cytobank XML output from openCyto (or is it CytoML?) produces invalid XML. I am in the process of making a converter for these files to make them valid XML, and have also created my own internal XSD for that format. It would be nice if this XSD could be hosted somewhere; I'd be happy to provide it. I would also be glad to have a go at creating a pull request for the appropriate RGLab library to make the Cytobank export valid XML. The changes necessary are rather straightforward.
Since we won't support standard GatingML 2.0 at the moment, your plan for working with the existing CytoML output sounds reasonable. Feel free to submit the PR if it involves only minor, simple additions. But we won't be able to accommodate the changes if they get too intrusive.
Okay, sounds like a plan. Do you have guidelines for creating PRs (inclusion of tests, code style, etc.)?
We basically follow the rOpenSci guidelines. A PR should at least pass all the existing test cases and add new tests as needed (especially for new features).
Closing this since GatingML is not officially supported.
I'm using CytoML to convert FlowJo workspaces to GatingML 2.0. However, somewhere in the process, the gate names that are present in the FlowJo workspace get converted to what appear to be automatically incremented gate IDs. This makes it difficult to track the original meaning of the gates in the hierarchy.
For example, here is a gate in the original FJ workspace:
And the resulting GatingML gate:
You can see that the gate name "FSC-A, SSC-A subset" gets converted to a gate ID of "gate_1_1".
Here is the code I am using for the conversion:
Is there any way to retain the original names in the GatingML file? I know the gate names are repeated in the FlowJo workspace because of multiple samples and sample groups, but it seems the automatically incremented gate IDs could be concatenated with the original names to ensure their uniqueness and retain their context in the gating hierarchy. Even better would be to perform the concatenation only if non-unique names are found in the resulting GatingML output.
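
To make the second suggestion concrete, here is a minimal sketch in Python. It is not CytoML code: `assign_gate_ids` is a hypothetical helper, and the `_1`, `_2` suffix format is only an assumption for illustration. Names that occur once are kept as they are, and a counter is appended only to names that repeat.

```python
from collections import Counter


def assign_gate_ids(gate_names):
    """Return one unique ID per gate, keeping the original name when it is already unique.

    Hypothetical helper for illustration only; not part of CytoML.
    """
    counts = Counter(gate_names)
    # Names that occur exactly once are reserved unchanged.
    used = {name for name, n in counts.items() if n == 1}
    next_index = Counter()
    ids = []
    for name in gate_names:
        if counts[name] == 1:
            ids.append(name)
            continue
        # Repeated name: append an incrementing suffix, skipping any
        # candidate that would clash with an ID that is already taken.
        while True:
            next_index[name] += 1
            candidate = f"{name}_{next_index[name]}"
            if candidate not in used:
                break
        used.add(candidate)
        ids.append(candidate)
    return ids


names = ["FSC-A, SSC-A subset", "Singlets", "FSC-A, SSC-A subset"]
print(assign_gate_ids(names))
```

In this sketch "Singlets" is left untouched, while the two "FSC-A, SSC-A subset" gates receive distinct suffixes. It does not sanitize names, so if the ID attribute in the output format restricts which characters are allowed (spaces or commas, for example), the names would need to be cleaned up first or stored in a separate name attribute.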