RGLab / CytoML

A GatingML Interface for Cross Platform Cytometry Data Sharing
GNU Affero General Public License v3.0

Issue Parsing FlowJo (v10.7.2) XML File #138

Closed by wfulp 2 years ago

wfulp commented 2 years ago

I am trying to parse a FlowJo (v10.7.2) XML file and cannot get past open_flowjo_xml.

> open_flowjo_xml(file = xml_files$Path, options = 0, sample_names_from = "keyword")

Error in open_workspace(file, sample_name_location = match(sample_names_from,  : 
     document of the wrong type, root node != 'Workspace'

I tried both sample_names_from values and all 16 options, and got the same error each time.

Since the help refers to XML::xmlTreeParse, I also tried to read the file that way, and it does not appear to be read properly. Its structure differs from that of XML files I have tried from earlier versions of FlowJo.

FlowJo (v10.7.2) XML test file: https://www.dropbox.com/s/ytnw99d2ekzaka8/BCG%20correlates%20mock_workspace%20xml.xml?dl=0

When I look in the XML directly I see

<Keyword name="$FIL"  value="11Aug_21_20011_fixed_029.fcs" />

so I was expecting sample_names_from = "keyword" to be correct.

Relevant session info:

 CytoML        * 2.5.4   2021-10-01 [1] Github (RGLab/CytoML@93026c3)       
 flowCore      * 2.5.0   2021-10-01 [1] Github (RGLab/flowCore@d2c3145)     
 flowWorkspace * 4.5.3   2021-10-01 [1] Github (RGLab/flowWorkspace@a359921)

Thank you for your continued support.

mikejiang commented 2 years ago

It doesn't seem to be a properly exported workspace file.
For example, this is from the header of your file:

<?mso-application progid="Word.Document"?>

Maybe ask the lab to re-export the wsp file straight from FlowJo?
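For anyone hitting the same "root node != 'Workspace'" error, a quick pre-flight check along these lines may help before calling open_flowjo_xml. This is only a rough sketch in base R: check_flowjo_xml is a hypothetical helper, not part of CytoML, and it assumes that a Word re-export carries the mso-application processing instruction shown above and that a real workspace has a <Workspace> tag near the top. The two temporary files are made-up minimal stand-ins, not real FlowJo or Word output.

```r
# Hypothetical helper (NOT part of CytoML): look at the first few lines of a
# file and guess whether it is a FlowJo workspace or a Word XML re-export.
check_flowjo_xml <- function(path, n = 20) {
  # Read at most n lines and join them so patterns can be searched in one go.
  head_txt <- paste(readLines(path, n = n, warn = FALSE), collapse = "\n")

  # Word writes this processing instruction into the XML files it saves.
  if (grepl("<?mso-application", head_txt, fixed = TRUE)) {
    return("word_export")
  }
  # CytoML's error message says the root node must be 'Workspace'.
  if (grepl("<Workspace[\\s>]", head_txt, perl = TRUE)) {
    return("workspace")
  }
  "unknown"
}

# Minimal stand-in files for illustration only.
good <- tempfile(fileext = ".wsp")
writeLines(c('<?xml version="1.0" encoding="UTF-8"?>',
             '<Workspace>',
             '</Workspace>'), good)

bad <- tempfile(fileext = ".xml")
writeLines(c('<?xml version="1.0" encoding="UTF-8" standalone="yes"?>',
             '<?mso-application progid="Word.Document"?>',
             '<w:wordDocument>',
             '</w:wordDocument>'), bad)

check_flowjo_xml(good)  # "workspace"
check_flowjo_xml(bad)   # "word_export"
```

A result of "word_export" or "unknown" suggests the file should be re-exported from FlowJo rather than passed to the parser.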

wfulp commented 2 years ago

Thanks Mike, you are correct. The lab confirmed that they exported the wsp file in Word, so it ended up being an XML export of a file that was already XML. I'm waiting on the lab to send a new wsp file directly from FlowJo, but I'm sure this is the issue.

wfulp commented 2 years ago

I have confirmed that I can parse the workspace correctly with the updated XML file. Thanks for pointing me in the right direction.