nipy / nibabel

Python package to access a cacophony of neuro-imaging file formats
http://nipy.org/nibabel/
Other
654 stars 258 forks source link

XML/REC Philips format support #681

Open Roosted7 opened 6 years ago

Roosted7 commented 6 years ago

What is XML/REC?

Philips has made a successor for it's PAR/REC format that fixes many of it's shortcomings and should be easier to parse than it's old plain-text counterpart: XML/REC This XML/REC format has been used for quite some years in the Philips scanners already, but there are virtually no tools that read (or convert) this format.

Why?

I think it would be of great benefit if Nibabel would support reading from these files, and perhaps even saving back again to this format. Especially since XML/REC's are used in the PRIDE research tool of the Philips software that allows for automatic processing (analyzing, converting, shimming, custom pulses) on the scanner host itself.

How hard would it be?

The REC part of the XML/REC format is identical to it's PAR/REC counterpart. The header is in XML format (obviously) and contains roughly the same data as the plain-text PAR files.

My part in this

I would like to aid the implementation of this format and can provide:

Thanks for help!

Roosted7 commented 6 years ago

Here is the XML to PAR conversion script: Gist

This open data set already has quite a nice collection of files: link to repo

grlee77 commented 6 years ago

I had started to implement this awhile back, but never finished it. Let me see if I can find it and see how close it was to working and what may remain to be done. I had just used an example .PAR and .XML file from a Phantom scan done at our site and manually compared the field names, etc. so there may be some corner cases that were missed.

At one point, I had found the following Matlab script that converts XML to PAR and seems to be BSD 2-clause licensed. I have not tried it yet, though: https://github.com/xiangruili/dicm2nii/blob/master/xml2par.m

A XML to PAR header conversion Python script, provided by Philips itself that has an GPLv3 license

Thank you for providing a link to that utility. nibabel code is MIT licensed so we cannot directly include GPLv3 code here, but that could still be very useful in the meantime for users who want to convert to .PAR and then use the existing nibabel PAR/REC reader.

Roosted7 commented 6 years ago

Ah that is the original version of that Matlab script I referred to in my first message! Thanks for linking it here.

Did you manage to find your (unfinished) implementation? Even if you don't have time to finish it, I'm hoping I could do that then..

I've starting thinking about the implementation specifics and was wondering; how hard would it be to properly support higher dimension images? The */REC formats are capable of containing multiple 'cardiac phases' (or scantypes) in time in 3D, which results in 5 (or more) dimension. The practical use for supporting this is for example reading in B1 maps per channel, which have 5 dimensions. Currently Nibabel squashes 1 dimension into the other (which is not terrible if you know this is happening, especially since it complains about multiple TR's), but I would prefer to see this handled properly somehow.

What would be your recommendation on handeling this?

grlee77 commented 6 years ago

My initial implementation is in the following branch: https://github.com/grlee77/nibabel/tree/xmlrec

For a limited number of data types it should work in standard nibabel-style fashion via code such as:

import nibabel as nib
xml = nib.load(my_xml_file)
header = xml.header
data = xml.get_fdata()

I only tested it on a single XML/REC file from a 2D cardiac CINE experiment, but can try it out on some others in the near future.

This approach does not explicitly convert the XML file to a PAR file on disk, but does internally convert the XML fields to their PAR equivalents and then subclasses the existing PARRECHEader and PARRECImage classes. It may make sense in the future to just reimplement the classes to keep the native XML parameter names, etc, but that will involve more near-duplicate code between parrec.py and xmlrec.py.

Currently there are two limitations: 1.) The set of enum conversions is incomplete so it will likely currently fail to read a number of image types. (Actually you should still be able to fully parse the XML via parse_XML_header, but the XMLRECHeader class will fail to convert the names to their PARRECHeader equivalent enumerated values.) 2.)For some reason the existing load behavior requires the .XML/.REC to be both uppercase or both lowercase (.xml/.rec). It fails to find the matching REC file when the names are .xml/.REC.

Their are at least three things that need improving on this initial attempt if you are interested in contributing.

1.) expand the list of enums in enums_dict. I see there is a much more extensive list in the Philips python script you provided, but have not copied it due to licensing concerns. We should probably ask permission to just use the enum lists under our current MIT license. 2.) Fix the filename finding issue with mixed-case .xml/.REC. This will likely be fairly easy, but probably involves changes to nibabel outside of xmlrec.py itself. 3.) Adding tests.

Obviously various coding style improvements could be made as well. I don't regularly work with XML-formatted data, so there may be better/cleaner ways of doing the parsing.

If the approach in that branch seems generally acceptable we can open a PR here.

grlee77 commented 6 years ago

I've starting thinking about the implementation specifics and was wondering; how hard would it be to properly support higher dimension images? The */REC formats are capable of containing multiple 'cardiac phases' (or scantypes) in time in 3D, which results in 5 (or more) dimension. The practical use for supporting this is for example reading in B1 maps per channel, which have 5 dimensions.

The existing PAR/REC reader supports a strict_sort flag which does pay attention to several sorting keys including cardiac phase, ASL label type, image type (magnitude, phase, ...), etc. However, as you mention it does get eventually reshaped into a 4D volume. The parrec2nii command line script does have the option via the -i flag that will write out a .csv file with additional info explaining the ordering of the stacking in the 4th dimension. This could be used to reshape the data back to its original nd shape. It would probably be fairly easy to add an option to the PARRECImage class to expand data back to the original n-dimensional shape.

Roosted7 commented 6 years ago

Thanks for sharing your implementation!

I'll ask our Clinical Scientist from Philips if we're allowed to use the enums and perhaps if they have more documention about the XML/REC format they are willing to share. Supporting XML/REC in Nibabel will only benefit Philips indirectly (and perhaps directly)...

I'm not entirely sure if the internal conversion to PAR header method is something I would be keen on implementing further myself. Since I would like to be able to write XML/REC's as well as read them, I think the classes should fully be re-implemented to achieve properly this, right? Re-implementing it as a new format has the advantage that we don't need the enums_dict but can use everything as-is in the XML.

grlee77 commented 6 years ago

I agree that converting to PAR internally is not ideal, especially as Philips is moving away from that format and the XML may continue to evolve going forward. Also the XML-style enum strings such as 'T1' are more informative than an enumerated value when reading the header's image_defs. My current implementation does however have the benefit of borrowing all of the PARRECImage sorting, orientation code, etc. that has already been well tested.

I took this approach as a minimal effort, but hopefully robust way to make an xmlrec2nii script that functions basically identically to parrec2nii. I haven't made the xmlrec2nii script yet, but it should be a trivial modification from parrec2nii.

Since I would like to be able to write XML/REC's as well as read them, I think the classes should fully be re-implemented to achieve properly this, right?

Yes, that would be a nice enhancement. My XML/REC implementation and the existing PAR/REC implementation in nibabel are read-only. I am happy to help review if you come up with an implementation for this, but cannot personally spend much time on a more extensive implementation at the moment.

It would be good to hear from one of the core nibabel devs about the approach they would prefer to take going forward.

grlee77 commented 6 years ago

Just a quick note that I have updated my branch to add XML/REC support to parrec2nii.py and created anxmlrec2nii command line alias. The upper/lowercase file extension issue should also now be fixed.

effigies commented 6 years ago

Hi @grlee77, @Roosted7, thanks for this conversation. I've been staying out of it mostly because I have negligible experience with */REC images and was happy to leave the discussion to you.

As to whether to represent the XML header directly or convert it to PAR, I would probably have a slight preference for full support of XML, for the reasons mentioned above. The only real limit on this (and full PAR/REC support, for that matter) I think is contributor time, knowledge and interest, so if you're happy to provide a more complete implementation, I think we'll all be more than happy to see it.

That said, to the extent that there are commonalities, it may be useful to provide a common API to XMLRECHeader and PARRECHeader that these could both subclass from (to the extent that the useful commonalities are not covered by SpatialHeader).

Also, I haven't looked at any of the files you've already linked to, so apologies if I'm stating the obvious, but I would have a look at the implementations of GIFTIImage and CIFTI2Header, and their use of existing XML handlers in nibabel.xmlutils.

Please directly @ me if you want my input at any stage, but I anticipate leaving most of the review to @grlee77.

grlee77 commented 6 years ago

That said, to the extent that there are commonalities, it may be useful to provide a common API to XMLRECHeader and PARRECHeader

I think I can refactor things to make the current implementation in the branch work along these lines. We could then replace the XML->PAR conversion variant with a pure XML variant as it is completed, hopefully without having to change much outside of xmlrec.py itself.

I would have a look at the implementations of GIFTIImage and CIFTI2Header, and their use of existing XML handlers in nibabel.xmlutils.

Yes, I saw that after I had mostly finished. I have not looked into refactoring to use those utilities yet.

grlee77 commented 6 years ago

It seems the general consensus is to avoid renaming to PAR conventions or converting to the PAR enums. I agree with @effigies that keeping the same API as the PARREC classes is desirable and should reduce maintenance burden. Please see an initial PR along these lines in #683