commontk / CTK

A set of common support code for medical imaging, surgical navigation, and related purposes.
https://commontk.org
Apache License 2.0
855 stars 490 forks source link

Standardise fileExtension across Slicer, GIMIAS, MitkWorkbench, MedInria #487

Closed MattClarkson closed 10 years ago

MattClarkson commented 10 years ago

Looking through command line modules in Slicer, lists of file extensions are written as ".nii,.nrrd,.nii.gz"

i.e. a comma separated list of file extensions that include the first dot.

Slicer probably has the most modules, and as I understand it, the xml specification originated there. So, I propose to stick with the Slicer format for all input and output image, file, geometry types.

This does therefore require changes in CTK and have knock-on effects to other apps (except Slicer).

pieper commented 10 years ago

Hi Matt - is there a downside for accepting both syntax options (i.e. with dots and without dots) to minimize effects on the other apps? Perhaps we could generate a deprecation warning if we want to ultimately accept only one syntax.

MattClarkson commented 10 years ago

At this stage Im still trying to narrow down a precise spec. Is there one that you know of?

The reason it bothered me in the first place was we kept flipping between two implementations. To work in slicer you needed to omit the asterisk. But the ctk xslt transform converted xml to .ui with an incorrect filename filter and so needed an asterisk.

Matt Clarkson. (Apologies for brevity. Im using my phone.)

Steve Pieper notifications@github.com wrote:

Hi Matt - is there a downside for accepting both syntax options (i.e. with dots and without dots) to minimize effects on the other apps? Perhaps we could generate a deprecation warning if we want to ultimately accept only one syntax.

— Reply to this email directly or view it on GitHub.

pieper commented 10 years ago

I tend to agree with your original concept that whatever the first convention was should be kept unless there's a good reason to change it. So in this case, the Slicer version, with no asterisk should be the spec. Can the xslt be changed add the asterisk to the extensions so that the .ui file is correctly formed?

MattClarkson commented 10 years ago

Hi Steve,

I did this on branch 487-standardise-fileExtension the other day. Its not merged to master yet. https://github.com/commontk/CTK/commit/24a27cd8a55a1e3daaeb6cc8f625697344c715b8

So the remaining questions are whether to make the xslt transform cleverer. i.e. generate asterisk if none exists. same question for the leading dot.

my knowledge of XSLT is pretty non-existent though.

My current understanding of the spec for fileExtensions is “no asterisks, and include leading dot”, just because thats what I grepped from the Slicer XML modules. This works in Slicer and GIMIAS, and with the changes above will also now work in Mitk Workbench and NiftyView.

I also did: http://sourceforge.net/u/mattclarkson/niftyreg/ci/master/tree/

branch b13-standardise-cli-fileExtensions

so NiftyReg works in Slicer, GIMIAS and Mitk Workbench. Florian also says MedInria does various post processing to correctly cope with [asterisk | no asterisk][dot | no dot]. So MedInria should also be ok, when that plugin goes live.

M

On 22 Jun 2014, at 17:14, Steve Pieper notifications@github.com<mailto:notifications@github.com> wrote:

I tend to agree with your original concept that whatever the first convention was should be kept unless there's a good reason to change it. So in this case, the Slicer version, with no asterisk should be the spec. Can the xslt be changed add the asterisk to the extensions so that the .ui file is correctly formed?

— Reply to this email directly or view it on GitHubhttps://github.com/commontk/CTK/issues/487#issuecomment-46785209.

pieper commented 10 years ago

Makes sense to me - then I guess your 487 is ready to merge?

On Sun, Jun 22, 2014 at 12:50 PM, Matt Clarkson notifications@github.com wrote:

Hi Steve,

I did this on branch 487-standardise-fileExtension the other day. Its not merged to master yet.

https://github.com/commontk/CTK/commit/24a27cd8a55a1e3daaeb6cc8f625697344c715b8

So the remaining questions are whether to make the xslt transform cleverer. i.e. generate asterisk if none exists. same question for the leading dot.

my knowledge of XSLT is pretty non-existent though.

My current understanding of the spec for fileExtensions is “no asterisks, and include leading dot”, just because thats what I grepped from the Slicer XML modules. This works in Slicer and GIMIAS, and with the changes above will also now work in Mitk Workbench and NiftyView.

I also did: http://sourceforge.net/u/mattclarkson/niftyreg/ci/master/tree/

branch b13-standardise-cli-fileExtensions

so NiftyReg works in Slicer, GIMIAS and Mitk Workbench. Florian also says MedInria does various post processing to correctly cope with [asterisk | no asterisk][dot | no dot]. So MedInria should also be ok, when that plugin goes live.

M

On 22 Jun 2014, at 17:14, Steve Pieper <notifications@github.com<mailto: notifications@github.com>> wrote:

I tend to agree with your original concept that whatever the first convention was should be kept unless there's a good reason to change it. So in this case, the Slicer version, with no asterisk should be the spec. Can the xslt be changed add the asterisk to the extensions so that the .ui file is correctly formed?

— Reply to this email directly or view it on GitHub< https://github.com/commontk/CTK/issues/487#issuecomment-46785209>.

— Reply to this email directly or view it on GitHub https://github.com/commontk/CTK/issues/487#issuecomment-46786100.

The information in this e-mail is intended only for the person to whom it is addressed. If you believe this e-mail was sent to you in error and the e-mail contains patient information, please contact the Partners Compliance HelpLine at http://www.partners.org/complianceline . If the e-mail was sent to you in error but does not contain patient information, please contact the sender and properly dispose of the e-mail.

jcfr commented 10 years ago

Converting a list of extension into name filters by adding an asterisk (if needed) makes sense.

Adding @finetjul to the thread, he did a lot of work with the qSlicerIO framework and also added some convenience function related to name filters. See https://github.com/commontk/CTK/blob/master/Libs/Core/ctkUtils.h#L50-76

Adding @saschazelzer, he worked on the XSLT infrastructure and may have some comments too.

MattClarkson commented 10 years ago

OK.

Im offline until Friday at least. Don’t merge branch 487 until Im back, as it needs more testing. See you soon.

M

On 22 Jun 2014, at 18:57, Jean-Christophe Fillion-Robin notifications@github.com<mailto:notifications@github.com> wrote:

Converting a list of extension into name filters by adding an asterisk (if needed) makes sense.

Adding @finetjulhttps://github.com/finetjul to the thread, he did a lot of work with the qSlicerIO framework and also added some convenience function related to name filters. See https://github.com/commontk/CTK/blob/master/Libs/Core/ctkUtils.h#L50-76

Adding @saschazelzerhttps://github.com/saschazelzer, he worked on the XSLT infrastructure and may have some comments too.

— Reply to this email directly or view it on GitHubhttps://github.com/commontk/CTK/issues/487#issuecomment-46787842.

MattClarkson commented 10 years ago

I updated branch 487-standardise-fileExtension, can someone review? I can only test on Mac at the moment. I updated CTK to cope with lists of *.ext, .ext and ext.

Thanks

saschazelzer commented 10 years ago

I can review the changes on Monday afternoon but I like what I have read so far. We can also make the schema more flexible by allowing an optional dot in the beginning (using regex restrictions) and require implementations to be able to cope with that. The XSD for generating .ui files can certainly be made more clever (I didn't check if you already succeeded with that).

MattClarkson commented 10 years ago

Well, I did this: https://github.com/commontk/CTK/commit/0fe540df9ef39da2a6beb93aa71a7846bbf69703

I also just ran it on Windows, Linux and Mac, and all seemed ok.

saschazelzer commented 10 years ago

This looks good +1 for merging.

We should also make the XSD file more expressive. Using xsd:token or xsd:nmtoken as the type for the fileExtensions attribute with a restriction pattern of \.?.+ or similar.

saschazelzer commented 10 years ago

Ups, the regex should probably read: ^[^*].+

As a note: Personally I favour using extensions without a leading dot since the dot does not carry any information and is redundant. The XSD restriction should allow it for compatibility though.

jcfr commented 10 years ago

Makes sense.

The dot character is consider a character of the filename. See https://en.wikipedia.org/wiki/Filename_extension

On Sat, Jun 28, 2014 at 10:06 AM, Sascha Zelzer notifications@github.com wrote:

Ups, the regex should probably read: ^[^*].+

As a note: Personally I favour using extensions without a leading dot since the dot does not carry any information and is redundant. The XSD restriction should allow it for compatibility though.

— Reply to this email directly or view it on GitHub https://github.com/commontk/CTK/issues/487#issuecomment-47428300.

+1 919 869 8849

MattClarkson commented 10 years ago

@saschazelzer : Please review. I just also added fileExtensionsType to .xsd.

So, the .xsd specifies a comma separated list of file extensions with optional (dot asterisk OR dot), and the .xslt to convert xml to .ui will add the necessary dot asterisk or dot such that the Qt file open dialog will have correct filters.

saschazelzer commented 10 years ago

Looks good, I would like to make some suggestions though:

Here is an example xsd snippet (note that my validator didn't like the line start and end delimiters in the value attribute:

<xsd:attribute name="fileExtensions">
  <xsd:simpleType>
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="(?:(?:\.[^,\s]|[^,\.\s\*])[^,\s]*)(?:,\s*(?:\.[^,\s]|[^,\.\s\*])[^,\s]*)*"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:attribute>

<xsd:complexType name="geometryType">
  <xsd:attribute ref="fileExtensions"/>
</xsd:complexType>
MattClarkson commented 10 years ago

All integrated. Although Id appreciate a breakdown of that regexp! M

On 30 Jun 2014, at 20:39, Sascha Zelzer notifications@github.com<mailto:notifications@github.com> wrote:

Looks good, I would like to make some suggestions though:

Here is an example xsd snippet (note that my validator didn't like the line start and end delimiters in the value attribute:

xsd:simpleType /xsd:restriction /xsd:simpleType /xsd:attribute /xsd:complexType — Reply to this email directly or view it on GitHubhttps://github.com/commontk/CTK/issues/487#issuecomment-47576609.
saschazelzer commented 10 years ago

The regex has the same structure as your initial one. The first non-capturing group (?: ) describes a valid token:

  1. The token can either start with a period character followed by another character which is not a comma or white-space \.[^,\s] or
  2. the token must start with a character which is not a comma, period, white-space, or asterisk [^,\.\s\*]
  3. the rest of the token is anything except a comma or white-space [^,\s]*

This leads to the regex for one token (?:(?:\.[^,\s]|[^,\.\s\*])[^,\s]*) which is repeated afterwards with an additional expression in front to check for the comma and trailing white-space as a token separator: ,\s*

jcfr commented 10 years ago

If that helps, here is a website that can be used when developing regex: http://rexv.org

On Tue, Jul 1, 2014 at 4:45 AM, Sascha Zelzer notifications@github.com wrote:

The regex has the same structure as your initial one. The first non-capturing groupe (?: ) describes a valid token:

  1. The token can either start with a period character followed by another character which is not a comma or white-space .[^,\s] or
  2. the token must start with a character which is not a comma, period, white-space, or asterisk [^,.\s*]
  3. the rest of the token is anything except a comma or white-space [^,\s]*

This leads to the regex for one token (?:(?:.[^,\s]|[^,.\s*])[^,\s]) which is repeated afterwards with an additional expression in front to check for the comma and trailing white-space as a token separator: ,\s

— Reply to this email directly or view it on GitHub https://github.com/commontk/CTK/issues/487#issuecomment-47631543.

+1 919 869 8849

saschazelzer commented 10 years ago

Nice. I was using http://regex101.com

Branch 487-standardise-fileExtension reviewed and merged.