digital-preservation / PRONOM_Research


New format: GraphPad Prism 5-9 #68

Closed karenhanson closed 2 weeks ago

karenhanson commented 1 month ago

This is my first attempt at submitting signature research, so I welcome any feedback that will help make things easier. Also want to check I approached version number correctly (see explanation below).

Format name GraphPad Prism

Version number 5-9

Note: The versions of the other GraphPad Prism formats in PRONOM align with the software version (1-3 and 4). I chose to follow that pattern here for consistency, but I'm conflicted. Version 5 of Prism switched from binary formats to XML, and the schema is labeled version "5.0"; it is compatible with GraphPad Prism versions 5 and later. Although new XML schemas were released alongside 6.0, 7.0, and 8.0, I have yet to find a single example of these other XML versions in use. I'm guessing some kind of default compatibility mode sets it to 5.0. If these XML versions do show up in the wild, they may only be compatible with the corresponding GraphPad Prism version. All this is to say: please let me know if I should be using the XML schema version (5.0) and/or adding records for those other XML schema versions. I can do that by including a version number in the signature, which I currently don't. For example, <GraphPadPrismFile ... PrismXMLVersion="5.00"> could be recognized by adding {0-64}507269736D584D4C56657273696F6E3D(22|27)352E3030(22|27) to the end of my current signature.
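To make the optional version check concrete, here is a rough Python sketch of what that extra byte sequence would match. The hex decodes to PrismXMLVersion= followed by a quoted 5.00. The function name prism_xml_version and the sample header in the usage note are hypothetical, and the sketch is deliberately looser than the proposed signature: it captures whatever version string is present instead of matching only 5.00, which could help when hunting for the other schema versions in sample files.

```python
import re

# Assumption: mirrors the proposed extension, i.e. up to 64 arbitrary bytes
# between the root tag name and the PrismXMLVersion attribute, with either
# quote style. Unlike the signature, it captures any dotted version number.
VERSION_RE = re.compile(
    rb"<GraphPadPrismFile.{0,64}?PrismXMLVersion=[\"']([0-9.]+)[\"']",
    re.DOTALL,
)


def prism_xml_version(data: bytes):
    """Return the PrismXMLVersion value near the start of the file, or None."""
    match = VERSION_RE.search(data[:512])  # only inspect the file header
    return match.group(1).decode("ascii") if match else None
```

For instance, with a made-up header such as b'<?xml version="1.0"?>\n<GraphPadPrismFile xmlns="urn:example" PrismXMLVersion="5.00">', the function returns "5.00"; a header without the attribute gives None.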

Extensions pzfx

MIME/Media Type application/x-graphpad-prism-pzfx

Description GraphPad Prism is a program that combines scientific graphing and statistics for data analysis. The pzfx format can be opened only by GraphPad Prism 5 or later. The first part of the file contains all the data tables and info sheets in a plain-text XML format that can be viewed by other programs. After that there is information about results, graphs and layouts embedded in the XML as a binary format specific to the Prism application.

Format type Text (Mark-up)

Vendor GraphPad Software (https://www.graphpad.com/)

File format identification signatures

Position Type: Absolute from BOF
Offset: 0
Maximum Offset: 0
Description: Starts with the standard XML header, <?xml version="1.0", followed by either ?> or an encoding attribute. After the header comes the Prism tag <GraphPadPrismFile.
Value: 3C3F786D6C2076657273696F6E3D(22|27)312E30(22|27){2-32}3C4772617068506164507269736D46696C65

This format should have priority over XML 1.0 (fmt/101)
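For anyone who wants to try the signature outside DROID, the following is a minimal Python sketch of the same byte pattern written as a regular expression. It is an approximation for quick testing, not how DROID evaluates signatures, and the function name looks_like_pzfx is hypothetical.

```python
import re

# Hand translation of the signature value above:
#   3C3F786D6C2076657273696F6E3D   ->  <?xml version=
#   (22|27)312E30(22|27)           ->  "1.0" or '1.0'
#   {2-32}                         ->  any 2 to 32 bytes
#   3C4772617068...46696C65        ->  <GraphPadPrismFile
PZFX_SIG = re.compile(
    rb"<\?xml version=[\"']1\.0[\"'].{2,32}?<GraphPadPrismFile",
    re.DOTALL,
)


def looks_like_pzfx(data: bytes) -> bool:
    """True if the bytes match the signature anchored at offset 0 (BOF)."""
    # re.match only matches at the start, mirroring Offset 0 / Maximum Offset 0.
    return PZFX_SIG.match(data) is not None
```

Any file this accepts is also well-formed enough to match a generic XML 1.0 header check, which is why the priority over fmt/101 matters.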

Relevant links, documentation, extra information

Mimetype: https://www.graphpad.com/support/faq/mime-types-for-prism-files/
Signature: https://www.graphpad.com/support/faq/how-can-a-prism-file-be-identified/
General info: https://www.graphpad.com/guides/prism/latest/user-guide/pzf_vs__pzfx_file_format.htm
Schema: https://www.graphpad.com/support/faq/prism-xml-style-sheet-and-schema-that-define-the-pzfx-format/
Lots of examples in Figshare: https://figshare.com/search?q=graphpad%20pzfx&itemTypes=3

Credit Portico

karenhanson commented 1 month ago

We discussed this in the PRONOM drop-in on July 18, 2024. @thorsted had done some research/testing and confirmed that versions 5-9 of GraphPad Prism all appear to use version 5.0 XML when you save the files. We therefore agreed to leave the version number as 5-9, corresponding to software version compatibility, instead of making separate entries for the seemingly unused XML versions. Version 10 of GraphPad Prism switches to a new format (.prism) that should also be added to PRONOM in the future.

thorsted commented 1 month ago

Thanks Karen. Yes, the stylesheets for versions 5-8 from here indicate that all use the 5.0 version in the header. The new .PRISM format for version 10 is a ZIP container with a "document.json" at its root.
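As a starting point for that future .prism entry, a quick check along these lines could be used to triage sample files. It relies only on the two facts mentioned above (a ZIP container with document.json at its root); the function name looks_like_prism_container is hypothetical, and a real PRONOM container signature would need more research than this sketch reflects.

```python
import io
import zipfile


def looks_like_prism_container(source) -> bool:
    """Heuristic: a ZIP file with a document.json entry at the archive root.

    `source` may be a file path or a seekable binary file object.
    """
    if not zipfile.is_zipfile(source):
        return False
    with zipfile.ZipFile(source) as archive:
        # Root-level entries have no directory prefix in their names.
        return "document.json" in archive.namelist()
```

Calling looks_like_prism_container("example.prism") on a local file (the path here is only a placeholder) returns True or False without extracting anything.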