This is a reference implementation of a method to uniquely identify an EPCIS event, as specified in the Core Business Vocabulary (CBV) Standard 2.0. The EPCIS Event Hash ID is independent of syntax and representation and is based on hashing. This PROTOTYPICAL DEMO SOFTWARE takes an EPCIS Document (formatted in either XML or JSON-LD) and returns the hash value of each contained EPCIS event, which serves as a unique fingerprint of that event.
Working as expected, no known major bugs.
The implementation provided here is a prototypical reference implementation meant for testing against other implementations, but not meant for production. If you discover that this implementation does not conform perfectly to the algorithm description or contains any other bugs, please file an issue at https://github.com/RalphTro/epcis-event-hash-generator/issues .
The Hashing Algorithm described below is implemented as a Python script, including a command line utility which can be run directly.
The package is released on PyPI at https://pypi.org/project/epcis-event-hash-generator/ and can be installed via
python3 -m pip install epcis_event_hash_generator
For usage information run
python3 -m epcis_event_hash_generator -h
There are situations in which organisations need to refer unambiguously to a specific EPCIS event. For instance, companies may want to store only the hash value of a given EPCIS event on a distributed shared ledger ('blockchain') instead of any actual payload. Digitally signed and in conjunction with a unique timestamp, this is a powerful and effective way to prove the integrity of the underlying event data. Another use case is to populate the eventID field with values that are intrinsic to the EPCIS event: if an organisation captures an event without an eventID field (which the standard does not require) and sends that event to a business partner who needs to assign a unique ID, they can agree that the business partner populates the eventID field by applying this methodology before storing the event on the server. If the organisation later wants to query for that specific event, it knows how the eventID was created and can therefore query for it through the eventID value.
EPCIS events have a couple of differences to other electronic documents:
This is why industry needs a consistent, reliable approach for creating a hash value that uniquely identifies a specific EPCIS event.
Note that the algorithm described here provides a way of hashing an event. A signature scheme can be built using this hash, but the hash by itself does not yield a proof of authenticity/authorship. For example, a man-in-the-middle attacker can re-compute the hash after tampering with the event.
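To illustrate the distinction, the following is a minimal sketch in which a keyed MAC (HMAC-SHA-256 from the Python standard library) stands in for a signature scheme. The key, the placeholder hash strings and the function names are assumptions made for this illustration only; they are not part of this package or of the CBV standard, and a real deployment would more likely use asymmetric digital signatures.

```python
import hashlib
import hmac


def authenticate(event_hash: str, secret_key: bytes) -> str:
    """Return an HMAC-SHA-256 tag over an event hash (illustrative only)."""
    return hmac.new(secret_key, event_hash.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(event_hash: str, tag: str, secret_key: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(authenticate(event_hash, secret_key), tag)


# Hypothetical values: a shared key and two placeholder event hashes.
secret_key = b"shared-secret-for-illustration"
original_hash = "hash-of-original-event"
tampered_hash = "hash-of-tampered-event"

tag = authenticate(original_hash, secret_key)
print(verify(original_hash, tag, secret_key))  # True
print(verify(tampered_hash, tag, secret_key))  # False
```

Anyone can recompute a plain hash after altering an event, but without the key the matching tag cannot be recomputed, which is the property the hash alone does not provide.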
For any algorithm that is to be considered a faithful hash of an EPCIS event, we require the following properties:
For hashing strings, well-established algorithms such as SHA-256 are available. The focus of this specification is the canonicalization of a pre-hash string representation of an EPCIS event, which can be passed to any standard hashing algorithm.
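As a minimal sketch of that last step, a finished pre-hash string can be passed to a standard hash function from Python's hashlib. The function name, the placeholder input and the choice of UTF-8 as the byte encoding are assumptions for illustration; this is not the package's own API.

```python
import hashlib


def hash_pre_hash_string(pre_hash_string: str, algorithm: str = "sha256") -> str:
    """Hash a pre-hash string and return the digest as a hex string."""
    digest = hashlib.new(algorithm)
    # Assumption: the pre-hash string is encoded as UTF-8 before hashing.
    digest.update(pre_hash_string.encode("utf-8"))
    return digest.hexdigest()


# Placeholder input, not a real canonical pre-hash string.
print(hash_pre_hash_string("placeholder-pre-hash-string"))
```

Because the canonicalization is independent of the hash function, swapping the algorithm (for example `algorithm="sha512"`) does not affect how the pre-hash string is built.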
To calculate this pre-hash string, the EPCIS event key-value pairs must be extracted and concatenated into one string, exactly according to the following set of rules:
- […] xsd:dateTimeStamp permits an unlimited number of decimal places to be expressed. If more than 3 decimal places are expressed, the 3rd decimal place SHALL be rounded up if the 4th decimal place is a digit in the range 5-9. For example, an xsd:dateTimeStamp value of 2023-01-18T11:04:03.1415Z would appear in the pre-hash string as 2023-01-18T11:04:03.142Z.
- […] epc in epcList, bizTransaction in bizTransactionList, etc.) SHALL be sequenced according to their case-sensitive lexical ordering, considering UTF-8/ASCII code values of each successive character. A field name denoting a list (e.g. epcList, bizTransactionList, sensorElementList) SHALL only appear once in the pre-hash string.
- […] quantityElement in quantityList, sensorReport in sensorElement), the latter SHALL be concatenated to a string (similar to the procedure specified above) and, if they belong to the same level, sequenced according to their case-sensitive lexical ordering, considering UTF-8/ASCII code values of each successive character.
- […] bizTransaction or Source/Destination Type in source), the type key-value pair (where the key is 'type' and the value the respective type attribute) SHALL follow the actual key-value before the alphabetical ordering takes place.
- […] 00 / 01 / 01 21 / 01 10 / 01 235 / 253 / 255 / 401 / 402 / 414 / 414 254 / 417 / 8003 / 8004 / 8006 / 8006 21 / 8006 10 / 8010 / 8010 8011 / 8017 / 8018
- […] ILMD elements, the latter SHALL comprise their key names (full namespace embraced by curly brackets ('{' and '}') and the respective local name), as well as, if present, the contained value, prefixed by an equal sign ('='). The resulting substrings SHALL be sorted according to their case-sensitive lexical ordering, considering UTF-8/ASCII code values of each successive character when they are appended to the pre-hash string.
- […] readPoint, bizLocation, sensorElement, sensorMetadata, and sensorReport), they SHALL be added at the end of its enclosing parent’s regular fields. Apart from that, they SHALL be added to the pre-hash string similarly as specified in the previous step.

Applicable for all EPCIS Event Types, i.e. ObjectEvent, AggregationEvent, TransactionEvent, TransformationEvent and AssociationEvent.
Sequence | Data Element |
---|---|
1 | eventType |
2 | eventTime |
3 | eventTimeZoneOffset |
4 | epcList – epc |
5 | parentID |
6 | inputEPCList – epc |
7 | childEPCs – epc |
8 | quantityList – quantityElement (epcClass, quantity, uom) |
9 | childQuantityList – quantityElement (epcClass, quantity, uom) |
10 | inputQuantityList – quantityElement (epcClass, quantity, uom) |
11 | outputEPCList – epc |
12 | outputQuantityList – quantityElement (epcClass, quantity, uom) |
13 | action |
14 | transformationID |
15 | bizStep |
16 | disposition |
17 | persistentDisposition – (set, unset) |
18 | readPoint – id |
19 | bizLocation – id |
20 | bizTransactionList – bizTransaction (business transaction identifier, business transaction type) |
21 | sourceList – source (source ID, source type) |
22 | destinationList – destination (destination ID, destination type) |
23 | sensorElementList – sensorElement (sensorMetadata (time, startTime, endTime, deviceID, deviceMetadata, rawData, dataProcessingMethod, bizRules), sensorReport (type, exception, deviceID, deviceMetadata, rawData, dataProcessingMethod, time, microorganism, chemicalSubstance, value, component, stringValue, booleanValue, hexBinaryValue, uriValue, minValue, maxValue, meanValue, sDev, percRank, percValue, uom, coordinateReferenceSystem)) |
24 | ilmd – {ILMD elements} |
25 | {User extension elements} |
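Three of the rules stated before the table (timestamp rounding, case-sensitive lexical ordering, and the ILMD key format) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the package's implementation: the function names and the `myField` element are hypothetical, only timestamps ending in `Z` are handled, and padding shorter fractions to three digits is an assumption.

```python
import re
from datetime import datetime, timedelta

# Assumption: the timestamp is already expressed in UTC with a trailing "Z".
_UTC_TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z$")


def round_to_millis(timestamp: str) -> str:
    """Round the fractional seconds to 3 places, looking only at the 4th digit."""
    match = _UTC_TIMESTAMP.match(timestamp)
    if match is None:
        raise ValueError(f"unsupported timestamp: {timestamp!r}")
    base = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    digits = match.group(2) or ""
    # Assumption: fractions shorter than 3 digits are padded with zeros.
    millis = int(digits[:3].ljust(3, "0"))
    if len(digits) > 3 and digits[3] in "56789":
        millis += 1
    if millis == 1000:  # rounding carried over into the next second
        base += timedelta(seconds=1)
        millis = 0
    return f"{base:%Y-%m-%dT%H:%M:%S}.{millis:03d}Z"


def sort_lexically(values):
    """Case-sensitive ordering by code point (Python's default for str)."""
    return sorted(values)


def ilmd_substring(namespace, local_name, value=None):
    """Build '{namespace}localName', followed by '=value' if a value is present."""
    key = "{" + namespace + "}" + local_name
    return key if value is None else f"{key}={value}"


print(round_to_millis("2023-01-18T11:04:03.1415Z"))  # 2023-01-18T11:04:03.142Z
print(sort_lexically(["urn:b", "URN:a", "urn:a"]))   # ['URN:a', 'urn:a', 'urn:b']
print(ilmd_substring("https://ns.example.com/epcis", "myField", "abc"))
```

Python compares strings by Unicode code point, which gives the same order as comparing their UTF-8 bytes, so the default `sorted` matches the ordering rule without a custom key.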
For better understanding, the following illustrations include the data content of EPCIS events (including a couple of user extensions - all defined under 'https://ns.example.com/epcis'), show the corresponding pre-hash string as well as the canonical hash value of that event.
Example 1:
Run epcis_event_hash_generator/main.py tests/examples/ReferenceEventHashAlgorithm.xml -pj "\n" to get a similar output of the pre-hash string and epcis_event_hash_generator/main.py tests/examples/ReferenceEventHashAlgorithm.xml to verify the hash.
Example 2:
Run epcis_event_hash_generator/main.py tests/examples/ReferenceEventHashAlgorithm2.xml -pj "\n" to get a similar output of the pre-hash string and epcis_event_hash_generator/main.py tests/examples/ReferenceEventHashAlgorithm2.xml to verify the hash.
Example 3:
The line breaks in the pre-hash string are shown for readability only. The actual pre-hash string does not contain any whitespace (unless specifically used in a value), so the lines displayed in the above picture have to be concatenated without any separator to obtain the actual pre-hash string.
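A minimal sketch of that concatenation step, assuming the displayed form uses "\n" as the only separator (as in the commands above). The function name and the sample lines are hypothetical placeholders, not real pre-hash output.

```python
def join_displayed_lines(displayed: str) -> str:
    """Remove only the display line breaks; keep every other character."""
    return displayed.replace("\n", "")


# Placeholder lines; the space inside the first value is part of the value.
displayed = "fieldA=value one\nfieldB=2\n"
print(join_displayed_lines(displayed))  # fieldA=value onefieldB=2
```

Only the line breaks are removed; stripping all whitespace would corrupt values that legitimately contain spaces.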
This algorithm has various potential areas of application:
That said, the algorithm has limited applicability when EPCIS events are redacted (meaning that, e.g. for privacy reasons, EPCIS events are not shared in their entirety, but deliberately omit specific fields or include readPoint IDs with a coarser granularity; see EPCIS and CBV Implementation Guide, section 6.7). In such a case, the content of a redacted EPCIS event will never yield the hash value of the original one.
The following table lists, in alphabetical order of their GitHub profile name, all persons who have contributed to this project so far through:
All of these contributions are valuable and much appreciated, and we would like to take the opportunity to express our gratitude for this support.
Copyright 2020-2023 | Ralph Tröger ralph.troeger@gs1.de and Sebastian Schmittner schmittner@eecc.info
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.