gs1 / EPCIS

Draft files being shared for EPCIS 2.0 development

RegEx to help in determining 'highly constrained GS1 DL URIs'? #147

Open RalphTro opened 3 years ago

RalphTro commented 3 years ago

Hi @CraigRe, in sections 8.2.4.2 and 8.3.3.2, we specify that IF users apply GS1 Digital Link URIs in EPCIS, they SHALL be highly constrained, corresponding to each of the EPC schemes as defined in the EPC TDS.

I am wondering if we want to help the EPCIS community by providing a regular expression that can actually ascertain whether this is true or not (similar to what we did in the DL URI Syntax spec to recognise a GS1 DL Web URI in the first place).

Before we put too much effort into this, I have drafted the following RegEx for discussion purposes (note that it only applies to GTIN/lGTIN/sGTIN and only to GS1's canonical version of GS1 DL URIs at this point in time):

^https:\/\/id\.gs1\.org\/((01\/\d{14}$)|(01\/\d{14}\/10\/([\x25\x28\x29\x2d-\x2E\x30-\x39\x41-\x5A\x5F\x61-\x7A]{0,20})$)|(01\/\d{14}\/21\/([\x25\x28\x29\x2d-\x2E\x30-\x39\x41-\x5A\x5F\x61-\x7A]{0,20})$))

It matches, e.g.:
- https://id.gs1.org/01/04150567890128/21/987654
- https://id.gs1.org/01/04012345123456/10/Lot987
- https://id.gs1.org/01/04012345123456
- https://id.gs1.org/01/04150567890128/21/abc_-110%2F5

But it does NOT match:
- https://example.com/01/04150567890128/21/987654 (another domain)
- https://id.gs1.org/01/4012345123456 (GTIN-13)
- https://id.gs1.org/01/061414155557 (GTIN-12)
- https://id.gs1.org/01/12345670 (GTIN-8)
- http://id.gs1.org/01/04150567890128/21/987654 (http instead of https)
- https://id.gs1.org/01/04012345123456 (space character at the beginning)
- https://id.gs1.org/01/04012345123456 (space character at the end)
- https://example.com/01/04150567890128/10/LOT987/21SER123 (GTIN, Lot, AND Serial)
- https://id.gs1.org/gtin/04150567890128/21/987654 (short name instead of AI equivalent)
- https://id.gs1.org/01/09780345418913/22/AB (CPV)
- https://id.gs1.org/01/04150567890128/21/987654?11=221110 (with AI in query string)
- https://id.gs1.org/01/09780345418913?abc=123 (with custom extension in query string)
- https://id.gs1.org/01/04012345123456?17=210122&abc=123 (ditto)
- https://id.gs1.org/01/04150567890128/21/987654?abc=123 (ditto)
- https://id.gs1.org/01/04012345123456/10/L7?17=210122&abc=12 (ditto)
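For anyone who wants to try the draft against these examples, here is a minimal Python sketch. It is only an illustration for discussion, not an agreed artefact: the names `VALUE`, `CONSTRAINED_GTIN_DL` and `is_constrained_gtin_dl` are made up here, the three alternatives of the draft are folded into one optional group, the dots in the hostname are escaped, and `fullmatch` is used instead of `^...$` so that a trailing newline is also rejected.

```python
import re

# Character class for lot and serial values, copied from the draft:
# % ( ) - . 0-9 A-Z _ a-z, with the draft's length bound of 0 to 20.
VALUE = r"[\x25\x28\x29\x2d-\x2E\x30-\x39\x41-\x5A\x5F\x61-\x7A]{0,20}"

# GTIN only, GTIN + lot (10), or GTIN + serial (21), canonical domain only.
# [0-9] is used instead of \d so that only ASCII digits are accepted.
CONSTRAINED_GTIN_DL = re.compile(
    r"https://id\.gs1\.org/01/[0-9]{14}(?:/(?:10|21)/" + VALUE + r")?"
)


def is_constrained_gtin_dl(uri: str) -> bool:
    """Return True if the whole string matches the draft pattern."""
    return CONSTRAINED_GTIN_DL.fullmatch(uri) is not None


if __name__ == "__main__":
    samples = [
        "https://id.gs1.org/01/04150567890128/21/987654",
        "https://id.gs1.org/01/04012345123456/10/Lot987",
        "https://id.gs1.org/01/04012345123456",
        "https://example.com/01/04150567890128/21/987654",
        "https://id.gs1.org/01/04150567890128/21/987654?11=221110",
    ]
    for uri in samples:
        print(is_constrained_gtin_dl(uri), uri)
```

Note that the sketch, like the draft, does not verify the GTIN check digit and still accepts an empty lot or serial value because of the `{0,20}` bound.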

Kind regards, @RalphTro

RalphTro commented 3 years ago

For some reason, e.g. "^https:\/\/i" in the RegEx is displayed as "^https://i", even here in the comment :-) (It actually IS escaped => \ / \ / )

mgh128 commented 3 years ago

I expect that we'll have some further discussion on this on today's call - about how much flexibility to allow and how indexing can keep us 'pure' to the original idea of EPC being an instance identifier, e.g. (01)+(21), even in today's messier world of GS1 Digital Link URIs.

CraigRe commented 3 years ago

Having agreed on option "Y" (non-canonical DL URIs, but constrained to EPC correspondence), do we still need to discuss RegEx support?

RalphTro commented 3 years ago

IMO yes, but this could also become part of the Implementation Guide. If the group agrees that this would be useful, I can prepare something as a basis for discussion...

CraigRe commented 3 years ago

My view is that it would belong in the standard (or as an associated artefact), rather than as part of the Imp Guide.

mgh128 commented 3 years ago

I think the regular expressions for Option Y should appear in CBV 2.0. I can prepare these.

RalphTro commented 3 years ago

Great - I can help with this, @mgh128!

CraigRe commented 3 years ago

And I will support this as well, @mgh128 and @RalphTro.

CraigRe commented 3 years ago

To be embedded into section 8 of CBV, with a link to the JavaScript validation tool.

CraigRe commented 3 years ago

Will add side-by-side comparison of URN and GS1 DL URI examples.
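As a rough illustration of what such a side-by-side correspondence looks like for the SGTIN case, here is a hedged Python sketch that derives the EPC pure identity URN from a canonical GS1 DL URI. The function name `dl_to_sgtin_urn` and the pattern `SGTIN_DL` are hypothetical, the GS1 Company Prefix length has to be supplied by the caller because it cannot be read from the GTIN itself, and the serial number is restricted to plain alphanumerics so that the differing escaping rules of the two syntaxes do not come into play.

```python
import re

# Canonical GTIN + serial DL URI; the serial is limited to alphanumerics here
# (a simplification, so no percent-encoding needs to be translated).
SGTIN_DL = re.compile(r"https://id\.gs1\.org/01/([0-9]{14})/21/([0-9A-Za-z]{1,20})")


def dl_to_sgtin_urn(dl_uri: str, gcp_length: int) -> str:
    """Build urn:epc:id:sgtin:<prefix>.<indicator+item ref>.<serial>."""
    match = SGTIN_DL.fullmatch(dl_uri)
    if match is None:
        raise ValueError("not a canonical GTIN + serial GS1 DL URI")
    if not 6 <= gcp_length <= 12:
        raise ValueError("GS1 Company Prefix length must be between 6 and 12")
    gtin, serial = match.groups()
    indicator = gtin[0]                        # first digit of the GTIN-14
    company_prefix = gtin[1:1 + gcp_length]    # caller-supplied prefix length
    item_ref = gtin[1 + gcp_length:13]         # the check digit gtin[13] is dropped
    return f"urn:epc:id:sgtin:{company_prefix}.{indicator}{item_ref}.{serial}"


if __name__ == "__main__":
    # Assumption for illustration only: a 7-digit GS1 Company Prefix.
    print(dl_to_sgtin_urn("https://id.gs1.org/01/04150567890128/21/987654", 7))
```

Under the assumed 7-digit prefix, the first matching example from the opening comment corresponds to urn:epc:id:sgtin:4150567.089012.987654.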

RalphTro commented 3 years ago

@CraigRe: I am wondering whether the current CBV draft already contains the developed RegEx. If so, I think we can close this issue after a brief discussion/confirmation in e.g. today's/next week's call.

CraigRe commented 3 years ago

https://github.com/gs1/EPCIS/issues/241#issuecomment-814199376

mgh128 commented 2 years ago

Absent from public review drafts but now noted in a public review comment as something that should be fully developed and included within section 8.1.2 of CBV, together with an explanatory table of GS1 Digital Link URI templates that are suitable as an alternative for the corresponding EPC pure identity URNs.