gs1 / gs1-syntax-dictionary

GS1 Barcode Syntax Dictionary and Syntax Tests
Apache License 2.0

Invalid pairings / excludes for AI ranges between 3100-3695 #5

Open mgh128 opened 1 year ago

mgh128 commented 1 year ago

Figure 4.13.1-1 of the GS1 General Specifications defines the current set of invalid pairings of GS1 Application Identifiers.

Within that table in the current version of the GS1 Gen Specs, I don't see any evidence of invalid pairings for any GS1 Application Identifiers in the range 3100-3695 in the first or third columns. So unless there is a ratified General Specification Change Notice (GSCN) at https://www.gs1.org/standards/genspecs/gscn_archive that updates Figure 4.13.1-1, we should take great care not to suggest invalid pairings within the Barcode Syntax Resource dictionary that are stated explicitly neither in the GS1 General Specifications nor in a ratified GSCN that will be implemented in the next publication of the GS1 Gen Specs.

This issue concerns lines 129-181 of https://github.com/gs1/gs1-syntax-dictionary/blob/2023-07-04/gs1-syntax-dictionary.txt where we currently see entries such as 3100-3105 * N6 req=01,02 ex=310n.
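To make the shape of such an entry concrete, here is a minimal Python sketch that splits the example line into its AI range and its key=value attributes. The function name `parse_entry` and the returned dictionary layout are hypothetical and chosen only for illustration; the sketch handles just the tokens visible in the example above and is not the parser used by the reference implementation.

```python
def parse_entry(line):
    """Split one dictionary-style entry into AIs, attributes and remaining tokens."""
    tokens = line.split()
    ai_range, rest = tokens[0], tokens[1:]

    # "3100-3105" expands to every AI in the range; a single AI is a range of one.
    first, _, last = ai_range.partition("-")
    last = last or first
    ais = [str(n).zfill(len(first)) for n in range(int(first), int(last) + 1)]

    attrs = {}   # e.g. {"req": ["01", "02"], "ex": ["310n"]}
    other = []   # tokens without "=", kept as-is (here "*" and "N6")
    for tok in rest:
        if "=" in tok:
            key, _, value = tok.partition("=")
            attrs.setdefault(key, []).extend(value.split(","))
        else:
            other.append(tok)
    return {"ais": ais, "other": other, "attrs": attrs}


entry = parse_entry("3100-3105 * N6 req=01,02 ex=310n")
print(entry["ais"])    # the six AIs 3100 to 3105
print(entry["attrs"])  # the req and ex attributes
```

Read this way, the single ex=310n attribute is attached to each of the six AIs in the range, which is what prompts the question below about whether an AI can exclude itself.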

Even the note (3) for Gen Specs Fig 3.2-1 for GS1 Application Identifiers in this range does not forbid the use of more than one AI within each range. The note is only concerned with pointing out that the fourth digit indicates the number of decimal places.

While I accept that it would be advisable not to use AIs within the same range together, e.g. don't use (3103) and (3102) together, such a rule is not currently stated in the Gen Specs table of invalid pairings. It is also potentially problematic to formulate this as 3100-3105 * N6 req=01,02 ex=310n: if we pick one AI within that range, e.g. (3103), the notation 310n is usually considered to apply to all AIs in that range, e.g. (3100)-(3105), without excluding the individual AI that we happen to select.

Would ex=310n then be considered to apply also to the single AI within that range that we select, i.e. is it mutually exclusive with itself? This could cause confusion, and we certainly don't want any implementation of the Barcode Syntax Resource to reject a single occurrence of an AI within the ranges 3100-3695, where it is used appropriately and in accordance with the GS1 General Specifications, simply because of a misinterpretation of statements such as ex=310n.

PetaDing-GS1 commented 1 year ago

Hi @mgh128 - thanks for flagging this. You're quite right that the invalid pairings for AIs 3100-3695, as specified by lines 129-181 of the syntax dictionary, are not currently defined in the Gen Specs. I think it makes sense that AIs from the same range, e.g. 3103 and 3102, should not be used together. I believe the Syntax Dictionary does the right thing, but agree it should be aligned with the GenSpecs, in which case this highlights a gap in the invalid pairings table (TBC). I will check this with the AIDC and Fresh Foods folk to see if this is a known issue, or if there is some other logical reason for this omission from the Invalid Pairings table. @terryburton would you have any comments/thoughts on this, whilst I look into things on this side?

In terms of the notation ex=310n (as one example), it does not apply to a single occurrence of an AI in the range, only to combinations of AIs. Duplicate AIs are allowed but the values must be the same: if the values are different, e.g. (3101)000050(3101)000060, this is invalid. Additionally, if one AI from the range is present, e.g. 3101, the use of any other AI from the range, i.e. 3102-3105, is forbidden/invalid.
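As a rough illustration of these three rules, the following Python sketch checks one AI against an ex pattern list. The helper names `matches` and `check_ex`, the list-of-pairs input and the wording of the messages are all hypothetical; this is a sketch of the behaviour described above, not the Barcode Syntax Engine's code.

```python
def matches(pattern, ai):
    """True if `ai` fits `pattern`, where 'n' in the pattern stands for any digit."""
    return len(pattern) == len(ai) and all(
        p == a or (p == "n" and a.isdigit()) for p, a in zip(pattern, ai)
    )


def check_ex(elements, ai, ex_patterns):
    """Return the problems found for `ai` among the (ai, value) pairs in `elements`."""
    problems = []

    # Rule 1: a repeated AI is acceptable only if every occurrence has the same value.
    values = {value for a, value in elements if a == ai}
    if len(values) > 1:
        problems.append(f"({ai}) repeated with different values")

    # Rule 2: any *other* AI matching an ex pattern is an invalid pairing.
    for other in sorted({a for a, _ in elements}):
        if other == ai:
            continue  # Rule 3: an AI is never mutually exclusive with itself
        if any(matches(p, other) for p in ex_patterns):
            problems.append(f"({ai}) must not be used with ({other})")
    return problems


print(check_ex([("3101", "000050")], "3101", ["310n"]))
print(check_ex([("3101", "000050"), ("3101", "000060")], "3101", ["310n"]))
print(check_ex([("3101", "000050"), ("3102", "000500")], "3101", ["310n"]))
```

The first call reports nothing, because a lone (3101) matches 310n only as itself; the second and third calls each report one problem.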

Please let me know if I've misunderstood anything at all.

mgh128 commented 1 year ago

Hi @PetaDing-GS1, thanks for looking into this. That all makes sense and I agree with the intention, so thanks for checking whether the GenSpecs table should be updated to include this. I hope that the documentation for the BSR makes clear how excludes (ex) works when its value specifies a range such as 310n. If it does not already include the clarifications you just provided, it might be worthwhile to update the documentation to include them. Thanks again! I have been updating the JSON file for AAMS to add the most recent AIs, so I'll also check those for any misalignment with the BSR. I'm making further progress on the AI browser and hope to merge in the multilingual translations today or tomorrow.

terryburton commented 1 year ago

> @terryburton would you have any comments/thoughts on this, whilst I look into things on this side?

The invalid pairings table in the GenSpecs makes no attempt to be thorough, and is introduced as attempting to catch bad data that is frequently found in the field.

Historically the GS1 Syntax Dictionary has not restricted itself to aligning with the tables in the GenSpecs and therefore includes self-evident restrictions between associated AIs, e.g. those that conflict because they represent the same data and differ only by order of magnitude.

I see a significant number of instances of this exact form of bad data having been caught by a system that uses the GS1 Syntax Dictionary to reject bad data. So to avoid moving backwards I think that it would be better to update the GenSpecs to state such restrictions explicitly and also to improve the exclusive pairings table (on the basis that this is common bad data), except where we find valid use cases for any pairings under consideration.

> I hope that the documentation for the BSR makes clear how excludes (ex) works when its value specifies a range such as 310n. If it does not already include the clarifications you just provided, it might be worthwhile to update the documentation to include these.

The workings are defined by the Barcode Syntax Engine reference implementation and described by the terse comments at the top of the Syntax Dictionary file itself. (The BSR and its artifacts are tools that are a means to an end, not a formal specification.)

For the sake of brevity within the Syntax Dictionary header we did not explain that an AI cannot be mutually exclusive of itself! However, I have now added a note to this effect in case implementors become confused. https://github.com/gs1/gs1-syntax-dictionary/commit/0344bf488ec9c7b9d271f687a762a85c2dd7c04e

That being said, applying the rules expressed by the Syntax Dictionary is just one part of the system of checks that any robust data validation implementation must apply to GS1 AI data. Other parts include sanity checks that can be applied in the absence of the Syntax Dictionary (or when the Syntax Dictionary is out of date), such as an AI having the same length as other AIs with the same two-digit prefix, repeated AIs having the same value, etc.
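As a small illustration of the first of those sanity checks, the Python sketch below groups a list of AIs by their first two digits and reports any prefix whose AIs differ in length. It assumes that "length" refers to the number of digits in the AI itself, which is an interpretation of the sentence above, and the function name `ai_length_conflicts` is hypothetical.

```python
def ai_length_conflicts(known_ais):
    """Map each two-digit prefix to its AI lengths, keeping only inconsistent prefixes."""
    lengths = {}
    for ai in known_ais:
        lengths.setdefault(ai[:2], set()).add(len(ai))
    # A prefix is consistent when all of its AIs have one and the same length.
    return {prefix: sorted(ls) for prefix, ls in lengths.items() if len(ls) > 1}


print(ai_length_conflicts(["01", "10", "3100", "3101", "3102"]))  # no conflicts
print(ai_length_conflicts(["3100", "310"]))  # prefix 31 has AIs of length 3 and 4
```

A check of this kind needs no dictionary entry for the AI being validated, only a list of AIs that share its prefix.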

So there is always cause for implementors to review the reference code when working with the Syntax Dictionary for data validation purposes.

PetaDing-GS1 commented 1 year ago

Thanks @terryburton for your comments and for adding the expanded explanation of "ex" to the Syntax Dictionary header. I agree that the BSR should not be limited for the sake of alignment with the AI association tables in the GenSpecs. It is certainly helpful in highlighting gaps and room for improvement! In terms of next steps for this issue, whilst we may be able to update the invalid pairings table (subject to GSMP of course), I will also look to update the community user guide to improve our explanation of the relation between the BSR and GS1 standards.