bwipp / postscriptbarcode

Barcode Writer in Pure PostScript
https://bwipp.terryburton.co.uk
MIT License

Sample EAN-13 barcodes (at least)... aren't? #209

Closed ferdnyc closed 1 year ago

ferdnyc commented 1 year ago

The sample EAN-13 barcode used in the software documentation and the wiki, dating all the way back to when postscriptbarcode was hosted on Google Code, is and has always been 9 771473 968012.

However, the GS1 Standard (which both dictates and documents the mechanics of how EAN-13 barcode assignments and labeling are done) defines a prefix space "Used for demonstrations and examples of the GS1 system", and a code starting with 977 is not within that space. The prefix for demo/example EAN-13 codes is 952. The prefix 977, OTOH, is "Allocated to ISSN International Centre for serial publications".

I wonder if it would be a good idea, and probably worth the effort, to replace the various 9 771473 968012 examples in the code/documentation with a code starting with 952 instead, just so that it's clearly and definitionally a "sample" code not actually valid for trade?
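As a rough illustration of what a replacement sample could look like, the sketch below builds an EAN-13 from the 952 example prefix plus nine filler digits and appends the standard mod-10 check digit. The function names and the filler digits `012345678` are hypothetical, chosen only for this example; nothing here is taken from postscriptbarcode itself.

```python
def ean13_check_digit(body: str) -> int:
    """Return the mod-10 check digit for a 12-digit EAN-13 body."""
    if len(body) != 12 or not (body.isascii() and body.isdigit()):
        raise ValueError("expected exactly 12 digits")
    # Weights alternate 1, 3, 1, 3, ... starting from the leftmost digit.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(body))
    return (10 - total % 10) % 10


def make_sample_ean13(filler: str) -> str:
    """Build a demo EAN-13: 952 example prefix + 9 filler digits + check digit."""
    if len(filler) != 9 or not (filler.isascii() and filler.isdigit()):
        raise ValueError("expected exactly 9 digits after the 952 prefix")
    body = "952" + filler
    return body + str(ean13_check_digit(body))


# Sanity check against the long-standing sample, whose check digit is 2.
assert ean13_check_digit("977147396801") == 2

# Hypothetical filler digits, purely for illustration.
print(make_sample_ean13("012345678"))
```

Any nine filler digits give a code that is still structurally a valid EAN-13, so it exercises an encoder exactly as the old sample did while sitting inside the reserved example space.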

Beyond EAN-13?

Presumably, some (even most?) other barcode standards also block out some form of explicitly-invalid/sample codes within their code space. (Much the same way RFC 1918 set aside the 10.0.0.0/8, 172.16.0.0/12, and 192.168.0.0/16 IPv4 ranges as non-routable private addresses, or IANA defines example.com, example.net, and example.org as permanently non-assignable domains to be used when presenting non-working examples of DNS hostnames.) So I also wonder (but have not looked into) whether postscriptbarcode's examples in those other formats fall within those reserved ranges, as well.

Update: Convert EXOPs and wiki examples for the following to 952-prefix:

Some background

The postscriptbarcode sample EAN-13 first came to my attention over a decade ago, when I discovered, after reading up on EAN-13 in the old Google Code postscriptbarcode wiki, that the Discogs database [1] contained fifteen separate releases with the exact same barcode: 9 771473 968012. I recently revisited that phenomenon "on a whim" [2], and found to my dismay that the list of 9 771473 968012 matches has grown to 106 distinct music releases, as of today, in Discogs' database.

This is in absolutely NO way the fault of postscriptbarcode or its documentation, nor is it either the fault or the responsibility of the Burtons or anyone else involved in maintaining postscriptbarcode. It is, as the saying goes, a thing that happened to postscriptbarcode, not something postscriptbarcode did.

But, clearly, the sample EAN-13 used in and by postscriptbarcode is being replicated in a lot of places, presumably through a combination of confusion, ignorance, laziness, and deceptiveness. A number of people are either knowingly or accidentally creating products labeled with an effectively-fake barcode identifier. They've acquired that identifier, no doubt, through a variety of means:

  1. by not editing the "default" barcode inserted by some software that uses postscriptbarcode under the hood
  2. by blindly/underhandedly reproducing the provided example found in the postscriptbarcode documentation
  3. by blindly/underhandedly grabbing and reusing the sample code from the wiki here
  4. or, by simply running a Google image search for "EAN-13" and grabbing a random image from the results. (Which do include 9 771473 968012, though it's surprisingly far down the list these days. A decade+ ago, it was at/near the very top.)

Given that unavoidable reality, if the fake barcode that found its way onto their packaging at least encoded an EAN-13 that the GS1 standard clearly defines as invalid, because it's in the prefix space explicitly set aside for example codes, that may help reduce the spread and/or impact of this phenomenon. (If nothing else, hopefully some POS and inventory systems will have implemented enough of the GS1 standard that they'll reject any 952-prefix barcodes outright. Which might prevent more items using fake barcodes from entering the retail stream.)
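To make that idea concrete, here is a minimal sketch of the kind of intake check a POS or inventory system might run. It is an assumption about how such a system could behave, not a description of any real product; `rejection_reason` and `has_valid_check_digit` are hypothetical names.

```python
from typing import Optional

EXAMPLE_PREFIX = "952"  # "Used for demonstrations and examples of the GS1 system"


def has_valid_check_digit(digits: str) -> bool:
    """True if a 13-digit string ends in the correct mod-10 check digit."""
    # Weights alternate 1, 3, 1, 3, ... across the first 12 digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == int(digits[12])


def rejection_reason(scanned: str) -> Optional[str]:
    """Hypothetical intake check: why a code is refused, or None if accepted."""
    digits = scanned.replace(" ", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return "not a 13-digit EAN-13"
    if not has_valid_check_digit(digits):
        return "check digit mismatch"
    if digits.startswith(EXAMPLE_PREFIX):
        return "GS1 example prefix, not valid for trade"
    return None


print(rejection_reason("9 771473 968012"))  # accepted: prints None
print(rejection_reason("9520123456788"))    # refused: example prefix
```

Note that the current 977 sample passes this check, since it is well-formed and sits in an allocated prefix. A 952-prefixed sample would be the only one such a filter could catch.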

Notes

  1. Discogs is a community-driven compendium of release data for commercial audio recordings, in the vein of the IMDb for movies/television or the TCDb for trading cards.
  2. A clear sign that I need better whims, but that bug report belongs in a different repo entirely.

terryburton commented 1 year ago

> I wonder if it would be a good idea, and probably worth the effort, to replace the various 9 771473 968012 examples in the code/documentation with a code starting with 952 instead, just so that it's clearly and definitionally a "sample" code not actually valid for trade?

Now that we have well-established example ranges for GS1 data, this makes sense.

> Presumably, some (even most?) other barcode standards also block out some form of explicitly-invalid/sample codes within their code space.

If only...

> So I also wonder (but have not looked into) whether postscriptbarcode's examples in those other formats fall within those reserved ranges, as well.

For the special-purpose ranges that are delegated to other agencies we have generally had to resort to using example numbers provided in published specifications because the official response has been that no such example ranges exist. For example, in the case of ISBNs there wasn't even a public database of prefix assignments to national agencies until recently.

GS1 themselves are more transparent with assignment, including providing sample ranges. However, it has been hard to care too much to convert things over whilst the examples in the General Specifications typically ignore the example ranges.

> (If nothing else, hopefully some POS and inventory systems will have implemented enough of the GS1 standard that they'll reject any 952-prefix barcodes outright. Which might prevent more items using fake barcodes from entering the retail stream.)

I think this is the salient point, as this provides an opportunity for cleanup of the supply chain at the point of entry, i.e. during initial processing of GS1 AI based data. The bar has been pretty low regarding data validation; however, efforts such as this are starting to improve things: https://www.linkedin.com/pulse/gs1-application-identifier-syntax-dictionary-terry-burton/

There are of course other number systems (e.g. HIBC) that we provide very basic support for, and then others (e.g. MH10.8.2) where users must perform their own encoding using the generic symbologies. Patches or relevant information welcome for those.

terryburton commented 1 year ago

I've added a task list to your original comment and it will be addressed over time.

terryburton commented 1 year ago

I've also addressed the sample.ps file with the following: 912bcba2bbca5a076d41e2dca8772b3fe9ae0701

oehhar commented 1 year ago

Great initiative, thank you! HIBC reserves the LIC "A999" for test purposes. It is registered in the LIC database as "HIBCC DEMO ACCOUNT". Take care, Harald

terryburton commented 1 year ago

Thanks @oehhar.

ferdnyc commented 1 year ago

> the examples in the General Specifications typically ignore the example ranges.

You noticed that too, huh? I admit I found it completely bizarre.

ferdnyc commented 1 year ago

> I've added a task list to your original comment and it will be addressed over time.

Thanks, I made a small correction — the example space is 952, not 925. :grin:

ferdnyc commented 1 year ago

  • [x] upcacomposite
  • [x] upca
  • [x] upcecomposite
  • [x] upce

Not sure if those four are going to work — isn't a UPC code just an EAN-13 that starts with 0, with the leading 0 removed? There may be an example space for UPC, but I'm guessing it's not 952.

terryburton commented 1 year ago

https://www.gs1.org/standards/gs1-style-guide/current-standard#2-Applicable-to-all-GS1-documentation+2-18-GS1-prefix-952-for-examples

ferdnyc commented 1 year ago

Ooh, and right under, for UPC:

2.18.1 UPC prefix based example

When a UPC based example is required, GS1 US have made available the following, suppressible, GTIN-12: 012345000058
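
The relationship between the two number spaces can be sketched as follows: a GTIN-12 becomes an EAN-13 by prepending a zero, and the check digit is unchanged, so only EAN-13 codes starting with 0 have a UPC-A form. This is a minimal sketch with hypothetical function names. It covers the zero-padding only, not the UPC-E zero suppression that "suppressible" refers to.

```python
from typing import Optional


def mod10_check_digit(body: str) -> int:
    """GS1 check digit: weights 3, 1, 3, ... starting from the rightmost body digit."""
    total = sum(int(d) * (3 if i % 2 == 0 else 1)
                for i, d in enumerate(reversed(body)))
    return (10 - total % 10) % 10


def upca_to_ean13(upca: str) -> str:
    """Pad a 12-digit UPC-A (GTIN-12) to its 13-digit EAN-13 form."""
    if len(upca) != 12 or not (upca.isascii() and upca.isdigit()):
        raise ValueError("expected exactly 12 digits")
    if mod10_check_digit(upca[:11]) != int(upca[11]):
        raise ValueError("check digit mismatch")
    # A leading zero adds nothing to the weighted sum, so the check digit stays valid.
    return "0" + upca


def ean13_to_upca(ean13: str) -> Optional[str]:
    """Return the UPC-A form of an EAN-13, or None if it does not start with 0."""
    if len(ean13) != 13 or not (ean13.isascii() and ean13.isdigit()):
        raise ValueError("expected exactly 13 digits")
    return ean13[1:] if ean13.startswith("0") else None


print(upca_to_ean13("012345000058"))
print(ean13_to_upca("9520123456788"))  # a 952-prefix code has no UPC-A form
```

This is why the four UPC entries in the task list need the GS1 US example number rather than a 952-prefix code.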

terryburton commented 1 year ago

Done.