Closed hurlenko closed 3 years ago
I'm interested in supporting this use case. As I'm based in Europe, I'm not too exposed to UPC barcodes, but I've seen them on a few products here as well.
Is there such a thing as "regular" UPC-E barcodes, or are all UPC-E barcodes "compressed"?
My initial idea for how to robustly differentiate EAN-8 from UPC-E is by enabling Symbology Identifiers on the barcode scanner. EAN-8 seems to use the Symbology Identifier ]E4, while I believe UPC-E might use the same Symbology Identifier as EAN-13 and UPC-A, ]E0? If you could provide some photos of UPC-A and UPC-E barcodes, it would be handy for testing this.
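To make the idea concrete, here is a minimal sketch of splitting a Symbology Identifier off a scan and using it to pick an interpretation. The helper names and the label strings are hypothetical and not part of Biip; the only facts taken from this thread are the ]E4 and ]E0 prefixes, and whether a given scanner reports UPC-E under ]E0 is something to verify on your own hardware.

```python
from typing import Optional, Tuple

# Hypothetical labels for the two identifiers discussed in this thread.
KNOWN_IDENTIFIERS = {
    "]E0": "EAN-13, UPC-A, or UPC-E",
    "]E4": "EAN-8",
}


def split_symbology_identifier(scanned: str) -> Tuple[Optional[str], str]:
    """Split a scan into (identifier, data).

    A Symbology Identifier is three characters long: "]", a code
    character, and a modifier character.
    """
    if len(scanned) >= 3 and scanned.startswith("]"):
        return scanned[:3], scanned[3:]
    # No identifier present, e.g. the scanner has the feature disabled.
    return None, scanned


def describe(scanned: str) -> Tuple[str, str]:
    """Return (label, data) for a scan, or ("unknown", data)."""
    identifier, data = split_symbology_identifier(scanned)
    return KNOWN_IDENTIFIERS.get(identifier, "unknown"), data
```

A scan without a prefix comes back as "unknown", which is exactly the ambiguous case where EAN-8 and UPC-E cannot be told apart from the digits alone.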
In cases where you cannot enable Symbology Identifiers for some reason, a possible strategy is to specify the order in which the parsers should be tried when calling biip.parse(). That way, Europeans could prioritize EAN-8 while Americans could prioritize UPC-E.
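As a rough illustration of the ordering idea, independent of Biip: the sketch below tries a list of parser callables in the caller's preferred order and returns the first result. Everything here is hypothetical, including the function names and the tagged-string return values, and the two stand-in parsers only check for eight digits rather than validating check digits.

```python
from typing import Callable, Optional, Sequence

Parser = Callable[[str], Optional[str]]


def as_ean_8(value: str) -> Optional[str]:
    # Hypothetical stand-in: accepts any 8-digit string as EAN-8.
    if len(value) == 8 and value.isdigit():
        return "EAN-8:" + value
    return None


def as_upc_e(value: str) -> Optional[str]:
    # Hypothetical stand-in: accepts any 8-digit string as UPC-E.
    if len(value) == 8 and value.isdigit():
        return "UPC-E:" + value
    return None


def parse_in_order(value: str, parsers: Sequence[Parser]) -> Optional[str]:
    """Return the result of the first parser that accepts the value."""
    for parser in parsers:
        result = parser(value)
        if result is not None:
            return result
    return None


# A European deployment might prefer EAN-8, an American one UPC-E.
european = parse_in_order("12345670", [as_ean_8, as_upc_e])
american = parse_in_order("12345670", [as_upc_e, as_ean_8])
```

The same eight digits resolve differently depending only on the order, which is the whole point of making the priority configurable.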
Aside, it would be awesome if you could add yourself to https://github.com/jodal/biip/wiki/Users if you're using Biip in production :-)
> Is there such a thing as "regular" UPC-E barcodes, or are all UPC-E barcodes "compressed"?
To be honest I have no idea 😅. My end goal is to store products in a database keyed by barcode, so that each barcode uniquely identifies a product. For that I need all of the barcodes to be encoded consistently as GTIN-14. I realized that allowing both compressed and uncompressed versions of UPC-E may result in duplicates of the same product. After googling for some time, that was all the information I found.
I'm based in Europe too, so I think the only option is to google for sample barcodes 🙂.
As for the Symbology Identifiers and parsing priority - seems like those are the only options. According to this and this there's really no way to reliably differentiate EAN-8 from UPC-E. When scanning the barcode however, you do know the type of the barcode so you can convert UPC-E to UPC-A before any further processing.
So having the UPC-E <-> UPC-A conversion in the library would be nice but overall feel free to close the issue.
> Aside, it would be awesome if you could add yourself to https://github.com/jodal/biip/wiki/Users if you're using Biip in production :-)
I will let you know once I have a working solution 🙂. Thanks again!
I've made a PR (#78) that adds support for UPC, including:
I've tested scanning the barcodes at https://www.barcodefaq.com/1d/upc-ean/#GTIN_Compliance with Symbology Identifiers enabled, giving the following results:
]E00123456789012, i.e. an EAN/UPC prefix and then a GTIN-13, just like EAN-13 barcodes.
]E00023400000906, i.e. an EAN/UPC prefix and then a GTIN-13, just like EAN-13 barcodes. In other words, it is expanded to UPC-A by the scanner hardware.
]E412345670, i.e. an EAN-8 prefix and then a GTIN-8.
If your scanner behaves in the same way, EAN-8 and UPC-E mixups should not be a problem.
If you have another source of UPC-Es than a physical barcode scanner, like a product catalog, that you want to have converted to GTIN-14 for storage, I think the PR in its current state should work for you...
When a UPC-E is parsed, the upc field in the parse result is set. It is then expanded to UPC-A, which also populates the gtin field as a GTIN-12:
>>> biip.parse('02349036')
ParseResult(
value='02349036',
symbology_identifier=None,
gtin=Gtin(value='023400000906', format=GtinFormat.GTIN_12, prefix=GS1Prefix(value='002', usage='GS1 US'), payload='02340000090', check_digit=6, packaging_level=None),
gtin_error=None,
upc=Upc(value='02349036', format=UpcFormat.UPC_E, number_system_digit=0, payload='0234903', check_digit=6),
upc_error=None,
sscc=None,
sscc_error="Failed to parse '02349036' as SSCC: Expected 18 digits, got 8.",
gs1_message=None,
gs1_message_error="Failed to match '02349036' with GS1 AI (02) pattern '^02(\\d{14})$'."
)
As these are two representations of the same thing, their GTIN-14 representation is identical:
>>> biip.parse('02349036').gtin.as_gtin_14()
'00023400000906'
>>> biip.parse('02349036').upc.as_gtin_14()
'00023400000906'
When a GTIN-8 is parsed, the gtin field in the parse result is set. In this case, the value is also a valid UPC-E, as can be seen from the upc field. Since the gtin field is already set by the parsing as GTIN-8, the UPC-E is not automatically expanded to UPC-A and converted to GTIN-12.
>>> biip.parse('12345670')
ParseResult(
value='12345670',
symbology_identifier=None,
gtin=Gtin(value='12345670', format=GtinFormat.GTIN_8, prefix=GS1Prefix(value='00001', usage='GS1 US'), payload='1234567', check_digit=0, packaging_level=None),
gtin_error=None,
upc=Upc(value='12345670', format=UpcFormat.UPC_E, number_system_digit=1, payload='1234567', check_digit=0),
upc_error=None,
sscc=None,
sscc_error="Failed to parse '12345670' as SSCC: Expected 18 digits, got 8.",
gs1_message=None,
gs1_message_error="Failed to parse GS1 AI (12) date from '345670'."
)
In other words, you can choose whether to prioritize the UPC or the GTIN interpretation, and convert the one you prefer to GTIN-14 for storage:
>>> biip.parse('12345670').gtin.as_gtin_14()
'00000012345670'
>>> biip.parse('12345670').upc.as_gtin_14()
'00123456000070'
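The two results differ because a GTIN-14 is just the shorter GTIN left-padded with zeros, and the two interpretations start from different digit strings. A minimal sketch of that padding step follows; pad_to_gtin_14 is a hypothetical helper, not Biip's as_gtin_14() method, and it does no check digit validation. The value 123456000070 is the UPC-A expansion of the UPC-E reading, inferred from the output above.

```python
def pad_to_gtin_14(digits: str) -> str:
    """Left-pad a GTIN-8/12/13/14 digit string with zeros to 14 digits."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("expected only digits, got %r" % digits)
    if len(digits) > 14:
        raise ValueError("expected at most 14 digits, got %d" % len(digits))
    return digits.zfill(14)


# GTIN-8 interpretation: the eight digits are padded as they are.
gtin_key = pad_to_gtin_14("12345670")

# UPC-E interpretation: the value is first expanded to UPC-A, then padded.
upc_key = pad_to_gtin_14("123456000070")
```

For a database key, the important part is to pick one interpretation per data source and apply it consistently, since the same eight digits can otherwise produce two different keys.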
Please give the PR a spin and let me know if this covers your use case!
Wow, thanks for this awesome update! I've just tried it and it seems to be working perfectly fine, definitely covers my use case. I believe the issue is now resolved. Thanks a lot!
Thanks for testing! The UPC support is now out as part of the 1.1.0 release.
Hi @jodal,
First of all, thanks for your awesome library. I would like to know if you have plans to support compressed UPC-E barcodes. As far as I know, the compressed version is not a valid GTIN.
I might be wrong, but it seems like the algorithm to calculate the check digit is either different from the one in GTIN or there's no such algorithm at all. When uncompressed, a UPC-E becomes a valid GTIN-12 (it becomes a UPC-A barcode). But when compressed it is not a valid GTIN-8; it must be expanded to become a GTIN-12 (though it might look valid, as with the example 02345673, which can be treated as a GTIN-8).
You can check here for GTIN compatibility: https://www.barcodefaq.com/1d/upc-ean/#GTIN_Compliance. Note that in the given example they expanded 02349036 to 00023400000900, which I believe is incorrect because the check digit must be 6.
Here's sample code to expand UPC-E to UPC-A: https://code.activestate.com/recipes/528911-barcodes-convert-upc-e-to-upc-a. I tested it on 02349036 and it gives 023400000906, which is a valid UPC-A/GTIN-12. And here's how to compress UPC-A back to UPC-E: https://gist.github.com/corpit/8204456. It also works as expected.
I think the biggest problem is distinguishing EAN-8 from UPC-E, to know when to do the decompression.
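For reference, the expansion and the check digit claim can be sketched in a few lines. This is a minimal illustration of the commonly documented UPC-E zero-suppression rules and the standard GS1 check digit calculation, not the code from the linked recipe and not Biip's implementation; the function names are hypothetical, and the sketch does not enforce that the number system digit is 0 or 1.

```python
def gtin_check_digit(payload: str) -> int:
    """GS1 check digit for a payload (all digits except the last one)."""
    total = 0
    # Weights alternate 3, 1, 3, ... starting from the rightmost payload digit.
    for index, char in enumerate(reversed(payload)):
        total += int(char) * (3 if index % 2 == 0 else 1)
    return (10 - total % 10) % 10


def is_valid_gtin(value: str) -> bool:
    """Check whether the last digit matches the GS1 check digit."""
    return gtin_check_digit(value[:-1]) == int(value[-1])


def expand_upc_e(upc_e: str) -> str:
    """Expand an 8-digit UPC-E to the 12-digit UPC-A it represents."""
    if len(upc_e) != 8 or not upc_e.isdigit():
        raise ValueError("expected 8 digits, got %r" % upc_e)
    number_system, body, check = upc_e[0], upc_e[1:7], upc_e[7]
    last = body[5]
    # The last body digit selects where the suppressed zeros are restored.
    if last in "012":
        middle = body[:2] + last + "0000" + body[2:5]
    elif last == "3":
        middle = body[:3] + "00000" + body[3:5]
    elif last == "4":
        middle = body[:4] + "00000" + body[4]
    else:
        middle = body[:5] + "0000" + last
    # The check digit is shared between the UPC-E and its UPC-A expansion.
    return number_system + middle + check


expanded = expand_upc_e("02349036")
```

With the values from this post: 02349036 expands to 023400000906, which passes the check digit test as a GTIN-12, while 02349036 read directly as a GTIN-8 fails it. 02345673 happens to pass as a GTIN-8 as well, which is why the two formats cannot be told apart by check digit alone.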