Barcodes - again. - Githubissues

rco3 commented 4 years ago

I've recently offered input, etc. regarding the use of barcodes with Inventree. Lately I've been quiet on that front (which may have been welcome!) It's because I've been off trying to put a box around the barcode problem so that I could start working on incorporating some useful functionality. I'm going to share here what I've learned so far, what I see as being challenges, and how I propose we (ok, you) consider handling barcodes.

Existing labeling on distributor packaging of electronic parts

Some distributors use a single 1D barcode; some use an array of 1D barcodes. Some use a single 2D barcode; some use a combination of a 2D barcode and an array of 1D codes. Some have no barcodes, and thus represent a different problem to be dealt with elsewhere.

2D barcodes used

When a 2D barcode is used, it is typically an ECC-200 DataMatrix code. These contain multiple fields of information. There is also significant error correction, rendering the chances of a mis-read of the barcode or any field within it as infinitesimally small. DataMatrix codes have all (in my surveys to date) been encoded using the ECIA standard field prefixes as listed in EIGP 114.2018. Notably, none of the standard fields defined by the standard include the distributor who packed the package! PDF417 is also used, but AFAIK only by DigiKey for a transitional period between 1D and DataMatrix.

Single 1D code

A package with a single 1D barcode may be a Digi-Key package older than a certain date (not sure when); these can be submitted to DK through their API to retrieve additional information. However, it may NOT be a Digi-Key barcode; there is no unique identifier proclaiming such, and if we are using a keyboard wedge scanner we won't get the metadata to know whether it was a Code 128 or Code 39 barcode. Packages from Future, as an example, have a single barcode as well. I don't know of a way to positively identify which vendor produced a bag just by parsing its barcode.

Multiple 1D codes

For packages with multiple 1D barcodes, several pieces of information are typically presented, one per barcode. Unless the distributor chose to prefix the data which is encoded into the barcode with an ECIA-style prefix or equivalent identifier, I don't see a way to positively identify (e.g.) which field is MPN and which is invoice number. Some distributors do provide those prefixes: within my survey Mouser, NAC Semi, and Arrow do not, but Bel (mfg, not dist) provided them on a package for an SFP module, and both Kingbright and Kemet also provide field IDs on their 1D barcode arrays.

In an array without field identifiers, it might be possible to identify the distributor and format by looking at the layout of the barcodes, e.g. five barcodes in a single column, all Code 128, spaced 1.5x their height, might be a sufficiently unique signature to say "This is an Arrow bag, and we can map barcodes in order to the following fields:.." but in my experience this tends to fail on several levels. One, it's hard to be certain that you've captured all the barcodes. Two, you have to be capturing with OpenCV and pyzbar or an equivalent method, so that you can get code types, locations, etc. to decipher the mapping / layout. Three, the number and placement of the barcodes changes even if it's the same vendor - I have Mouser bags with 5 barcodes, and some with 9, and even the same fields show up in different places in the layout. Four, there is insufficient error correction in the 1D codes and so you can get erroneous readings. So, I just don't see this as being a robust and viable method.

DataMatrix is awesome

Where there is a DataMatrix code available, I believe that should be our primary source of data for ingestion from the bag. The error correction and field identifiers dramatically simplify the process of collecting data and assigning meaning to it. The only drawbacks that I see are 1. no identification of the distributor, and 2. the set of fields present in any given DataMatrix can vary. I have, e.g., a Mouser bag which was purchased by a CM doing a turnkey build for us. It includes a Customer Part #, which is in this case the CM's build number followed by the first refdes to use that part. For this package and other similar cases, there is NOT a field in the DataMatrix which contains the MPN, but there is a 1D barcode that does.

SO, what to do?

I think there are a few ways to ingest data from the above data sources.

For packages without DataMatrix but with multiple 1D codes without field IDs, we should probably use a handheld wedge and manually scan individual codes into the appropriate fields. This provides us with improved resistance to typos and faster entry, but still requires a person to perform the mapping manually in real time by selecting the entry field and the barcode to scan into it. This method really doesn't need any barcode parsing or other custom code, just bang the data where you want it one piece at a time.

For packages with a single 1D code, I think we will need to require that the user identify the vendor and select an appropriate API endpoint, or else handle all of the parsing and so forth in a client-side app which provides data in an Inventree-friendly, non-vendor-specific format. I don't think we can auto-ID the distributor in this case, so we need operator guidance.

For packages with DataMatrix, I think we should use the data in the code itself directly as much as possible, and reserve using the remote API for situations in which the required data simply isn't present and the distributor ID can be manually determined. I have found DataMatrix using the ECIA field IDs on Digikey, Mouser, Newark, Kingbright, and Kemet so far. All use Format 06, identified with '[)>{rs}06' for the first two fields, where {rs} is a Record Separator character between the fields. The exception is Mouser, which uses the same format but with '>[)>06' as the header; I believe this is a misspelling of the intended header, confusing Hex 29 with ASCII 29. So, except for Mouser and NAC, I don't see a robust way to identify Distributor just from reading barcodes or DataMatrix codes. But we can extract a large number of other fields directly:

Format 06 field codes

Field Name	Data Identifier
Ship From	n/a
Ship To	n/a
Customer PO	K
Package ID (Intermediate Label)	3S
Package ID (Logistic Label)	4S, 5S
Packing List Number	11K
Ship Date	6D
Customer Part Number	P
Supplier Part Number	1P
Customer PO Line	4K
Quantity	Q
Date Code	9D, 10D
Lot Code	1T
Country of Origin	4L
Serial Number	S
BIN Code	33P
Company Logo	n/a
Package Count	13Q
Revision number	2P
ECCN	n/a
Weight	7Q
Manufacturer	1V
RoHS/CC	E
Reel ID	n/a
Moisture Sensitive Level	n/a
Moisture Barrier Bag Seal Date	n/a

I've also found a 3P field, which on a Newark bag mapped to SKU. Digi-Key use their SKU for customer P/N if you don't provide one (P); I also found a J field, which isn't listed on the ECIA format, on a DK pre-pack package.
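A minimal sketch of how a raw Format 06 scan could be split into fields using the data identifiers from the table above. It assumes the scanner delivers the real control characters (RS = ASCII 30, GS = ASCII 29, EOT = ASCII 4) and that a data identifier is up to three digits followed by one capital letter. The names `parse_format06` and `FIELD_NAMES` are made up for illustration and are not part of Inventree.

```python
import re

RS, GS, EOT = "\x1e", "\x1d", "\x04"

# Data identifiers copied from the Format 06 table above.
FIELD_NAMES = {
    "K": "Customer PO",
    "3S": "Package ID (Intermediate Label)",
    "4S": "Package ID (Logistic Label)",
    "5S": "Package ID (Logistic Label)",
    "11K": "Packing List Number",
    "6D": "Ship Date",
    "P": "Customer Part Number",
    "1P": "Supplier Part Number",
    "4K": "Customer PO Line",
    "Q": "Quantity",
    "9D": "Date Code",
    "10D": "Date Code",
    "1T": "Lot Code",
    "4L": "Country of Origin",
    "S": "Serial Number",
    "33P": "BIN Code",
    "13Q": "Package Count",
    "2P": "Revision number",
    "7Q": "Weight",
    "1V": "Manufacturer",
    "E": "RoHS/CC",
}

# Assumption: an identifier is 0-3 digits followed by one capital letter.
DI_PATTERN = re.compile(r"^(\d{0,3}[A-Z])(.*)$", re.DOTALL)


def parse_format06(scan: str) -> dict:
    """Split a raw Format 06 scan into {data identifier: value}."""
    body = scan.lstrip(">")  # tolerate the '>[)>06' header variant
    if not body.startswith("[)>"):
        raise ValueError("not a Format 06 message")
    body = body[3:].lstrip(RS)
    if not body.startswith("06"):
        raise ValueError("not a Format 06 message")
    # Drop the separator after '06' and any trailing RS/EOT trailer.
    body = body[2:].strip(RS + GS + EOT)
    fields = {}
    for chunk in body.split(GS):
        match = DI_PATTERN.match(chunk)
        if match:  # chunks without a recognisable identifier are skipped
            fields[match.group(1)] = match.group(2)
    return fields


if __name__ == "__main__":
    demo = "[)>\x1e06\x1dPCM123\x1d1PADA4877-1ARZ\x1dQ100\x1d4LUS\x1e\x04"
    for di, value in parse_format06(demo).items():
        print(di, FIELD_NAMES.get(di, "(not in table)"), value)
```

Identifiers that are not in the table, such as the 3P and J fields mentioned above, still end up in the returned dict, so an operator can decide what to do with them.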

Given all of the above (sorry), it seems to me that the idea of automagically hoovering up all the content on barcodes on a given package and converting that into Inventree data is optimistic. I suspect that this entry process will be a combination of automating as much as can be done robustly, and depend on the operator to fill the gaps. For DataMatrix extraction, I believe that we need to pass a dict to Inventree which contains either the raw DataMatrix scan including all the control characters, or else a list of fields per ECIA and their contents. Using a handheld scanner presents the problems discussed above regarding metadata, etc. but also adds another complication: it's difficult to know what those control characters will look like. I have three wedges, two bluetooth and one wired. Only the wired one gives me control codes; the other two (usually) replace them with ASCII characters '29' instead of the ASCII character #29. That may be entirely due to their interaction with my OS (Catalina), but has so far defied my efforts and the vendors' as well to provide those control characters for error-free field delimiting.

Proposed Path

Accordingly, I believe the best way of extracting data from packages is either a camera-based system whose decoded output and metadata can be sent to Inventree for parsing, or else manual capture by scanning codes directly into fields as if typing them. I've been struggling for a few weeks now to try to follow the path you've set of validating to a given distributor, but I just don't see how. I think this strategy may be ripe for re-examination.

Inventree QR codes

Having spent all this time understanding how the industry uses barcodes and what data we can extract from them, I strongly believe that the use of Inventree's own QR codes is vital to efficient management of the inventory. I still agree that we can use a hash of whatever data is already present to generate a UUID, but I'm not sure that's really the best way. 1D barcodes remain subject to misreading, 1D arrays can easily drop one member, so I think it's really only useful to use existing barcodes if they have some sort of error correction. DataMatrix codes do but they are large and contain extended characters which may or may not come through the same way if using a different scanner than originally hashed from. Inventree's QR codes tick all the boxes: error correcting, ASCII only so reads the same from anywhere and can be used with a keyboard wedge, contains exactly the information we want in the form we want it. So I think we use whatever methods we can to automate and simplify the gathering of bag data, and then promptly put our own label on it as part of the stock addition process.

Now that I'm at this stage with the background research, I'm going to move forward with trying to incorporate these processes into a branch of Inventree. I don't want to hijack your development path or plans or anything like that - please let me know if you object, have comments, suggest improvements or alt strategies, whatever.

Thanks yet again for this project.

SchrodingersGat commented 4 years ago

Hey @rco3 thanks for this insanely in-depth work! I have not had an opportunity since our last discussion to look into barcodes any further (although barcode scanning is now working in the Android app!).

Firstly, can you provide some resources / links to where you are getting the information on the barcode encoding?

I wholeheartedly agree that barcode scanning is going to be a key component in making InvenTree useful - and easy to use!

Following are a few use-cases I have come up with for barcodes:

- Tracking an individual stock item through acceptance testing and commissioning - various test stations could read the QR code and record test results against the stock item
- Accepting incoming goods (here, reading third-party barcode formats would be very useful)
- Tracking stock items after incoming goods inspection (perhaps using a third-party barcode as a UID)
- Stock manipulation actions
  - Assign stock items to a sales order
  - Move stock items into a particular location
- Other uses?...

I think that using a camera-based solution which can recognize multiple barcodes sounds quite complicated, and a solution which uses a barcode scanner would be more user-friendly. But, as you point out, where a package contains multiple barcodes, then the user needs to determine which barcode contains which data.

I would be interested in picking a small part of this to implement first, and see how we go. Do you have an idea of where you might start working on this? Any barcode scanning functionality would need to have the matching UI elements created.

I assume that the handheld barcode scanners act like a keyboard device - i.e. the data is presented as if the user had typed it on a keyboard.

If this is the case, how does this data get input into the webpage? I'd need to look into this. Do you have any experience with web programming / javascript? I wonder if there is a "standard" way of implementing something like this.

rco3 commented 4 years ago

Thanks for the reply.

Most of my understanding of the contents of DataMatrix came from reading barcodes and comparing the contents of their fields with the human-readable data on the same label. Then I found a copy of EIGP 114.2018 which listed all those fields, confirmed some of my guesses, and added a few more I hadn't run into yet.

I spent a fruitless couple of weeks trying to derive unique and persistent patterns in the layouts of 1D barcode arrays, only to see variations that render the idea meritless.

On to implementation and usage:

I am insanely desperate for support for each of your mentioned use cases. I'm liking the first one in particular, especially if there is a camera involved: my thought was to point a camera at the DUT on the test jig and take a barcode scan every time a test is started, every time it finishes, and even between steps if there are multiple steps. Same idea, I wasn't assuming multiple test stations, but we're both thinking that automatically identifying the DUT is cool.

The acceptance of incoming stock is a particular concern of mine, with additional emphasis on the initial population of a new Inventree database from an existing inventory. That's one reason I've been trying so hard to ingest multiple styles of barcode labeling, but as I've tried to detail above it's the details that get ya.

Where I have assemblies consisting of multiple serialized parts, I'd love to be able to track (scan) the serial numbers of parts used in the build and add them to the build sheet of the top-level assembly. The hashing of distributor or manufacturer barcodes for a UUID might be really good here, some parts already have barcodes on them from factory.

First Steps

Where to start?

I propose that the first thing we need to do is get Inventree QR codes into a printer (with text for people) and thence onto bags.
The second is to be able to scan those codes for various tracking purposes. Android app is a good step, into Inventree via web interface is critical in my view.
Third would be (IMHO) to add the capability to decode and parse DataMatrix for stock intake and UUID. We could potentially use the same API and/or onscreen text entry field for these two codes - differentiation isn't hard here. QR codes won't have a unique prefix, but they will have fields that can be directly read without need for further classification. DM codes do have a unique identifier as such: we can make a fairly strong assumption that if the barcode string starts with '[)>' or with '>[)>06', it's a 2D barcode with distribution information encoded per EIGP 114.2018, and then we know how to decode it.
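The differentiation described here could look roughly like the sketch below. The prefix test comes straight from the paragraph above. Treating the Inventree QR payload as a JSON object, and the key `stockitem`, are assumptions made purely for illustration, as is the function name `classify_scan`.

```python
import json


def classify_scan(scan: str) -> str:
    """Return 'format06', 'inventree_qr' or 'unknown' for a scanned string."""
    text = scan.strip(" \t\r\n")  # keep control characters such as RS/GS
    # Header test from the discussion: '[)>' or the '>[)>06' variant.
    if text.startswith("[)>") or text.startswith(">[)>06"):
        return "format06"
    # Assumption for illustration only: the QR payload is a JSON object.
    try:
        payload = json.loads(text)
    except ValueError:
        return "unknown"
    return "inventree_qr" if isinstance(payload, dict) else "unknown"


if __name__ == "__main__":
    print(classify_scan("[)>\x1e06\x1dQ100"))
    print(classify_scan('{"stockitem": 7}'))
    print(classify_scan("ADA4877-1ARZ"))
```

Anything classified as unknown would fall back to manual entry or a plain text search.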

A simple text field into which a wedge is scanned would work, assuming we can read or infer field codes. Yes, it acts just like a keyboard. For a camera-based system, I think the only way to handle that is with an API endpoint expecting a dict or some type of multi-field data structure containing the raw barcode, with known field delimiters, and some additional metadata - basically like the intermediate product Maholli used for passing data from his camera capture module to the API interface module. We could also simply put the entire camera and OpenCV chain into Inventree and accept still images to decode... that opens up a LOT of processing capability, but adds server load.
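As a rough illustration of the multi-field structure a camera-based client might send, here is a sketch of a payload built with a dataclass and serialized to JSON. Every field name (`raw`, `symbology`, `record_separator`, `group_separator`, `source`) is hypothetical; none of this reflects an existing Inventree endpoint.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ScanPayload:
    """Hypothetical container for one decoded barcode plus metadata."""
    raw: str               # decoded text, control characters included
    symbology: str         # e.g. "DATAMATRIX" or "CODE128", as reported by the decoder
    record_separator: str  # delimiter the client actually saw for RS
    group_separator: str   # delimiter the client actually saw for GS
    source: str            # e.g. "camera" or "wedge"


def to_json(payload: ScanPayload) -> str:
    # json.dumps escapes control characters, so RS becomes \u001e on the wire.
    return json.dumps(asdict(payload))


if __name__ == "__main__":
    payload = ScanPayload(
        raw="[)>\x1e06\x1d1PADA4877-1ARZ\x1dQ100\x1e\x04",
        symbology="DATAMATRIX",
        record_separator="\x1e",
        group_separator="\x1d",
        source="camera",
    )
    print(to_json(payload))
```

Sending the delimiters alongside the raw text means the server does not have to guess how a particular scanner rendered the control characters.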

For Stock intake, the use of the 'K' field could allow us to automatically associate a Stock Item with a Purchase Order, for verification of goods received or some such feature.

Oliver, I have far less experience programming for web and with Javascript than I do in Python, which I am picking up as I go along. I gotta go to work now, let's follow up and refine what we want to do here. Feel free to suggest a more direct venue for this conversation if you feel we're clogging up the Issues list.

rco3 commented 4 years ago

Use Cases

Let's assume that we have a large text field on the front page into which one can scan barcodes. Maybe it could be a text search entry field as well? Anyway, we get what we think is a barcode, and we validate it as a barcode. Now what?

Depends on what kind of barcode. If it's an Inventree QR code (IQR), then obviously we are highly likely to already know about it... unless someone brings a barcode from their Inventree database over to ours - hey, is there a unique database ID in the IQR?. Well, assuming it's really 'ours', we already know what the Stock Item is. So, we have a few options to deal with items we have in the system. I'll list what I can think of that a person holding a package might want to do with it:

'Take Me Home': Return this package to its (default, most recent) storage location. Optionally, ask to verify actual quantity being returned. Optionally, light up LEDs in appropriate storage area, direct laser beams at storage location, play appropriate walk-in music, smoke machines, suggest HazMat suits, whatever.
Check Out: remove a package or quantity of parts from their storage location, for prototyping or other situations in which the number of required parts may not be known. Associate with a user would be nice.
Stock Count: Count parts in this package
Stock Remove: If not associated with an existing Sale or Build, then this allows for the removal from stock of a quantity of this Stock Item for other reasons (which operationally should probably be noted at the time of removal)
Stock Add: The functionality is obvious, the use case less so; wouldn't we ordinarily associate a given Stock Item with a particular Build or Buy transaction, Invoice, PO? And a new purchase/Build should have a different IQR by dint of being a separate Stock Item, esp if serialized? Otherwise, if this is returned Stock it should be handled by 'Take Me Home' from a user POV.
Serial Number retrieval: decode if embedded, retrieve from dB by reference if not.

Now, what if it's a Format 06 DataMatrix per previous comments?

Assume that we have validated as such and have identified field codes we recognize; next?

Hash the code and validate against the list of previously-captured and associated DMs. Assuming that we're trying to track inventory here, if we've previously scanned and ingested this package and printed an IQR then we should associate the hash with that Stock Item, which I believe is already supported. If we find it, we follow the above process / option list and the problem is now solved. Offer to print a new IQR label since the old one done fall off. Maybe suggest flipping it around to see if the IQR is there first. (Note to self: don't put IQRs on the bottom of open tubs.) If not...
We have a new friend! This package is new to our system. We need to ingest it. Break down the message using field delimiters previously identified. Use the reference list above to assign DM fields to Inventree fields. We will need to create a new Stock Item; we may also need to create a new Company Part (Vendor SKU), possibly new Company (Vendor), and perhaps even a new Part and maybe even Part Category. Ideally, the ingestion process would search for existing matches to all these and prepopulate exact matches, offer options for fuzzy matches. There is a fair amount of plumbing to be done here, but most of the information is usually in the DM in various well-ID'd fields. The notable exception, of course, is Company (Vendor) which is not even a field in the standard. User will have to select and maybe create. Extended information like Description, Category, various Parameters like Resistance, etc. can all be acquired from many vendors' APIs...but the legality of storing that data in our database is questionable. I would be most hesitant to store DK Description fields this way, for example... but maybe using info gathered this way to inform or at least suggest Category would be OK.

Oh, and associate that DM hash with the new Stock Item.
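The hash-and-lookup step above can be sketched with the standard library. The choice of SHA-256 and the in-memory dict standing in for the database are assumptions for illustration; `barcode_hash` and `HashIndex` are made-up names.

```python
import hashlib
from typing import Optional


def barcode_hash(raw: str) -> str:
    """Hash the raw scan text; SHA-256 is an arbitrary choice for this sketch."""
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class HashIndex:
    """In-memory stand-in for a table linking barcode hashes to stock items."""

    def __init__(self) -> None:
        self._items = {}

    def lookup(self, raw: str) -> Optional[int]:
        # Returns the stock item id, or None for a package we have not seen.
        return self._items.get(barcode_hash(raw))

    def associate(self, raw: str, stock_item_id: int) -> None:
        self._items[barcode_hash(raw)] = stock_item_id


if __name__ == "__main__":
    index = HashIndex()
    scan = "[)>\x1e06\x1d1PADA4877-1ARZ\x1dQ100\x1e\x04"
    if index.lookup(scan) is None:
        index.associate(scan, 42)  # new package: ingest, then link the hash
    print(index.lookup(scan))
```

The hash only matches if every scanner delivers the same characters for the same label, so any normalization of control characters would have to happen before hashing.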

Ultimately I don't think we can trust any fully-automated process to ingest, I believe we have to have a person to be the final approver. Especially if we offer fuzzy matching, we don't need to allow ADA4877-1ARZ to be confused with ADA4877-1ARMZ - somebody will be upset eventually.
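To make the fuzzy-matching concern concrete, here is a small sketch using `difflib` from the standard library. Exact matches are accepted as such, while near matches are only returned as suggestions for a person to confirm. The function name, the status strings, and the 0.8 cutoff are assumptions chosen for illustration.

```python
import difflib


def match_part(mpn, known_parts, cutoff=0.8):
    """Return (status, candidates) for a scanned manufacturer part number."""
    if mpn in known_parts:
        return "exact", [mpn]
    # Near matches are suggestions only; they are never accepted automatically.
    suggestions = difflib.get_close_matches(mpn, known_parts, n=3, cutoff=cutoff)
    if suggestions:
        return "needs_confirmation", suggestions
    return "new", []


if __name__ == "__main__":
    known = ["ADA4877-1ARMZ", "LM317T"]
    print(match_part("ADA4877-1ARZ", known))
    print(match_part("LM317T", known))
```

Here ADA4877-1ARZ comes back as a suggestion against ADA4877-1ARMZ rather than a match, which is exactly the case where the operator has to make the call.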

SchrodingersGat commented 4 years ago

This python package looks interesting - https://pypi.org/project/pylabels/

I was considering how to do "generic" labels over the weekend and pretty much every concept I thought of is covered here.

However "generic" is often the enemy of "easy". I want to support different styles of label printing but also it should be easy to use - most users will not want to have to muck around with custom code, etc.

Consider the different types of label "printing" available - for example at work I have a P-Touch label printer, and also print to Avery-style labels with a regular printer. Then there's also systems such as thermal label printers.

What I am imagining is that we can create label "templates" - and the user could (for example) select which template they want to use in their settings file. I'll have a play around with the pylabels module and see if it is going to meet the requirements.

rco3 commented 4 years ago

Had a quick look at pylabels. Lotsa good stuff there. Good support for Avery-style, including printing on partial sheets. Didn't see much explicit support for roll-style label printers like my Dymo 450, but I didn't look very deeply yet. Maybe the use of such one-at-a-time type label printers simply requires the definition of a 'sheet' the size of a label.

Looks like it depends heavily on the ReportLab library for the actual drawing.

rco3 commented 4 years ago

Oliver, you are KILLING it. Chapeau.

SchrodingersGat commented 4 years ago

Glad you approve :) Have a look at some screenshots over in the other thread. It is starting to come together nicely.

rco3 commented 4 years ago

OK, I still don't know of a way to validate a barcode as being a Digikey barcode, but I know how to validate a DataMatrix in Format 06 as such, whether it's Digikey's or Mouser's or whoever's version. Plus, I think we can auto-adapt to whatever control or escaped characters a given wedge returns for the Group Separator and Record Separator characters. Digikey will want those escaped: "Please ensure the Record Separator character is encoded as \u241E and the Group Separator ... character is encoded as \u241D" for submitting to their API - at least that's what they said, maybe they're being clever too.
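A sketch of the escaping Digikey asks for in the quote above. It reads the quote as meaning the Unicode characters U+241E and U+241D, which `json.dumps` then writes as `\u241e` and `\u241d` escapes; that reading is an assumption, and nothing here talks to Digikey's API. The name `escape_separators` is made up for illustration.

```python
import json

# RS (ASCII 30) -> U+241E, GS (ASCII 29) -> U+241D, per the quoted request.
CONTROL_PICTURES = str.maketrans({"\x1e": "\u241e", "\x1d": "\u241d"})


def escape_separators(raw: str) -> str:
    """Replace the RS and GS control characters with their Unicode symbols."""
    return raw.translate(CONTROL_PICTURES)


if __name__ == "__main__":
    scan = "[)>\x1e06\x1d1PADA4877-1ARZ\x1dQ100"
    escaped = escape_separators(scan)
    # With the default ensure_ascii=True the symbols are emitted as \uXXXX.
    print(json.dumps({"barcode": escaped}))
```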

Anyway, if we read "[)>", then something, then "06", we can assume that whatever sits in the middle is the RS character, and that the GS character is the one with a code one lower (RS is ASCII 30, GS is ASCII 29). Unless, of course, the scanner just sends the two ASCII characters '3' and '0' for the RS and '2' and '9' for the GS, which TWO of my Bluetooth wedges do. I'm not sure how we can confidently identify each occurrence of the sequence '29' as a GS character without screwing up some part numbers. Then there's Mouser, who start with '>[)>06' followed by the GS. Them, we can identify. Then we can yank all the fields and see what we recognize.
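The header-based inference described here can be sketched as follows. It takes whatever sits between "[)>" and "06" as the RS token and derives the GS token as "one lower", whether the scanner sent a real control character or the literal digits. The function name `infer_separators` and the handling of the header without an RS are assumptions for illustration.

```python
import re

# Optional leading '>' covers the '>[)>06' header variant.
HEADER = re.compile(r"^>?\[\)>(?P<rs>.*?)06", re.DOTALL)


def infer_separators(scan: str):
    """Return (rs_token, gs_token) inferred from the header, or None."""
    match = HEADER.match(scan)
    if match is None:
        return None
    rs = match.group("rs")
    rest = scan[match.end():]
    if rs == "":
        # No RS in the header: take the character right after '06' as GS,
        # unless it looks like the start of ordinary data.
        gs = rest[:1]
        if not gs or gs.isalnum():
            return None
    elif rs.isdigit():
        gs = str(int(rs) - 1)   # literal '30' -> literal '29'
    elif len(rs) == 1:
        gs = chr(ord(rs) - 1)   # control character 30 -> 29
    else:
        return None
    # Sanity check: the inferred GS must directly follow the header.
    return (rs, gs) if rest.startswith(gs) else None


if __name__ == "__main__":
    print(infer_separators("[)>\x1e06\x1dK123\x1d1PABC\x1e\x04"))
    print(infer_separators("[)>300629K123291PABC"))
    print(infer_separators(">[)>06\x1dK123"))
```

Knowing that a wedge sends the literal '29' still does not solve the problem raised above: a '29' inside a part number looks identical to a separator, so that case probably needs operator review.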

iromero91 commented 4 years ago

I'd like to put blabel forward for your consideration. It seems to be quite simple and easy to define the labels using (jinja templated) HTML and CSS, plus it has built-in support for barcodes. Not affiliated with the project, just found it the other day and it seems pretty neat for printing inventory labels on a thermal printer.

SchrodingersGat commented 4 years ago

@iromero91 it looks like a very nice option! Great find! Would work well for tape printers for sure!

cdplayer commented 4 years ago

Also notable items regarding labels/labelling: https://github.com/michaelrsweet/lprint and https://github.com/jimevins/glabels-qt

bobek commented 3 years ago

I would like to redo inventory management for my home lab and came across InvenTree. Awesome project! It seems that people are primarily using thermal printers for printing labels. Do you have any concerns about the longevity of the produced labels?

I was rather thinking about using a laser printer and Avery-style sheets, which will require support for printing on partial sheets. The already mentioned glabels supports that. One can do something like the following; the -f option sets which label should be considered the first one on the sheet:

glabels-3-batch -l -C -f 7 -i /dev/stdin -o /dev/stdout label_template.glabels < data.csv > labels.pdf

Have there been any recent thoughts about integration of such workflow (e.g. is this even something you would consider)?

matmair commented 3 years ago

@bobek we currently do not support this setup fully (selection of the space to start printing is not supported).

It does not seem too hard to get this into the reports framework. The settings for something like this would be a lot of work for sure. I do not know of anybody working on something like this right now. We appreciate PRs and are ready to help ;-).

SchrodingersGat commented 3 years ago

@bobek I did have a look at supporting avery style labels a while ago, but it's not my personal workflow to use those so never implemented it. glabels did look pretty good, though...

I'd recommend you open a new issue / pull request for this rather than continuing the discussion here.

SchrodingersGat commented 2 years ago

Closing out this issue as it turned into an open-ended discussion. Support for label printing and barcodes has come a long way since this discussion.

inventree / InvenTree

Barcodes - again. #853
