NHMDenmark / Pinned-Insects-workstation

Work on the Pinned Insects workstation and workflow for mass digitisation in Denmark (DaSSCo)

Requirements for barcodes #40

Closed PipBrewer closed 12 months ago

PipBrewer commented 1 year ago

Work with Laurits/z to establish if specifications of collections managers and curators for barcodes also work for digitisation.

Requirements:

PipBrewer commented 1 year ago

@chelseagraham and @lauritsf Could you please give your feedback and document it directly in this ticket?

lauritsf commented 1 year ago

I have reviewed the details mentioned in this GitHub issue and discussed the constraints we need to consider together with @Andersillum.

Label/tag size:

To ensure the specimens fit back into their boxes, it's important to keep the label/tag size within these dimensions. Some specimens are very small, and larger tags/labels would pose difficulties.

Datamatrix size:

We should aim to make the datamatrix as large as possible without compromising the space needed for the "silent zone." Rectangular datamatrices are also an option, but we must strike a balance with the accompanying text.

Border around the label? No border, unless for printing purposes.

Including a border would consume valuable space without offering noticeable benefits. However, when it comes to printing, having a border between labels can be useful for visual separation.

Text:

The text on the label should be legible at small print sizes and suitable for OCR. It's preferable to use an open/free font like Inconsolata, although other fonts are also acceptable. Keep in mind that system fonts like Arial may not be open/free, but users with Microsoft/Adobe software typically have licenses. With this font size, we can comfortably fit 2 or 3 lines while maintaining readability. The spacing of 1.3mm from the side provides enough room for the pin dot.

Pin dot:

The pin dot serves the purpose of enabling easy piercing of the label with a needle. The placement has been copied from the details provided in this GitHub issue.

example_front example_back

Andersillum commented 1 year ago

It looks good :) Can you try making the text bold? Or does it get too big then?

Which kind of data is in the datamatrix? Can we get NHMD1234567 (or however many digits are needed)? Having NHMD in it would make the datamatrices much more useful with a barcode scanner for scientific studies (in Excel, for example), registration etc. I know Specify 6 ignores the leading NHMD and only searches for the catalog number when looking up specimens; I hope Specify 7 does the same.

Border around the label? No border, unless for printing purposes.

Including a border would consume valuable space without offering noticeable benefits. However, when it comes to printing, having a border between labels can be useful for visual separation.

The main purpose of the border is for cutting.

lauritsf commented 1 year ago

Can you try making the text bold? Or does it get too big then?

I will play around with the text and find good balance :)

Which kind of data is in the datamatrix? Can we get NHMD1234567

As we discussed yesterday, Specify 7 will no longer ignore the leading characters, and the numbers will now be up to 9 digits long, with leading zeros. I have now been informed that the datamatrices should contain 9 digits (including leading zeros, e.g. 003456789), but I will also test including NHMD (e.g. NHMD003456789).
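To make the two payload variants being tested concrete, here is a minimal sketch of how a scanned string could be checked and reduced to its catalog number. The function name `parse_payload` and the strict pattern (optional `NHMD` prefix, then exactly 9 digits) are assumptions for illustration, not something taken from Specify or from the label generator.

```python
from __future__ import annotations

import re

# Assumed pattern: an optional "NHMD" prefix followed by exactly 9 digits.
_PAYLOAD = re.compile(r"(NHMD)?(\d{9})")


def parse_payload(payload: str) -> tuple[bool, int]:
    """Return (has_prefix, catalog_number) for a scanned datamatrix string.

    Raises ValueError if the string matches neither tested variant.
    """
    match = _PAYLOAD.fullmatch(payload.strip())
    if match is None:
        raise ValueError(f"unexpected datamatrix payload: {payload!r}")
    return match.group(1) is not None, int(match.group(2))


if __name__ == "__main__":
    print(parse_payload("003456789"))      # variant without prefix
    print(parse_payload("NHMD003456789"))  # variant with prefix
```

Both variants reduce to the same integer, so the choice between them mainly affects what downstream tools (Excel, Specify) see as the raw scanned string.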

The main purpose of the border is for cutting.

That is what I should have written ;)

chelseagraham commented 1 year ago

In our planning meeting this morning, Pip, Matilde and I discussed that Laurits is making two versions of the labels, with the data encoded as he described in the comment above.

@Gomismis or I will take at least 100 images of each kind of label on the draft stage for Laurits to run through software that detects the data matrices. One of us will also test the barcodes by scanning them into Excel and noting how long it takes to scan them in successfully. (I don't anticipate any errors in the numbers read, only in the number of attempts needed to successfully scan the barcodes with NHMD012345678 encoded.)

lauritsf commented 1 year ago

Text: (Update)

With 9 digits, the numbers might be slightly too long with this font size.

Printing: Printing the sides one by one with manual feeds seems to decrease the displacement between labels on the front and back. It should be noted, that it is not entirely robust, but I was able to get relatively good results first try with the cardstock.

Top Bottom
Image 1 Image 2

chelseagraham commented 1 year ago

Super! Thank you for making the labels with landscape orientation, Laurits!

lauritsf commented 1 year ago

It just occurred to me, however, that the text ought to be written in the direction away from the needle. Ideally, there should not be a "front" and a "back". Both sides should just be the same.

lauritsf commented 1 year ago

Following the discussions with @Andersillum, @PipBrewer, and @chelseagraham, I have generated labels with the following specifications:

For review, I have included the label data and the generated labels in the links below:

chelseagraham commented 1 year ago

@Gomismis and I have tested the labels Laurits created. The barcode labels scanned without any errors or delays. All barcodes were also photographed.

The files are at the following location: N:/SCI-SNM-DigitalCollections/DaSSCo/Pilot Data/Pinned Insects Working Data/BarcodeLabelTesting_LauritsModSetup :D

lauritsf commented 1 year ago

Label Photos and Sets

For each label in NHMD123456789.pdf and 123456789.pdf, there is a top view photo of a specimen with the label to the side.

In total, there were 2 sets of 100 labels each.

Example Pictures

Example picture Example picture
Example crop Example crop

Label Visibility and Orientation

In all of the pictures, the label is in clear view and in focus. In some cases, the label has been rotated.

Decoding Process

In all of the pictures, I was able to decode the datamatrix, and there was a 1:1 correspondence with the expected values. However, to actually detect and decode the datamatrices, it was necessary to reduce the resolution of the pictures and crop the general area where the datamatrix is expected. This is not due to bad image quality but instead a limitation with the decoders.
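As a rough illustration of the preprocessing described above, the following sketch computes a crop box and a reduced image size before decoding. The fractional region, the `max_side` limit and both function names are hypothetical; the actual values used in the test are not stated in this thread.

```python
from __future__ import annotations


def crop_box(width: int, height: int,
             region: tuple[float, float, float, float]) -> tuple[int, int, int, int]:
    """Convert a fractional (left, top, right, bottom) region to pixel coordinates."""
    left, top, right, bottom = region
    return (int(width * left), int(height * top),
            int(width * right), int(height * bottom))


def downscaled_size(width: int, height: int, max_side: int) -> tuple[int, int]:
    """Shrink (never enlarge) so the longer side is at most max_side pixels."""
    scale = min(1.0, max_side / max(width, height))
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    # Hypothetical 6000x4000 photo with the label expected in the right half.
    box = crop_box(6000, 4000, (0.5, 0.25, 1.0, 0.75))
    size = downscaled_size(box[2] - box[0], box[3] - box[1], max_side=1500)
    print(box, size)
```

With an imaging library such as Pillow, the resulting tuples could be passed to `Image.crop` and `Image.resize` respectively.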

zxing-cpp was used as a first-line decoder, and pylibdmtx was used where the first failed to decode.
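The first-line/fallback arrangement can be sketched as a small decoder chain. This is not the code from the test directory; `decode_with_fallback` and the two adapter functions are illustrative names. The adapters assume the Python bindings expose `zxingcpp.read_barcodes` and `pylibdmtx.pylibdmtx.decode`, and they import lazily so the chain itself runs without either library installed.

```python
from __future__ import annotations

from typing import Callable, Iterable, Optional, Sequence

# A decoder takes an image and returns the decoded strings (possibly none).
Decoder = Callable[[object], Sequence[str]]


def decode_with_fallback(image: object, decoders: Iterable[Decoder]) -> Optional[str]:
    """Try each decoder in order and return the first decoded string, else None."""
    for decode in decoders:
        texts = decode(image)
        if texts:
            return texts[0]
    return None


def zxing_decoder(image: object) -> Sequence[str]:
    """Adapter for zxing-cpp (assumes its Python package is installed)."""
    import zxingcpp
    return [result.text for result in zxingcpp.read_barcodes(image)]


def dmtx_decoder(image: object) -> Sequence[str]:
    """Adapter for pylibdmtx (assumes pylibdmtx and libdmtx are installed)."""
    from pylibdmtx.pylibdmtx import decode
    return [found.data.decode("ascii") for found in decode(image)]
```

A call would then look like `decode_with_fallback(cropped_image, [zxing_decoder, dmtx_decoder])`, with the result compared against the value expected for that label.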

Conclusion and Next Steps

We can (carefully) conclude that the label dimensions are sufficient for this specific setup, although more work is needed to investigate other angles and perspectives/focal lengths, as well as the post-processing/decoding.


The code for this testing is found at: N:/SCI-SNM-DigitalCollections/DaSSCo/Pilot Data/Pinned Insects Working Data/BarcodeLabelTesting_LauritsModSetup/code-directory

chelseagraham commented 1 year ago

This is great and this is with labels printed on a general KU printer.

We will pursue a higher resolution printer, but it does not seem like a priority for testing the data matrix detection.

Andersillum commented 1 year ago

@lauritsf It looks really good. Should and could we fix one of the Raspberry Pi and try with pictures of the datamatrix from a Pi camera?

lauritsf commented 1 year ago

@Andersillum I will also be working on getting the Raspberry Pi cameras up and running this week or next. I have identified a few inconsistencies in the NHMDenmark/Pi-Eye repo, which are causing at least one of the problems we are experiencing. After that, we could look into expanding to angled shots with the Raspberry Pi cameras.

chelseagraham commented 1 year ago

@Andersillum and I discussed with @FedorSteeman how Specify may dictate label requirements, and determined the following:

  • Labels should NOT have NHMD encoded in the data matrix.
  • Labels should contain 9 digits (including leading zeros) encoded in the data matrix.

Anders proposes to have labels display numbers without leading zeros (while still maintaining numbers of a full 9 digit length encoded in the data matrix).
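The split between the encoded value and the printed value can be sketched as follows. The function names are hypothetical, and accepting 0 as a valid catalog number is an assumption made only to keep the range check simple.

```python
DIGITS = 9  # length of the encoded catalog number, including leading zeros


def encode_for_datamatrix(catalog_number: int) -> str:
    """Zero-pad the catalog number to the 9 digits stored in the data matrix."""
    if not 0 <= catalog_number < 10 ** DIGITS:
        raise ValueError(f"catalog number out of range: {catalog_number}")
    return f"{catalog_number:0{DIGITS}d}"


def display_text(payload: str) -> str:
    """Strip leading zeros from an encoded payload for the human-readable text."""
    if len(payload) != DIGITS or not (payload.isascii() and payload.isdigit()):
        raise ValueError(f"expected {DIGITS} digits, got {payload!r}")
    return str(int(payload))


if __name__ == "__main__":
    payload = encode_for_datamatrix(3456789)
    print(payload, display_text(payload))
```

Because the padding is fixed-width, the two forms convert to each other without loss, so the printed number and the scanned number always identify the same record.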

chelseagraham commented 1 year ago

I met with Simon from NHMA to discuss their labels today.

The most important thing for them is to maintain the size of the label (19 mm x 14 mm) and the placement of the pinhole dot.

They prefer the orientation of the text on the label as presented, but could drop 'ENTOMOLOGY' from the label and do not care about the number being bold. All text can be the same style and size. The 9 digit catalogue number being encoded in the data matrix is ok with them. They do not have a preference if the label displays the number with or without leading zeros. They would like a double sided label.

A couple of quick mock-ups: NHMA_DMex

Their current labels as provided by Simon: Insekt (Etiket) - Simon.pdf

chelseagraham commented 1 year ago

Emailed NHMA (Thomas, Hans, Simon) on 1 August to ask whether encoding solely the nine-digit number is ok with their institutional numbering system.

chelseagraham commented 1 year ago

NHMA would like to have the opportunity to see barcode labels before they are finalized

chelseagraham commented 1 year ago

Hans's response to the inquiry:

SV: Insect barcode labels.pdf

>> Like Simon answered, all the below sounds good to me. Except maybe if we could have labels without leading zeroes in the catalog number. We don’t have leading zeroes in any of the existing collections or labels.

I think I have told Pip before, but we do have several collections starting with CatNo 1. We have added acronyms to avoid errors. E.g. NHMA ENT 1 for entomology collections CatNo 1, NHMA VERT 1 for vertebrate collections CatNo 1, etc. <<

I am clarifying with him whether he means that they would prefer to leave leading zeroes out of the number written on the label, while accepting them embedded in the data matrix. Copying Pip.

His response confirms that catalog numbers repeat across different divisions.
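Since the same catalog number can occur in several NHMA divisions, a record is only identified unambiguously by the number together with its acronyms. The sketch below illustrates that with a composite key; the `SpecimenKey` type is hypothetical and is not how Specify stores records.

```python
from __future__ import annotations

from typing import NamedTuple


class SpecimenKey(NamedTuple):
    """Composite identifier: institution and division plus the catalog number."""
    institution: str
    division: str
    catalog_number: int

    def label(self) -> str:
        # Matches the form quoted in the email, e.g. "NHMA ENT 1".
        return f"{self.institution} {self.division} {self.catalog_number}"


if __name__ == "__main__":
    ent = SpecimenKey("NHMA", "ENT", 1)
    vert = SpecimenKey("NHMA", "VERT", 1)
    # Same catalog number, but two distinct specimens.
    print(ent.label(), vert.label(), ent == vert)
```

A data matrix that encodes only the nine digits therefore relies on the printed acronym, or on the context in which it is scanned, to say which collection the number belongs to.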

lauritsf commented 1 year ago

NHMD Text Justification for Pinned Insect Barcode Labels

Version Image
Left-Right Version: label_left_right
Right Adjusted Version: label_right_adjusted
Left Adjusted Version: label_left_adjusted

Consulted:

Decision:

The team decided on the right adjusted version of the label template.

Laurits Fredsgaard Larsen's Comments:

NHMA Design Preference for Double Sided Labels

Hans, Simon, and Thomas from NHMA were presented with 3 options for double-sided labels:

Option A Option B Option C
NHMA_label_A.pdf NHMA_label_B.pdf NHMA_label_C.pdf

NHMA's Choice: Out of the 3 options, Option B closely resembles their existing design and is their preferred choice.

lauritsf commented 1 year ago

NHMD Label Specification:

  1. Dimensions:

    • Width: 12mm
    • Height: 5mm
  2. Pin Dot:

    • Radius: 0.25mm
    • Position: Offset 0.7mm from the left side, centered vertically.
  3. Text:

    • Two lines: "NHMD" and a number (up to 9 digits without leading zeros).
    • Font: Inconsolata ExtraBold
    • Font size: 3.55 (maximum size that fits).
    • Alignment: Right justified.
    • Offset: 5mm from the right side to avoid overlap with the datamatrix, but never less than 1.3mm from the left.
  4. Datamatrix:

    • Encodes the number with leading zeros to make 9 digits in total.
    • Size: 5x5mm (inclusive of the silent zone).
    • Placement: On the right side of the label, same height as the label itself.
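The NHMD numbers above can be written down as a small layout model to check that the parts fit together. This is only a sketch: the class and method names are made up, the origin is taken as the top-left corner of the label, and the 0.7mm pin offset is assumed to refer to the centre of the dot.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class NhmdLabel:
    """NHMD label geometry in millimetres, origin at the top-left corner."""
    width: float = 12.0
    height: float = 5.0
    pin_radius: float = 0.25
    pin_offset_left: float = 0.7    # assumed to be the dot centre
    text_right_offset: float = 5.0  # keeps text clear of the datamatrix
    text_min_left: float = 1.3      # keeps text clear of the pin dot
    datamatrix_size: float = 5.0    # includes the silent zone

    def pin_center(self) -> tuple[float, float]:
        return (self.pin_offset_left, self.height / 2)

    def datamatrix_box(self) -> tuple[float, float, float, float]:
        """(left, top, right, bottom) of the datamatrix on the right side."""
        return (self.width - self.datamatrix_size, 0.0, self.width, self.datamatrix_size)

    def text_span(self) -> tuple[float, float]:
        """Horizontal range available to the right-justified text."""
        return (self.text_min_left, self.width - self.text_right_offset)


if __name__ == "__main__":
    label = NhmdLabel()
    left, right = label.text_span()
    print(label.pin_center(), label.datamatrix_box(), round(right - left, 2))
```

Under these assumptions the text has about 5.7mm of horizontal room, which is consistent with the note that 3.55 is the largest font size that fits.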

NHMA Label Specification:

  1. Dimensions:

    • Width: 19mm
    • Height: 14mm
  2. Pin Dot:

    • Radius: 0.25mm
    • Position: Offset 2.8mm from the bottom (0.2 x 14), centered horizontally.
  3. Text:

    • Three lines: "NHMA", a number (without leading zeros, up to 9 digits), and a department name (e.g. ENTOMOLOGY).
    • Font: Inconsolata ExtraBold
    • Font size: 5.
    • Alignment: Centered.
    • Vertical Offset: 0.5mm from the top for the first line and 6.5mm from the bottom for the last line.
  4. Datamatrix:

    • Encodes the number with leading zeros to achieve a total of 9 digits.
    • Size: 6.5x6.5mm.
    • Placement: In the bottom right corner of the label.
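For the NHMA label, the pin dot position and its clearance from the datamatrix follow from simple arithmetic, sketched below. The function name is hypothetical, coordinates are measured from the bottom-left corner, and the datamatrix is assumed to sit flush against the bottom and right edges.

```python
from __future__ import annotations


def nhma_layout(width: float = 19.0, height: float = 14.0,
                datamatrix_size: float = 6.5, pin_radius: float = 0.25,
                pin_fraction: float = 0.2):
    """Return (pin_center, datamatrix_box, clearance) in mm from the bottom-left."""
    # Pin dot: centred horizontally, 0.2 x height above the bottom edge.
    pin_center = (width / 2, pin_fraction * height)
    # Datamatrix box as (left, bottom, right, top), assumed flush with the corner.
    datamatrix_box = (width - datamatrix_size, 0.0, width, datamatrix_size)
    # Horizontal gap between the right edge of the dot and the datamatrix.
    clearance = datamatrix_box[0] - (pin_center[0] + pin_radius)
    return pin_center, datamatrix_box, clearance


if __name__ == "__main__":
    pin, box, gap = nhma_layout()
    print(pin, box, gap)
```

With the specified values the dot is centred at 9.5mm across and about 2.8mm up, leaving roughly 2.75mm between the dot and the left edge of the datamatrix.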

See https://github.com/lauritsf/pinned-datamatrix-label-generator for the code that generates the labels.

PipBrewer commented 1 year ago

Note that NHMA and NHMD Entomology agreed that leading zeros will be encoded in the data matrices, both because this makes it possible to validate the numbers and because Specify 7 requires it (it enables exact matching of records). For NHMD this was agreed in person. The email regarding the decision from NHMA is attached here.

Re_ Insect barcode labels.pdf

chelseagraham commented 7 months ago

Initial NHMA Decision emails: RE: Insect barcode labels.pdf