QData / C-Tran

General Multi-label Image Classification with Transformers
MIT License

Using CTran for Barcode Value Prediction #28

Open Naumann07 opened 4 months ago

Naumann07 commented 4 months ago

Hey Jack,

I hope you're well. I'm working on a project to predict values from barcode images using your CTran model and would appreciate your guidance on its suitability. Here are the specifics of my task:

Task Description:

Objective: Predict the characters encoded in barcode images.
Image Type: Single-channel grayscale images containing barcode patterns (generated online with a Python barcode library, using CODE128B).
Labels: The characters represented by the bars in the barcodes, with a total of 108 possible characters.
Character Independence: Each bar represents a different character, and the characters are independent of each other.
No Masks: Our dataset does not include masks.

Given this setup, I have a question: are there any specific modifications or considerations needed to adapt CTran for our barcode dataset? Thank you for your time and assistance. I look forward to your guidance.

Best regards,

jacklanchantin commented 4 months ago

Hi @Naumann07 thanks for reaching out. C-Tran isn't an explicit structured output model, so it may not work super well for your task. For example, if the code is ABBC, it would only be able to tag the characters "A", "B", and "C", and not count the number of "B"s or the ordering of them. Not sure if this still suits your needs.
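
To make the limitation concrete, here is a minimal sketch of how a multi-label target collapses a string into a set of tags. This is not C-Tran code: the 26-letter `ALPHABET` and the helpers `multi_hot` and `decode` are hypothetical names for illustration only (the real task has 108 characters).

```python
import string

# Hypothetical label vocabulary for illustration; the real task has 108 characters.
ALPHABET = list(string.ascii_uppercase)
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def multi_hot(text):
    """Encode a string as a multi-label target: 1 if the character occurs, else 0."""
    target = [0] * len(ALPHABET)
    for ch in text:
        # Repeated characters set the same slot again, so counts are lost.
        target[CHAR_TO_INDEX[ch]] = 1
    return target


def decode(target):
    """Recover the set of characters present; order and counts cannot be recovered."""
    return {ALPHABET[i] for i, value in enumerate(target) if value}


# "ABBC" and "CBA" differ in order and in the number of "B"s,
# yet they map to the same multi-label target.
print(multi_hot("ABBC") == multi_hot("CBA"))
print(sorted(decode(multi_hot("ABBC"))))
```

Both strings produce an identical target, so a model trained on targets like this could at best report which characters appear in the barcode, not the sequence they spell.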