snoop2head / DotNeuralNet

👁️ Light-weight Neural Network for Optical Braille Recognition in the wild & on the book
https://huggingface.co/spaces/snoop2head/braille-detection
MIT License

question about dataset #1

Open axeasy opened 1 month ago

axeasy commented 1 month ago

Hello, I downloaded the corresponding dataset according to the readme, but ran into an issue while running the train.exe file. Could you please specify which files need to be stored in the dataset folder?

snoop2head commented 1 month ago

@axeasy Can you please specify the error? Let's begin with which file you ran in the beginning. train.exe does not exist in this repo btw

axeasy commented 1 month ago

Sorry, I typed the wrong word; I meant the train.py file. The first issue I ran into was that this file points to dataset.py, so I downloaded the four datasets from the dataset folder and put them in the corresponding folders. However, an error occurred during training because dataset.py contains self.kaggle_path, self.angelina_path, and self.dsbi_path, each pointing to a folder called "cropped image", and none of the datasets include such a folder.

snoop2head commented 1 month ago

Oh, I see.

It's a bit of a nuisance, but it is a necessary step for building the braille classifier. If you want to skip the training process, just use the YOLO models in the weights directory, where I made the detection and classification models public.

What is your purpose in using the project? Are you training from scratch or building an application based on the available models?

axeasy commented 1 month ago

Thank you for your help. First of all, I have a question. Based on the conversation so far, I want to know whether the images stored in 'cropped images' are cropped from their parent dataset. For example, should 'dataset/dsbi/cropped images' contain braille cells cropped from the images in the DSBI dataset? Secondly, I want to learn and explore the process of braille image recognition through this project and use it as the basis for my graduation thesis, so I want to go through everything from training to the final recognition step.

axeasy commented 1 month ago

I put the cropped images from ‘a’ to ‘z’ under the ‘cropped images’ folder in the angelinadataset folder, naming them in the format ‘010000’, and did the same under the ‘cropped images’ folders in the ‘DSBI’ and ‘kaggle’ folders, naming them in the format ‘a,b,c,…,z’. Now I can start training, but I want to know: if I train this way, can a dataset only have 64 images at most, since there are only 64 types of braille?

snoop2head commented 1 month ago

@axeasy Sorry, I've been away because of rebuttals for other conferences.

The project's purpose was to create a model robust to both book backgrounds and natural backgrounds, which led me to build a multi-label classification model. However, to the best of my knowledge, the braille_natural dataset only has region of interest (RoI) positions and does not provide the specific braille labels. In contrast, DSBI and AngelinaDataset contain both RoI bounding boxes and human-labeled classes. So the goal of the multi-label classification model is to recognize the 2^6 types of braille and to function as a pseudo-labeler for the natural-scene braille_natural dataset.
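
To make the 2^6 figure concrete, here is a minimal sketch of how a six-dot pattern string such as the '010000' naming mentioned earlier in this thread could map to a multi-label target. The function names and the dot ordering (leftmost character treated as the first dot) are assumptions for illustration only, not necessarily what dataset.py does.

```python
from itertools import product


def pattern_to_multihot(pattern: str) -> list[int]:
    """Turn a pattern like '010000' into six binary labels, one per dot.

    Assumption: the string has exactly six characters, each '0' or '1'.
    """
    if len(pattern) != 6 or set(pattern) - {"0", "1"}:
        raise ValueError(f"expected six characters of 0/1, got {pattern!r}")
    return [int(ch) for ch in pattern]


def multihot_to_index(bits: list[int]) -> int:
    """Pack six binary labels into a single class index in the range 0..63."""
    index = 0
    for bit in bits:
        index = index * 2 + bit
    return index


# Every combination of six raised/flat dots: 2 ** 6 = 64 patterns in total.
ALL_PATTERNS = ["".join(p) for p in product("01", repeat=6)]

if __name__ == "__main__":
    print(len(ALL_PATTERNS))
    print(pattern_to_multihot("010000"))
    print(multihot_to_index(pattern_to_multihot("010000")))
```

Note that 64 counts the possible dot combinations, that is, the label space. It does not limit how many cropped images can share the same pattern.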

After pseudo-labeling with the trained classifier, you can move on to training a detection model, which is covered in the YOLOv8 documentation. You may skip the pseudo-labeling part if your model is only going to receive braille inputs from books.
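
As a rough illustration of the hand-off from pseudo-labeling to detection training, the sketch below writes one pseudo-labeled RoI as a line in the YOLO text label format (class index followed by the box center and size, all normalized to the image dimensions). The helper name `to_yolo_line` and the pixel-corner box layout are hypothetical and not taken from this repo.

```python
def to_yolo_line(class_id: int, box: tuple, image_w: int, image_h: int) -> str:
    """Format one labeled box as 'class x_center y_center width height'.

    Assumption: box is (x_min, y_min, x_max, y_max) in pixels. All four
    output coordinates are normalized to the 0..1 range.
    """
    x_min, y_min, x_max, y_max = box
    x_center = (x_min + x_max) / 2 / image_w
    y_center = (y_min + y_max) / 2 / image_h
    width = (x_max - x_min) / image_w
    height = (y_max - y_min) / image_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


if __name__ == "__main__":
    # A hypothetical RoI in an 800x600 image, pseudo-labeled as class 16.
    print(to_yolo_line(16, (100, 50, 140, 110), 800, 600))
    # prints: 16 0.150000 0.133333 0.050000 0.100000
```

One such line per braille cell, collected in a text file next to each image, is the label layout that YOLOv8 detection training expects.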

axeasy commented 2 weeks ago

I'm sorry, I haven't been working recently because of the holiday. Could we exchange contact details on another platform to communicate?

axeasy commented 1 day ago

I would like to ask whether the train.py file in your project is used for training the object detection model.