snoop2head / DotNeuralNet

👁️ Light-weight Neural Network for Optical Braille Recognition in the wild
https://huggingface.co/spaces/snoop2head/braille-detection
MIT License
braille object-detection optical-braille-recognition

DotNeuralNet

Streamlit Demo App

A lightweight neural network for optical Braille recognition, both in natural scenes ("in the wild") and on book pages.

Repository Structure

DotNeuralNet
ㄴ assets - example images and train/val logs
ㄴ dataset
  ㄴ AngelinaDataset - book background
  ㄴ braille_natural - natural scene background
  ㄴ DSBI - book background
  ㄴ KaggleDataset - arbitrary 6 dots
  ㄴ yolo.yaml - yolo dataset config
ㄴ src
  ㄴ utils
    ㄴ angelina_utils.py
    ㄴ braille_natural_utils.py
    ㄴ dsbi_utils.py
    ㄴ kaggle_utils.py
  ㄴ crop_bbox.py
  ㄴ dataset.py
  ㄴ model.py
  ㄴ pseudo_label.py
  ㄴ train.py
  ㄴ visualize.py
ㄴ weights
  ㄴ yolov5_braille.pt # yolov5-m checkpoint
  ㄴ yolov8_braille.pt # yolov8-m checkpoint

Result

Logs

Installation

CV2 and Yolo Dependency Installation

# system libraries required by OpenCV (cv2)
apt-get update && apt-get install -y ffmpeg libsm6 libxext6
# clone YOLOv5 and install its Python dependencies
git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt

How to Run

from PIL import Image  # a bare "import PIL" does not reliably expose PIL.Image
from ultralytics import YOLO
from convert import convert_to_braille_unicode, parse_xywh_and_class

def load_model(model_path):
    """Load a YOLO model from a checkpoint path."""
    model = YOLO(model_path)
    return model

def load_image(image_path):
    """Load an image from a file path."""
    image = Image.open(image_path)
    return image

# constants
CONF = 0.15  # confidence threshold; adjust as needed
MODEL_PATH = "./weights/yolov8_braille.pt"
IMAGE_PATH = "./assets/alpha-numeric.jpeg"

# run inference and collect the detected boxes
image = load_image(IMAGE_PATH)
model = load_model(MODEL_PATH)
res = model.predict(image, save=True, save_txt=True, exist_ok=True, conf=CONF)
boxes = res[0].boxes  # detections for the first (and only) image
list_boxes = parse_xywh_and_class(boxes)

result = ""
for box_line in list_boxes:
    str_left_to_right = ""
    box_classes = box_line[:, -1]
    for each_class in box_classes:
        str_left_to_right += convert_to_braille_unicode(model.names[int(each_class)])
    result += str_left_to_right + "\n"

print(result)
"""
⠁⠃⠉⠋⠙⠑⠙⠋⠛⠓⠊⠑
⠓⠇⠇⠍⠝⠕⠏⠟⠗
⠎⠞⠥⠼⠗⠭⠵
⠼⠧⠚⠁⠃⠉⠙⠑⠙⠛⠚⠊⠑
"""

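The two helpers imported from convert are not shown in this README. The sketch below illustrates one way such helpers could work; it is not the repository's implementation. The function names dots_to_braille and group_into_lines are hypothetical, and so are two assumptions about the data: that each class name is a six-character string of 0/1 flags for dots 1 to 6, and that each detection is a row of (x_center, y_center, width, height, class_id). The Unicode mapping itself is standard: the Braille Patterns block starts at U+2800, and dot n sets bit n-1 of the offset.

```python
import numpy as np

BRAILLE_BASE = 0x2800  # U+2800 is the blank Braille pattern


def dots_to_braille(dot_string):
    """Map a six-character dot string such as "110000" to a Braille character.

    Assumption: character i (0-based) is "1" when dot i+1 is raised.
    """
    if len(dot_string) != 6 or set(dot_string) - {"0", "1"}:
        raise ValueError(f"expected six 0/1 characters, got {dot_string!r}")
    # Dot n corresponds to bit n-1 of the offset from U+2800.
    offset = sum(1 << i for i, flag in enumerate(dot_string) if flag == "1")
    return chr(BRAILLE_BASE + offset)


def group_into_lines(boxes, y_tolerance=0.5):
    """Group rows of (x_center, y_center, width, height, class_id) into lines.

    Boxes are sorted top to bottom. A new line starts when the vertical gap to
    the previous box exceeds y_tolerance times the mean box height. Each line
    is then sorted left to right.
    """
    boxes = np.asarray(boxes, dtype=float)
    if boxes.size == 0:
        return []
    boxes = boxes[np.argsort(boxes[:, 1])]
    threshold = y_tolerance * boxes[:, 3].mean()
    breaks = np.where(np.diff(boxes[:, 1]) > threshold)[0] + 1
    lines = np.split(boxes, breaks)
    return [line[np.argsort(line[:, 0])] for line in lines]


# Hypothetical detections: three cells on the first line, one on the second.
demo_boxes = [
    [30, 10, 8, 12, 2],
    [10, 11, 8, 12, 0],
    [20, 9, 8, 12, 1],
    [10, 40, 8, 12, 3],
]
# Hypothetical class-id to dot-string table (stands in for model.names).
demo_names = {0: "100000", 1: "110000", 2: "100100", 3: "100110"}

for line in group_into_lines(demo_boxes):
    print("".join(dots_to_braille(demo_names[int(c)]) for c in line[:, -1]))
```

The y_tolerance heuristic is only a starting point: photographs taken at an angle produce slanted lines, which may need a more robust grouping step such as clustering on y after deskewing.
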
Citation

If you find DotNeuralNet useful for your research, please consider citing the repository:

@misc{ahn2023dotneuralnet,
  author={Ahn, Young Jin},
  title={DotNeuralNet: Light-weight Neural Network for Optical Braille Recognition in the Wild},
  year={2023},
}