It is easy to focus mainly on the architecture or algorithms of an AI model, but in practice data quality matters as much as model performance. In this competition, we take a Data-Centric AI approach to an OCR task: detecting text in multilingual (Chinese, Japanese, Thai, Vietnamese) receipt images.
Goal : Develop a model that detects text regions in multilingual receipt images
Data : JPG images containing text, annotated in UFO format (Train Data: 400 images, Test Data: 120 images)
Metric : DetEval(Final Precision, Final Recall, Final F1-Score)
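As a rough illustration of the Data entry above, the sketch below walks a UFO-style annotation and lists the polygon of every word. The JSON layout (`images` → `words` → `points`), the file name `receipt_0001.jpg`, and the helper `iter_boxes` are assumptions made for this example; check them against the actual annotation files before relying on it.

```python
import json

# Tiny inline sample in the ASSUMED UFO layout: images -> words -> points.
# The file name and transcriptions are hypothetical.
sample = json.loads("""
{
  "images": {
    "receipt_0001.jpg": {
      "words": {
        "0001": {"points": [[10, 10], [110, 10], [110, 40], [10, 40]],
                 "transcription": "TOTAL"},
        "0002": {"points": [[120, 10], [180, 10], [180, 40], [120, 40]],
                 "transcription": "12.50"}
      }
    }
  }
}
""")


def iter_boxes(ufo):
    """Yield (image_name, word_id, points) for every annotated word."""
    for image_name, image in ufo["images"].items():
        for word_id, word in image["words"].items():
            yield image_name, word_id, word["points"]


boxes = list(iter_boxes(sample))
for image_name, word_id, points in boxes:
    print(image_name, word_id, points)
```

In a real run, `sample` would come from `json.load` on one of the files under a `ufo` directory instead of the inline string.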
In the early stage, we ran EDA and analyzed the baseline code to build a basic understanding of the data and the model. We then ran a range of experiments that used external and synthetic data and applied data cleaning and augmentation techniques to improve the model's generalization performance. Finally, we applied a 5-fold ensemble to obtain the best performance.
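The source does not say how the five folds were built or how the per-fold predictions were merged. The sketch below covers only the splitting step, under the assumption of a plain seeded shuffle over the 400 training images; `k_fold_splits` and the `img_###.jpg` names are hypothetical.

```python
import random


def k_fold_splits(items, k=5, seed=42):
    """Yield (train, val) lists for each of k folds after a seeded shuffle."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    # Stride slicing gives k folds whose sizes differ by at most one item.
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        val = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, val


# Hypothetical file names standing in for the 400 training images.
image_names = [f"img_{i:03d}.jpg" for i in range(400)]
splits = list(k_fold_splits(image_names, k=5))
for fold_idx, (train, val) in enumerate(splits):
    print(f"fold {fold_idx}: {len(train)} train / {len(val)} val")
```

Each fold would train one model on its 320 images and validate on the remaining 80; the five models' detections then need a merging rule, which is left open here.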
As a result, we achieved precision 0.9427, recall 0.8801, and F1 0.9103, placing 4th on the leaderboard.
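As a quick consistency check on these numbers, and assuming the final F1-Score is the harmonic mean of the final precision and recall, the reported F1 can be recomputed from the other two values. The helper name `f1_score` is introduced only for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.9427, 0.8801  # values reported above
f1 = f1_score(precision, recall)
print(f"{f1:.4f}")  # prints 0.9103
```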
The baseline model is EAST (An Efficient and Accurate Scene Text Detector; Zhou et al., 2017), and its backbone is VGG-16 (Visual Geometry Group - 16 layers; Simonyan and Zisserman, 2015) pretrained on ImageNet.
dataset
├── chinese_receipt
│   ├── img  # train and test images
│   └── ufo  # annotation files for the train and test images (UFO format)
├── japanese_receipt
│   ├── img  # train and test images
│   └── ufo  # annotation files for the train and test images (UFO format)
├── thai_receipt
│   ├── img  # train and test images
│   └── ufo  # annotation files for the train and test images (UFO format)
└── vietnamese_receipt
    ├── img  # train and test images
    └── ufo  # annotation files for the train and test images (UFO format)
cd code             # move to the code directory
python train.py     # train the model
python validate.py  # load the trained weights and run validation
python test.py      # load the weights with the highest validation score and run inference on the test dataset
├── .github
├── external-data
│   ├── cord-data
│   └── synthetic-data
├── code
│   └── model code
└── README.md
This project uses CORD (Consolidated Receipt Dataset). The dataset is provided under the CORD license terms, and we adhere to these terms within this repository.
For full details on the CORD license and permissions, please refer to the official CORD documentation.
| System Information | | Tools and Libraries | |
|---|---|---|---|
| Category | Details | Category | Details |
| Operating System | Linux 5.4.0 | Git | 2.25.1 |
| Python | 3.10.13 | Conda | 23.9.0 |
| GPU | Tesla V100-SXM2-32GB | Tmux | 3.0a |
| CUDA | 12.2 | | |
© 2024 LuckyVicky Team.
Supported by Naver BoostCamp AI Tech.