junxnone / aiwiki

AI Wiki
https://junxnone.github.io/aiwiki

ML Tasks Image OCR SW #188

Open junxnone opened 4 years ago

junxnone commented 4 years ago

Sliding Windows CNN CTC

Reference

Brief

Arch

| Name | Description |
| --- | --- |
| Sliding window | Crop N bounding boxes (windows) from the input image |
| Classification layer | feature --> label |
| CTC transcription layer | label --> final prediction |

Preprocess & Sliding Windows

| Preprocess | Training | Testing |
| --- | --- | --- |
| Input | Scale to height = 32, width scaled proportionally (pad to 256) |
| Window size | Single-scale: WxH = 32x32; multi-scale: WxH = 32x24, 32x32, 32x40, each resized to 32x32 | 32x32 |
| Step size | 4 |
Figures (images not preserved): Pipeline, CNN Layers, CTC Algo, sliding window examples.
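The preprocessing and window settings above can be sketched roughly as follows. This is a minimal pure-Python illustration, not the referenced implementation: the function names (`resize_nearest`, `preprocess`, `sliding_windows`), the nearest-neighbor resize, and the zero pad value are all assumptions. It also assumes each window spans the full 32-pixel height and that the multi-scale sizes differ along the width (24, 32, 40), which is one reading of the table.

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbor resize of a grayscale image stored as a list of rows."""
    h, w = len(img), len(img[0])
    return [[img[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]


def preprocess(img, height=32, pad_width=256, pad_value=0):
    """Scale to a fixed height, keep the aspect ratio, then right-pad the width."""
    h, w = len(img), len(img[0])
    new_w = max(1, round(w * height / h))
    scaled = resize_nearest(img, height, new_w)
    if new_w < pad_width:
        # Assumption: pad on the right with a constant value.
        scaled = [row + [pad_value] * (pad_width - new_w) for row in scaled]
    # Images wider than pad_width are left as-is in this sketch.
    return scaled


def sliding_windows(img, win_w=32, step=4, out_size=32):
    """Crop full-height windows of width win_w every `step` pixels."""
    h, w = len(img), len(img[0])
    windows = []
    for x in range(0, w - win_w + 1, step):
        crop = [row[x:x + win_w] for row in img]
        if win_w != out_size:
            # Multi-scale windows are resized back to out_size x out_size.
            crop = resize_nearest(crop, out_size, out_size)
        windows.append(crop)
    return windows


if __name__ == "__main__":
    image = [[(r + c) % 256 for c in range(50)] for r in range(16)]
    padded = preprocess(image)
    single = sliding_windows(padded)                       # single-scale
    multi = [sliding_windows(padded, win_w=w) for w in (24, 32, 40)]
    print(len(padded), len(padded[0]), len(single), [len(m) for m in multi])
```

Each window would then go through the CNN classification layer, so a padded 256-pixel-wide input with step size 4 yields one label distribution per window position.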

Decoding
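The notes do not say which decoding method is used. As one common option, greedy (best-path) CTC decoding can be sketched as below: take the top label per window, collapse consecutive repeats, then drop blanks. The function name `ctc_greedy_decode` and the choice of index 0 for the blank label are assumptions for illustration.

```python
BLANK = 0  # Assumption: label index 0 is the CTC blank.


def ctc_greedy_decode(frame_scores, blank=BLANK):
    """Best-path decoding: argmax per frame, merge repeats, remove blanks."""
    # Pick the highest-scoring label for each window (frame).
    best = [max(range(len(scores)), key=scores.__getitem__)
            for scores in frame_scores]
    decoded = []
    prev = None
    for label in best:
        # A label is emitted only when it differs from the previous frame
        # and is not the blank symbol.
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


if __name__ == "__main__":
    def one_hot(i, n=3):
        return [1.0 if j == i else 0.0 for j in range(n)]

    # Per-window labels 1, 1, blank, 1, 2, 2, blank
    scores = [one_hot(i) for i in (1, 1, 0, 1, 2, 2, 0)]
    print(ctc_greedy_decode(scores))
```

The blank between the two runs of label 1 is what lets a repeated character survive the collapse step.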

junxnone commented 4 years ago

junxnone/tech-io#749