Note: This is not the final version of the code. I will refine and update it.
This project provides models that detect speech bubbles in webtoons and cartoons. I referenced and adapted pytorch-YOLOv4 to build the detector. The key to improving performance is data analysis: speech bubbles come in many forms, so I define a taxonomy of speech bubbles and present training results that take the data distribution into account.
Key standards for data definition: Shape, Color, Form
Standards
Shape: Ellipse (tawon), Thorn (gasi), Sea urchin (seonggye), Rectangle (sagak), Cloud (gurm)
Color: Black/white (bw), Colorful (color), Transparency (tran), Gradation
Form: Basic, Double speech bubble, Multi-external, Scatter-type
example image
In this project, only two categories are applied: shape and color. Form and Gradation are grouped together as etc.
This classification is not used for detection itself; it describes the distribution of the speech bubble data.
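The shape and color categories above combine into 15 labels. The following is a minimal sketch of how those labels could be generated, assuming they are joined as `shape_color` in the same order as the distribution table in this README. The names `SHAPES`, `COLORS`, and `build_class_names` are illustrative and do not come from the repository.

```python
from itertools import product

# Romanized shape names and color keys, spelled as in the distribution table.
SHAPES = ["tawon", "gasi", "seonggye", "sagak", "gurm"]
COLORS = ["bw", "color", "Transparency"]


def build_class_names(shapes=SHAPES, colors=COLORS):
    """Return one label per (shape, color) pair, in shape-major order."""
    return [f"{shape}_{color}" for shape, color in product(shapes, colors)]


if __name__ == "__main__":
    names = build_class_names()
    print(len(names))   # number of shape/color combinations
    print(names[:3])    # the three labels for the first shape
```

Generating the labels from two short lists keeps the naming consistent if a shape or color is added later.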
PyTorch Version
Install Dependencies Code
pip install onnxruntime numpy torch tensorboardX scikit_image tqdm easydict Pillow opencv_python pycocotools
or
pip install -r requirements.txt
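After installing, it may help to confirm that every dependency is importable before starting a long training run. This is a rough sketch, not part of the repository: the helper `missing_packages` and the `REQUIRED_MODULES` mapping are hypothetical, and the mapping simply pairs each pip name from the command above with the module name it is normally imported under.

```python
import importlib.util

# pip package name -> import name (these differ for some packages).
REQUIRED_MODULES = {
    "onnxruntime": "onnxruntime",
    "numpy": "numpy",
    "torch": "torch",
    "tensorboardX": "tensorboardX",
    "scikit_image": "skimage",
    "tqdm": "tqdm",
    "easydict": "easydict",
    "Pillow": "PIL",
    "opencv_python": "cv2",
    "pycocotools": "pycocotools",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the pip names whose import module cannot be found."""
    return [
        pip_name
        for pip_name, module_name in required.items()
        if importlib.util.find_spec(module_name) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies found.")
```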
Model | Link |
---|---|
YOLOv4 | Link |
YOLOv4-bubble | Link |
1. Download weight
2. Train
python train.py -g gpu_id -classes num_classes -dir 'data_dir' -pretrained 'pretrained_model.pth'
or
Train.sh
3. Config setting
cfg.py
cfg/yolov4.cfg
To train on a custom dataset, adjust the settings in the files above.
python demo.py -cfgfile cfgfile -weightfile pretrained_model.pth -imgfile image_dir
./cfg/yolov4.cfg
tawon_bw | tawon_color | tawon_Transparency | gasi_bw | gasi_color | gasi_Transparency | seonggye_bw | seonggye_color | seonggye_Transparency | sagak_bw | sagak_color | sagak_Transparency | gurm_bw | gurm_color | gurm_Transparency | total |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
116 | 70 | 68 | 65 | 29 | 59 | 51 | 43 | 44 | 42 | 33 | 69 | 47 | 2 | 12 | 750 |
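To read the imbalance in the table more easily, the counts can be aggregated per shape. The sketch below assumes the counts are copied by hand from the table above; `COUNTS` and `totals_by_shape` are illustrative names, not part of the repository.

```python
# Per-class sample counts, copied from the distribution table.
COUNTS = {
    "tawon_bw": 116, "tawon_color": 70, "tawon_Transparency": 68,
    "gasi_bw": 65, "gasi_color": 29, "gasi_Transparency": 59,
    "seonggye_bw": 51, "seonggye_color": 43, "seonggye_Transparency": 44,
    "sagak_bw": 42, "sagak_color": 33, "sagak_Transparency": 69,
    "gurm_bw": 47, "gurm_color": 2, "gurm_Transparency": 12,
}


def totals_by_shape(counts):
    """Sum the counts of all color variants belonging to each shape."""
    totals = {}
    for label, n in counts.items():
        shape = label.split("_", 1)[0]  # text before the first underscore
        totals[shape] = totals.get(shape, 0) + n
    return totals


if __name__ == "__main__":
    total = sum(COUNTS.values())
    for shape, n in totals_by_shape(COUNTS).items():
        print(f"{shape}: {n} ({n / total:.1%})")
```

With these numbers, tawon accounts for 254 of the 750 samples while gurm has only 61, and gurm_color has just 2, so the rarest classes are likely to need extra data or augmentation.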