This deep learning project is the first part of an application that automatically enlarges or shrinks the text inside speech balloons when people read manga on a phone. To implement the application, we plan to use a deep learning model to recognize the elements of a manga page, such as balloons and faces. We chose Manga109 as the dataset and YOLOv3 as the deep learning model.
A Keras implementation of YOLOv3 (TensorFlow backend) inspired by allanzelener/YAD2K.
This data set (hereafter referred to as Manga109) has been compiled by the Aizawa Yamasaki Laboratory, Department of Information and Communication Engineering, the Graduate School of Information Science and Technology, the University of Tokyo. The compilation is intended for use in academic research on the media processing of Japanese manga. Manga109 is composed of 109 manga volumes drawn by professional manga artists in Japan.
Generate your own annotation file and class names file.
One row for one image;
Row format: image_file_path box1 box2 ... boxN;
Box format: x_min,y_min,x_max,y_max,class_id (no space).
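The row format above can be sketched in a few lines of Python. This is a minimal illustration, not code from this repository: the helper names `format_row` and `parse_row` and the example image path are hypothetical, and the sketch assumes that image paths contain no spaces and that coordinates are integers.

```python
def format_row(image_path, boxes):
    """Build one annotation row: the image path followed by space-separated boxes.

    Each box is (x_min, y_min, x_max, y_max, class_id) and is written
    as comma-separated integers with no spaces inside the box.
    """
    parts = [image_path]
    for box in boxes:
        parts.append(",".join(str(int(value)) for value in box))
    return " ".join(parts)


def parse_row(row):
    """Split one annotation row back into the image path and a list of boxes."""
    fields = row.split()
    image_path = fields[0]
    boxes = [tuple(int(value) for value in field.split(",")) for field in fields[1:]]
    return image_path, boxes


# Hypothetical page with one box of class 0 and one box of class 1.
row = format_row(
    "images/page_001.jpg",
    [(50, 100, 150, 200, 0), (30, 50, 200, 120, 1)],
)
print(row)  # images/page_001.jpg 50,100,150,200,0 30,50,200,120,1
```

The class IDs are assumed to be the zero-based positions of the class names in the class names file.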
Use convert.py to convert the model. The file model_data/yolo_weights.h5 is used to load the pretrained weights.
Modify train.py and start training.
python train.py
Use your trained weights or checkpoint weights with the command line option --model model_file when running yolo_video.py. Remember to modify the class path or anchor path with --classes class_file and --anchors anchor_file.
Use compute_mAP to evaluate the model.
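For readers unfamiliar with the metric, the following is a generic sketch of how average precision (AP) for a single class can be computed from ranked detections. It is not the code in compute_mAP, whose exact procedure is not described here. The function name `average_precision` and its input format are assumptions for illustration, and the sketch uses all-point interpolation of the precision-recall curve. It also assumes each detection has already been marked as a true or false positive, for example by an IoU threshold against the ground truth.

```python
def average_precision(detections, num_ground_truth):
    """Compute AP for one class with all-point interpolation.

    detections: list of (confidence, is_true_positive) pairs.
    num_ground_truth: number of ground-truth boxes of this class.
    """
    if num_ground_truth == 0:
        return 0.0

    # Rank detections from most to least confident.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / num_ground_truth)

    # Replace each precision with the maximum precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum the area under the interpolated precision-recall curve.
    ap = 0.0
    previous_recall = 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


# Hypothetical detections for one class with two ground-truth boxes.
face_ap = average_precision([(0.9, True), (0.8, False), (0.7, True)], 2)
print(round(face_ap, 4))
```

The mean average precision (mAP) is then the mean of the per-class AP values, for example over the face and text classes.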
- Python 3.5.6
- Keras 2.1.5
- TensorFlow 1.6.0
Two classes (face and text)
Tested images with ground truth
@KAORU KAWAKATA
Face AP
Four classes (face, text, frame, body)