Closed Superfloh closed 4 years ago
Web demo uses latest release ( https://github.com/KichangKim/DeepDanbooru/releases/tag/v3-20200915-sgd-e30 ).
"Use Cropping" is simple: it splits the image into multiple small, overlapping parts and estimates tags independently for each part. Then it combines all estimated tags with filtering (removing tags mis-estimated due to the splitting).
Sadly I'm getting different results with the v3 model and the web version: usually very similar tags, but different scores (not using the cropping option). The "Use Cropping" idea is pretty interesting; would you mind releasing the code for that? ^^
Also, on an unrelated side note, the requirements.txt still pins tensorflow>=2.1.0.
Model v3 result:
Web result:
For historical reasons, the web demo and the training program use different image pre-processing steps, so they produce slightly different results. I don't have any plan to release the full web demo code yet, but here are the parts of its image pre-processing and cropping code. You can use this to generate the same results as the web demo.
```python
# image_utility.py
import math

import numpy as np
import skimage.transform
import tensorflow as tf


def calculate_image_scale(source_width, source_height, target_width, target_height):
    """
    Calculate scale for image resizing while preserving aspect ratio.
    """
    if source_width == target_width and source_height == target_height:
        return 1.0

    source_ratio = source_width / source_height
    target_ratio = target_width / target_height

    if target_ratio < source_ratio:
        scale = target_width / source_width
    else:
        scale = target_height / source_height

    return scale


def transform_and_pad_image(image, target_width, target_height, scale=None,
                            rotation=None, shift=None, order=1, mode='edge'):
    """
    Transform image and pad by edge pixels.
    """
    image_width = image.shape[1]
    image_height = image.shape[0]
    image_array = image

    # centerize
    t = skimage.transform.AffineTransform(
        translation=(-image_width * 0.5, -image_height * 0.5))

    if scale:
        t += skimage.transform.AffineTransform(scale=(scale, scale))

    if rotation:
        radian = (rotation / 180.0) * math.pi
        t += skimage.transform.AffineTransform(rotation=radian)

    t += skimage.transform.AffineTransform(
        translation=(target_width * 0.5, target_height * 0.5))

    if shift:
        t += skimage.transform.AffineTransform(
            translation=(target_width * shift[0], target_height * shift[1]))

    warp_shape = (target_height, target_width)
    image_array = skimage.transform.warp(
        image_array, t.inverse, output_shape=warp_shape, order=order, mode=mode)

    return image_array


def crop_image(image, crop_box_ratio):
    width = image.shape[1]
    height = image.shape[0]

    (left_ratio, upper_ratio, right_ratio, lower_ratio) = crop_box_ratio

    width_start = int(width * left_ratio)
    width_end = int(width * right_ratio)
    height_start = int(height * upper_ratio)
    height_end = int(height * lower_ratio)

    return image[height_start:height_end, width_start:width_end, :]


def create_crop_box_ratio_list(ratio):
    return [
        (0, 0, ratio, ratio),
        (1 - ratio, 0, 1, ratio),
        (0, 1 - ratio, ratio, 1),
        (1 - ratio, 1 - ratio, 1, 1),
        ((1 - ratio) * 0.5, (1 - ratio) * 0.5,
         (1 + ratio) * 0.5, (1 + ratio) * 0.5),
    ]


def transform_image(image, width, height):
    source_height = image.shape[0]
    source_width = image.shape[1]

    scale = calculate_image_scale(source_width, source_height, width, height)
    image = transform_and_pad_image(image, width, height, scale=scale)

    return image / 255.0


def load_image(path):
    image_raw = tf.io.read_file(path)
    image = tf.io.decode_png(image_raw, channels=3)
    return image.numpy().astype(np.float32)


def resize_image(image, size):
    return tf.image.resize(image, size=size, method=tf.image.ResizeMethod.AREA,
                           preserve_aspect_ratio=True).numpy()
```
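To sanity-check the geometry, here is a quick standalone sketch that re-defines the two pure helpers from above (no image libraries needed) and shows what they compute for an example size and the ratio the web demo uses:

```python
def calculate_image_scale(source_width, source_height, target_width, target_height):
    # Scale that fits the source into the target while preserving aspect ratio.
    if source_width == target_width and source_height == target_height:
        return 1.0
    source_ratio = source_width / source_height
    target_ratio = target_width / target_height
    if target_ratio < source_ratio:
        return target_width / source_width
    return target_height / source_height


def create_crop_box_ratio_list(ratio):
    # Four corner crops plus one centered crop, as (left, upper, right, lower) ratios.
    return [
        (0, 0, ratio, ratio),
        (1 - ratio, 0, 1, ratio),
        (0, 1 - ratio, ratio, 1),
        (1 - ratio, 1 - ratio, 1, 1),
        ((1 - ratio) * 0.5, (1 - ratio) * 0.5,
         (1 + ratio) * 0.5, (1 + ratio) * 0.5),
    ]


# A wide 1024x512 source fitted into a 512x512 model input: the source ratio
# (2.0) exceeds the target ratio (1.0), so width is limiting and scale = 0.5.
print(calculate_image_scale(1024, 512, 512, 512))  # 0.5

# ratio=0.6 yields four 60%-sized corner crops plus one centered crop
# covering roughly (0.2, 0.2, 0.8, 0.8) of the image.
for box in create_crop_box_ratio_list(0.6):
    print(box)
```

The overlapping layout means every region of the image appears in at least one crop, which is why the later max-merge over crop predictions can only raise scores, never lower them.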
The core method is transform_image(). Also, here is the cropping code:
```python
y = model.predict(image_transformed)[0]

if crop == 'true':
    crop_box_ratio_list = image_utility.create_crop_box_ratio_list(0.6)

    for crop_box_ratio in crop_box_ratio_list:
        image_crop = image_utility.crop_image(image, crop_box_ratio)
        image_crop = image_utility.transform_image(
            image_crop, image_width, image_height)
        image_crop = image_crop.reshape(
            (1, image_crop.shape[0], image_crop.shape[1], image_crop.shape[2]))

        y_crop = model.predict(image_crop)[0]
        y_crop = np.multiply(
            y_crop, project_data['crop_exclude_tags_vector'])

        y = np.maximum(y, y_crop)
```
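The merge step is just an element-wise masked maximum. With toy numbers (the scores and tag assignment here are hypothetical, purely for illustration), it behaves like this:

```python
import numpy as np

# Hypothetical scores for three tags, from the full image and from one crop.
y = np.array([0.90, 0.10, 0.30])        # full-image predictions
y_crop = np.array([0.20, 0.85, 0.95])   # crop predictions

# Mask: suppose tag 2 (e.g. 'solo') is in exclude_tags, so crops may not raise it.
crop_exclude_tags_vector = np.array([1.0, 1.0, 0.0])

y_crop = np.multiply(y_crop, crop_exclude_tags_vector)  # zero out excluded tags
y = np.maximum(y, y_crop)
# Tags 0 and 2 keep their full-image scores; tag 1 is boosted by the crop.
```

Masking before the maximum is what prevents crops from promoting composition tags like `solo` or `upper_body` that are only true of a crop, not of the whole image.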
Thank you very much; loading and pre-processing the image with that code indeed gives the same result as on the web page.
For the cropping I'm missing project_data['crop_exclude_tags_vector']; it doesn't exist in the project.json.
It is a simple mask vector (0 or 1). If the tag exists in exclude_tags, its value is 0; otherwise it is 1.
Here is my exclude_tags:
1boy
2boys
3boys
4boys
5boys
6+boys
1girl
2girls
3girls
4girls
5girls
6+girls
1koma
2koma
3koma
4koma
5koma
solo
solo_focus
text_focus
ass_focus
male_focus
out-of-frame_censoring
out_of_frame
feet_out_of_frame
head_out_of_frame
lower_body
upper_body
portrait
close-up
rating:safe
rating:questionable
rating:explicit
score:very_bad
score:bad
score:average
score:good
score:very_good
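Given such an exclude list, the mask vector can be built in one pass over the model's tag list. This is a minimal sketch (the helper name and the tiny tag list are my own, not from the project):

```python
import numpy as np

def build_crop_exclude_vector(all_tags, exclude_tags):
    """Return a 0/1 mask: 0 for excluded tags, 1 for tags a crop may contribute."""
    exclude = set(exclude_tags)
    return np.array([0.0 if tag in exclude else 1.0 for tag in all_tags])

# Example with a tiny hypothetical tag list:
all_tags = ['1girl', 'long_hair', 'solo', 'smile']
exclude_tags = ['1girl', 'solo']
print(build_crop_exclude_vector(all_tags, exclude_tags))  # [0. 1. 0. 1.]
```

The order of the vector must match the model's tag output order, i.e. the order of tags.txt, or the mask will zero out the wrong scores.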
I made a vector out of the tags mentioned above, and I'm getting the same result as the web version now.
In case someone else is interested in the cropping feature, here is my code:
```python
project_context, model, tags = dd.project.load_project(project_path)

width = model.input_shape[2]
height = model.input_shape[1]

try:
    image = load_image(image_path)
    image_transformed = transform_image(image, width=width, height=height)
except Exception:
    print('error loading the image')
    continue

image_shape = image_transformed.shape
image_transformed = image_transformed.reshape(
    (1, image_shape[0], image_shape[1], image_shape[2]))

y = model.predict(image_transformed)[0]

if crop == 'true':
    # 0/1 mask, one value per tag, same order as the model's tag list
    exclude_tags = np.fromfile(project_path + '/exclude_tags.txt', dtype=int, sep='\n')
    crop_box_ratio_list = create_crop_box_ratio_list(0.6)

    for crop_box_ratio in crop_box_ratio_list:
        image_crop = crop_image(image, crop_box_ratio)
        image_crop = transform_image(image_crop, width=width, height=height)
        image_crop = image_crop.reshape(
            (1, image_crop.shape[0], image_crop.shape[1], image_crop.shape[2]))

        y_crop = model.predict(image_crop)[0]
        y_crop = np.multiply(y_crop, exclude_tags)
        y = np.maximum(y, y_crop)
```
And here is the vector in exclude_tags.txt:
Thank you for your help.
@rachmadaniHaryono
Hi,
just a short question: which model version does the web version use, and what do you do when the "Use Cropping" option is enabled? That option works well with manga pages, so I'm interested in how it works.