lltcggie / waifu2x-caffe

Caffe version of waifu2x
MIT License
8.04k stars · 839 forks

How to translate xxx.json to xxx.json.caffemodel? #115

Closed shampron closed 5 years ago

shampron commented 6 years ago

Hello, I'm new to deep learning and very interested in waifu2x. I have trained my own model on Ubuntu and I want to use it in waifu2x-caffe, but I found that the model files for Caffe are different from the ones on Linux, so I'd like to know how to convert them. Please, thanks.

nagadomi commented 6 years ago

It was manually translated from Torch model to caffe.prototxt. I have released caffe.prototxt at https://github.com/nagadomi/waifu2x/tree/dev/appendix/caffe_prototxt

The JSON file can be generated with tools/export_model.lua. See also https://github.com/nagadomi/waifu2x/blob/dev/tools/export_all.sh

And waifu2x-caffe supports vgg_7 (for denoising) and upconv_7 (for upscaling / upscaling+denoising). You can replace the JSON file in bin/models of waifu2x-caffe: https://github.com/lltcggie/waifu2x-caffe/tree/master/bin/models. The .caffemodel is automatically generated from the .prototxt and .json when waifu2x-caffe is launched. When replacing JSON files, delete the .caffemodel before launching waifu2x-caffe.
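As a sketch, swapping in a custom model might look like this (the model directory and file names here are assumptions, not from the repo; check your own bin/models layout):

```shell
# Assumed layout: waifu2x-caffe/bin/models/<model_dir>/ — the directory
# name below is a guess; match it to the model family you trained (vgg_7 / upconv_7).
MODEL_DIR="waifu2x-caffe/bin/models/upconv_7_anime_style_art_rgb"

# drop in the JSON exported by tools/export_model.lua
cp my_model.json "$MODEL_DIR/scale2.0x_model.json"

# delete the cached .caffemodel so waifu2x-caffe regenerates it on next launch
rm -f "$MODEL_DIR/scale2.0x_model.json.caffemodel"
```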

BlueflamesX commented 6 years ago

What's the difference between this Waifu2x and the original one? https://github.com/nagadomi/waifu2x

nagadomi commented 6 years ago

waifu2x-caffe is a windows port of the original waifu2x.

BlueflamesX commented 6 years ago

What is the original W2X for? Is the Caffe version up to date?

2ji3150 commented 6 years ago

The original waifu2x is written in Lua and designed as a web service. The models also come from it.

morozovsk commented 6 years ago

> .caffemodel is automatically generated from .prototxt and .json when waifu2x-caffe is launched. When replacing json files, delete .caffemodel before launching waifu2x-caffe.

But I can't launch it because it's for Windows and I don't have Windows. @nagadomi I'd like to use the Caffe model from OpenCV's dnn module: cv2.dnn.readNetFromCaffe(prototxt, caffemodel). I understand it's my own problem, but maybe you have the prototxt and caffemodel files?

PS: I found scale2.0x_model.prototxt and scale2.0x_model.json.caffemodel, but opencv doesn't work:

OpenCV Error: Unspecified error (Can't create layer "target" of type "MemoryData") in getLayerInstance, file /io/opencv/modules/dnn/src/dnn.cpp, line 257

Maybe it's a problem with opencv.

nagadomi commented 6 years ago

The last two layer blocks, target and loss, can be removed from the prototxt. They are not used.
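A small sketch of doing that removal automatically (the brace-counting approach is my own, not a tool from the repo; it assumes each block starts with a line beginning with `layer`):

```python
def strip_layers(prototxt_text, names=("target", "loss")):
    """Remove whole `layer { ... }` blocks whose name matches `names`."""
    lines = prototxt_text.splitlines(keepends=True)
    out, i = [], 0
    while i < len(lines):
        if lines[i].lstrip().startswith("layer"):
            # collect the full block, tracking nested { } depth
            depth, opened, block = 0, False, []
            while i < len(lines):
                depth += lines[i].count("{") - lines[i].count("}")
                opened = opened or "{" in lines[i]
                block.append(lines[i])
                i += 1
                if opened and depth == 0:
                    break
            # keep the block only if its name is not in the removal list
            if not any('name: "%s"' % n in "".join(block) for n in names):
                out.extend(block)
        else:
            out.append(lines[i])
            i += 1
    return "".join(out)
```

Running this over scale2.0x_model.prototxt should leave the convolution layers intact while dropping the two trailing MemoryData/loss blocks that OpenCV cannot instantiate.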

morozovsk commented 6 years ago

@nagadomi thank you very much. So I wrote this Python code:

import numpy as np
import cv2

net = cv2.dnn.readNetFromCaffe("scale2.0x_model.prototxt", "scale2.0x_model.json.caffemodel")

image = cv2.imread('icon_64x64.png')
print(image.shape) # (64, 64, 3)

blob = cv2.dnn.blobFromImage(image)

net.setInput(blob)
newBlob = net.forward()
print(newBlob.shape) # (1, 3, 100, 100) # (1, channels, height, width)

newBlob = newBlob.reshape([3, 100, 100])
print(newBlob.shape) # (channels, height, width)

newBlob = np.transpose(newBlob, (1, 2, 0))
print(newBlob.shape) # (height, width, channels)

cv2.imwrite('icon_100x100.png', newBlob)

(attached images: icon_64x64 input, icon_100x100 output)

I think it's because you convert BGR->YUV before and YUV->BGR after.

So I have tried :)

nagadomi commented 6 years ago

With the upconv_7 model, the input image format is RGB float scaled to 0.0-1.0 (not 0-255 scaled CV_8UC3 BGR).

To upscale exactly 2x, it is necessary to add 7px of padding to the input image (this can be done with cv2.copyMakeBorder and BORDER_REPLICATE). When enlarging a large image, it is necessary to split and convert it in blocks because of GPU memory usage.
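A quick sanity check of that padding arithmetic, based on the 64→100 output size seen in the code above (the 2n−28 formula is my reading of that observed behavior, not something stated in the repo):

```python
def upconv7_output_side(input_side):
    # upconv_7 appears to double the input while losing a 28px total border,
    # i.e. output = 2 * input - 28 (consistent with 64 -> 100 above)
    return 2 * input_side - 28
```

So padding each side by 7px (14px total) makes the lost border exactly cancel out: a 64px image padded to 78px comes out as 128px, an exact 2x.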

morozovsk commented 6 years ago

@nagadomi thank you very much. So my Python code is now:

import numpy as np
import cv2

net = cv2.dnn.readNetFromCaffe("scale2.0x_model.prototxt", "scale2.0x_model.json.caffemodel")

image = cv2.imread('icon_64x64.png')
image = cv2.copyMakeBorder(image, 7, 7, 7, 7, cv2.BORDER_REPLICATE)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
print(image.shape) # (78, 78, 3)

blob = cv2.dnn.blobFromImage(image)
print(blob.shape) # (1, 3, 78, 78)
blob = blob / 255

net.setInput(blob)
newBlob = net.forward()
print(newBlob.shape) # (1, 3, 128, 128) # (1, channels, height, width)

newBlob = newBlob.reshape([newBlob.shape[1], newBlob.shape[2], newBlob.shape[3]])
print(newBlob.shape) # (channels, height, width)

newBlob = np.transpose(newBlob, (1, 2, 0))
print(newBlob.shape) # (height, width, channels)
newBlob = newBlob * 255

cv2.imwrite('icon_128x128.png', newBlob)

(attached image: icon_100x100)

I also implemented it in PHP.

FantasyJXF commented 4 years ago

@morozovsk You should convert the final newBlob back to BGR to get the required image.

It's very nice of you to share your approach of using waifu2x with the OpenCV dnn module.

FantasyJXF commented 4 years ago

@morozovsk @nagadomi Does OpenCV dnn have a memory limit? When I use the above code on a large image of size 916x636, it runs into a segmentation fault. (attached image: miku_4x_wf)

Looking forward to your reply.

nagadomi commented 4 years ago

@FantasyJXF waifu2x (and waifu2x-caffe) uses an algorithm that can process large images with less GPU memory. It splits the image into smaller blocks for processing, and finally combines them. It's like tiled rendering in 3DCG.

FantasyJXF commented 4 years ago

@nagadomi Thanks for your reply.

It's really tricky to use a divide-and-conquer approach on large images. Would you please tell me what algorithm waifu2x and waifu2x-caffe use to do this? If I simply combine the upscaled blocks, there might be incoherence at the seams, like a visible line or something. One way to solve this might be to make the divided blocks overlap and remove the overlap when combining. But since we have already padded the image during preprocessing, this might not be a problem.

nagadomi commented 4 years ago

@FantasyJXF The upconv_7 model (used in the code above) requires 7px of padding around each input block. Output block overlap is not required. The padding size depends on the model and conversion mode:

| model | mode | input padding size |
| --- | --- | --- |
| upconv_7 | upscaling | 7 |
| cunet | upscaling | 18 |
| cunet | denoising | 28 |

The block size for cunet must be a multiple of four.