titu1994 / Neural-Style-Transfer

Keras Implementation of Neural Style Transfer from the paper "A Neural Algorithm of Artistic Style" (http://arxiv.org/abs/1508.06576) in Keras 2.0+
Apache License 2.0

[Question] Content mask format #62

Open mfxuus opened 4 years ago

mfxuus commented 4 years ago

I've been exploring INetwork.py, and the regular neural style transfer works well, but if I add the --content_mask option it results in the following error:

(env) PS E:\2_github_projects\neural_style_transfer\script_helper> python ./Script/INetwork.py "E:\2_github_projects\neural_style_transfer\script_helper\MichaelsScript\images\content\resized\wechat_laoma.jpg" "E:\2_github_projects\neural_style_transfer\script_helper\MichaelsScript\images\style\resized\starry_night.jpg" "E:\2_github_projects\neural_style_transfer\script_helper\MichaelsScript\images\results\wechat_laoma_Masked" --image_size 350 --content_weight 0.025 --style_weight 1.0 --total_variation_weight 8.5E-05 --style_scale 1 --num_iter 10 --rescale_image "False" --rescale_method "bicubic" --maintain_aspect_ratio "True" --content_layer "conv5_2" --init_image "content" --pool_type "max" --preserve_color "False" --min_improvement 0 --model "vgg16" --content_loss_type 0 --content_mask "E:\2_github_projects\neural_style_transfer\script_helper\MichaelsScript\images\content\masks\wechat_laoma.png"
Using TensorFlow backend.
2019-11-05 01:05:11.388712: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2019-11-05 01:05:13.013542: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-11-05 01:05:13.169021: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 745 major: 5 minor: 0 memoryClockRate(GHz): 1.0325
pciBusID: 0000:01:00.0
2019-11-05 01:05:13.174196: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-11-05 01:05:13.178218: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-05 01:05:13.180694: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-11-05 01:05:13.186239: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 745 major: 5 minor: 0 memoryClockRate(GHz): 1.0325
pciBusID: 0000:01:00.0
2019-11-05 01:05:13.191922: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-11-05 01:05:13.195897: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-05 01:05:14.312514: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-05 01:05:14.316991: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-11-05 01:05:14.319694: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-11-05 01:05:14.323843: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3046 MB memory) -> physical GPU (device: 0, name: GeForce GTX 745, pci bus id: 0000:01:00.0, compute capability: 5.0)
Model loaded.
Traceback (most recent call last):
  File "./Script/INetwork.py", line 504, in <module>
    sl1.append(style_loss(style_reference_features[j], combination_features, style_masks[j], shape))
  File "./Script/INetwork.py", line 418, in style_loss
    content_mask = K.variable(load_mask(content_mask_path, nb_channels))
  File "./Script/INetwork.py", line 269, in load_mask
    mask = imresize(mask, (width, height)).astype('float32')
  File "E:\2_github_projects\neural_style_transfer\script_helper\Script\utils.py", line 34, in imresize
    img = Image.fromarray(img, mode='RGB')
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\PIL\Image.py", line 2666, in fromarray
    return frombuffer(mode, size, obj, "raw", rawmode, 0, 1)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\PIL\Image.py", line 2609, in frombuffer
    return frombytes(mode, size, data, decoder_name, args)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\PIL\Image.py", line 2542, in frombytes
    im.frombytes(data, decoder_name, args)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\PIL\Image.py", line 829, in frombytes
    raise ValueError("not enough image data")
ValueError: not enough image data

If I remove the last option (--content_mask) from the command above, everything runs fine. Something is probably wrong with my mask image, but I've tried JPG, PNG, and BMP files (all created with Windows Paint or Paint 3D). Do you have any idea what the issue could be? Thanks in advance!

titu1994 commented 4 years ago

PIL is having issues loading your mask image. Try loading the image in a separate script or in the Python interpreter and see if it works. It is likely an issue with the mask.
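One way to run that check is a short standalone script along these lines. This is only a sketch: inspect_mask and the temporary demo file are hypothetical names made up for illustration, not part of the repository, and it assumes Pillow and NumPy are installed.

```python
import os
import tempfile

import numpy as np
from PIL import Image


def inspect_mask(path):
    """Open a mask file with PIL and report its mode, size, and grayscale values."""
    with Image.open(path) as img:
        img.load()  # force a full decode so a truncated or corrupt file fails here
        gray = np.asarray(img.convert("L"))
        return {
            "mode": img.mode,
            "size": img.size,            # PIL reports (width, height)
            "array_shape": gray.shape,   # NumPy reports (height, width)
            "unique_values": np.unique(gray).tolist(),
        }


# Stand-in mask so the sketch runs anywhere; point demo_path at the real mask instead.
demo = np.zeros((4, 6), dtype=np.uint8)
demo[:, 3:] = 255
demo_path = os.path.join(tempfile.mkdtemp(), "demo_mask.png")
Image.fromarray(demo).save(demo_path)

report = inspect_mask(demo_path)
print(report)
```

If the file opens and decodes here, the image itself is probably readable, and the problem is more likely in how the script converts the loaded array afterwards.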

mfxuus commented 4 years ago

Hmm, so it seems that in INetwork.py, line 268,

mask = imread(mask_path, mode="L") # Grayscale mask load

reads the image into an array, and then the next line,

mask = imresize(mask, (width, height)).astype('float32')

in turn goes to

if type(img) != Image:
    img = Image.fromarray(img, mode='RGB')

in utils.py. The error is raised when mode RGB is forced on the black-and-white image, which is a 2-D array with no RGB channel dimension. However, if I change this line to allow other modes, other errors appear later in the program. Hence my initial question: what kind of image should I pass in as a mask? If it helps, I can try to reproduce the "errors later in the program" and paste them here later today.
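The mismatch described above can be sketched as follows. A grayscale array holds one byte per pixel, while an RGB decode needs three, which is consistent with the "not enough image data" message. The helper array_to_pil is a hypothetical name used only for illustration (it is not the repository's imresize), and it sidesteps the problem by letting PIL infer the mode from the array shape instead of forcing RGB.

```python
import numpy as np
from PIL import Image


def array_to_pil(arr):
    """Convert a uint8 array to a PIL image, letting PIL infer the mode from the shape."""
    arr = np.asarray(arr, dtype=np.uint8)
    if arr.ndim == 2 or (arr.ndim == 3 and arr.shape[2] in (3, 4)):
        # (H, W) -> "L", (H, W, 3) -> "RGB", (H, W, 4) -> "RGBA"
        return Image.fromarray(arr)
    raise ValueError("unsupported array shape: %r" % (arr.shape,))


gray = np.zeros((4, 6), dtype=np.uint8)     # grayscale mask: no channel axis
rgb = np.zeros((4, 6, 3), dtype=np.uint8)   # colour image: three channels

# Bytes available in the grayscale array vs. bytes an RGB decode of the same size needs.
print(gray.nbytes, gray.shape[0] * gray.shape[1] * 3)

gray_img = array_to_pil(gray)
rgb_img = array_to_pil(rgb)
print(gray_img.mode, rgb_img.mode)
```

Whether the rest of the script then accepts a mode L image is a separate question that this sketch does not address.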

titu1994 commented 4 years ago

The mask image must be a binary image, with pixel values of either 0 or 1. It should preferably be saved as a PNG.

Could you also paste the errors that occur? I think I need to refactor my utils.py script to account for masking in mode L.
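For a mask drawn in a paint program, where pixels are 0 and 255 rather than 0 and 1, a minimal sketch of mapping it to the binary format described above could look like this. The function binarize_mask and the threshold of 127 are assumptions made for illustration; the thread does not establish whether the script performs a similar normalization itself after loading.

```python
import numpy as np


def binarize_mask(mask, threshold=127):
    """Map a grayscale mask (values 0-255) to a float32 array holding only 0.0 and 1.0."""
    mask = np.asarray(mask)
    return (mask > threshold).astype("float32")


# Small stand-in for a mask loaded in grayscale mode, including two anti-aliased pixels.
mask_0_255 = np.array([[0, 0, 255], [255, 30, 220]], dtype=np.uint8)
binary = binarize_mask(mask_0_255)
print(binary)
```

Note that a file saved with literal 0 and 1 pixel values looks almost entirely black in an image viewer, which makes it hard to check by eye.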

mfxuus commented 4 years ago

Sure. If I change utils.py at around line 33 to:

if type(img) != Image:
    try:
        img = Image.fromarray(img, mode='RGB')
    except:
        img = Image.fromarray(img)

I get the following:

(env) PS E:\2_github_projects\neural_style_transfer\script_helper> python ./Script/INetwork.py "E:\2_github_projects\neural_style_transfer\script_helper\MichaelsScript\images\content\resized\wechat_laoma.jpg" "E:\2_github_projects\neural_style_transfer\script_helper\MichaelsScript\images\style\resized\starry_night.jpg" "E:\2_github_projects\neural_style_transfer\script_helper\MichaelsScript\images\results\wechat_laoma_Masked" --image_size 350 --content_weight 0.025 --style_weight 1.0 --total_variation_weight 8.5E-05 --style_scale 1 --num_iter 10 --rescale_image "False" --rescale_method "bicubic" --maintain_aspect_ratio "True" --content_layer "conv5_2" --init_image "content" --pool_type "max" --preserve_color "False" --min_improvement 0 --model "vgg16" --content_loss_type 0 --content_mask "E:\2_github_projects\neural_style_transfer\script_helper\MichaelsScript\images\content\masks\wechat_laoma.png"
Using TensorFlow backend.
2019-11-05 15:38:05.764888: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cudart64_100.dll
2019-11-05 15:38:26.270354: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library nvcuda.dll
2019-11-05 15:38:26.527499: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 745 major: 5 minor: 0 memoryClockRate(GHz): 1.0325
pciBusID: 0000:01:00.0
2019-11-05 15:38:26.553513: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-11-05 15:38:26.619496: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-05 15:38:26.664146: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2
2019-11-05 15:38:26.692800: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce GTX 745 major: 5 minor: 0 memoryClockRate(GHz): 1.0325
pciBusID: 0000:01:00.0
2019-11-05 15:38:26.718565: I tensorflow/stream_executor/platform/default/dlopen_checker_stub.cc:25] GPU libraries are statically linked, skip dlopen check.
2019-11-05 15:38:26.738102: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-11-05 15:38:56.903685: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-11-05 15:38:56.910987: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0
2019-11-05 15:38:56.914011: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N
2019-11-05 15:38:56.960216: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 3046 MB memory) -> physical GPU (device: 0, name: GeForce GTX 745, pci bus id: 0000:01:00.0, compute capability: 5.0)
Model loaded.
Starting iteration 1 of 10
2019-11-05 15:39:04.425131: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library cublas64_100.dll
2019-11-05 15:39:06.650588: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Failed precondition: Error while reading resource variable _AnonymousVar50 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar50/class tensorflow::Var does not exist.
         [[{{node StopGradient_21/ReadVariableOp}}]]
         [[strided_slice_1/end/_23]]
2019-11-05 15:39:06.653494: W tensorflow/core/common_runtime/base_collective_executor.cc:216] BaseCollectiveExecutor::StartAbort Failed precondition: Error while reading resource variable _AnonymousVar50 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar50/class tensorflow::Var does not exist.
         [[{{node StopGradient_21/ReadVariableOp}}]]
Traceback (most recent call last):
  File "./Script/INetwork.py", line 618, in <module>
    x, min_val, info = fmin_l_bfgs_b(evaluator.loss, x.flatten(), fprime=evaluator.grads, maxfun=20)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\scipy\optimize\lbfgsb.py", line 199, in fmin_l_bfgs_b
    **opts)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\scipy\optimize\lbfgsb.py", line 335, in _minimize_lbfgsb
    f, g = func_and_grad(x)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\scipy\optimize\lbfgsb.py", line 285, in func_and_grad
    f = fun(x, *args)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\scipy\optimize\optimize.py", line 326, in function_wrapper
    return function(*(wrapper_args + args))
  File "./Script/INetwork.py", line 562, in loss
    loss_value, grad_values = eval_loss_and_grads(x)
  File "./Script/INetwork.py", line 540, in eval_loss_and_grads
    outs = f_outputs([x])
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\tensorflow_core\python\keras\backend.py", line 3740, in __call__
    outputs = self._graph_fn(*converted_inputs)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\tensorflow_core\python\eager\function.py", line 1081, in __call__
    return self._call_impl(args, kwargs)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\tensorflow_core\python\eager\function.py", line 1121, in _call_impl
    return self._call_flat(args, self.captured_inputs, cancellation_manager)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\tensorflow_core\python\eager\function.py", line 1224, in _call_flat
    ctx, args, cancellation_manager=cancellation_manager)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\tensorflow_core\python\eager\function.py", line 511, in call
    ctx=ctx)
  File "E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\tensorflow_core\python\eager\execute.py", line 67, in quick_execute
    six.raise_from(core._status_to_exception(e.code, message), None)
  File "<string>", line 3, in raise_from
tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
  (0) Failed precondition:  Error while reading resource variable _AnonymousVar50 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar50/class tensorflow::Var does not exist.
         [[node StopGradient_21/ReadVariableOp (defined at E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]]
         [[strided_slice_1/end/_23]]
  (1) Failed precondition:  Error while reading resource variable _AnonymousVar50 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar50/class tensorflow::Var does not exist.
         [[node StopGradient_21/ReadVariableOp (defined at E:\2_github_projects\neural_style_transfer\script_helper\env\lib\site-packages\tensorflow_core\python\framework\ops.py:1751) ]]
0 successful operations.
0 derived errors ignored. [Op:__inference_keras_scratch_graph_7361]

Function call stack:
keras_scratch_graph -> keras_scratch_graph

If I go into a Python shell and test a few things, here are some results (not sure if they are helpful):

Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 27 2018, 03:37:03) [MSC v.1900 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>
>>> import numpy as np
>>> import imageio
>>> from PIL import Image
>>> from skimage import color
>>> img = np.array(imageio.imread('wechat_laoma.png', pilmode="L"))
(mimicking imread)
>>> img
array([[255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       ...,
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255],
       [255, 255, 255, ..., 255, 255, 255]], dtype=uint8)

* There are some 0s as well, so the values are basically all 0s and 255s.
>>> img[200,100:150]
array([  0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0, 255, 255, 255, 255,
       255, 255, 255, 255, 255, 255, 255, 255, 255, 255, 255], dtype=uint8)
>>> if type(img) != Image:
...     try:
...         img = Image.fromarray(img, mode='RGB')
...     except:
...         img = Image.fromarray(img)
...
>>> img
<PIL.Image.Image image mode=L size=221x295 at 0x1558A897828>
>>>

thimabru1010 commented 4 years ago

Has anyone solved this?