crowsonkb / style_transfer

Data-parallel image stylization using Caffe.
MIT License
113 stars · 14 forks

Error when trying to use ResNet model #5

Closed · ProGamerGov closed this issue 7 years ago

ProGamerGov commented 7 years ago

The command used:

./style_transfer.py inputs/hoovertowernight.jpg inputs/starry_night.jpg --net-type resnet --model resnet_50_1by2_nsfw.caffemodel --tile-size 1920 -s 400 800 -i 150 -cw 0.1 -pw 5 --content-layers conv4_1 conv4_2 conv4_3 conv4_4 --style-layers conv1_1 pool1 pool2 pool3 pool4 2>&1 | tee ~/mylog.log

The terminal output:

Loading VGG_ILSVRC_19_layers.caffemodel.
[libprotobuf ERROR google/protobuf/text_format.cc:296] Error parsing text-format caffe.NetParameter: 2:1: Invalid control characters encountered in text.
[libprotobuf ERROR google/protobuf/text_format.cc:296] Error parsing text-format caffe.NetParameter: 2:29: Interpreting non ascii codepoint 162.
[libprotobuf ERROR google/protobuf/text_format.cc:296] Error parsing text-format caffe.NetParameter: 2:29: Message type "caffe.NetParameter" has no field named "ResNet_50_1by2_inplace_nsfw".
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1013 00:07:33.795737  1351 upgrade_proto.cpp:88] Check failed: ReadProtoFromTextFile(param_file, param) Failed to parse NetParameter file: resnet_50_1by2_nsfw.caffemodel
*** Check failure stack trace: ***

The model was from here: https://github.com/yahoo/open_nsfw. The train_val info can be found here: https://github.com/yahoo/open_nsfw/issues/8

I experienced this error on the style_transfer AMI.

crowsonkb commented 7 years ago

--model is for the .prototxt file and --weights is for the .caffemodel. Also, you probably want to run --list-layers, because the content and style layers in ResNets aren't named the same as in VGG nets.

Also, 5 is a really high value for -pw. Earlier today I changed how it worked to make it a consistent strength when the value of -pp changed. With the default -pp of 6, you should divide the old parameter value by 32 to get the same effect. I also changed the default -pw from 1 to 0.05 at the same time, which is a bit stronger.
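For concreteness, the conversion rule above is just a division by 32 (the parameter values are the ones from this thread):

```python
# Old-to-new -pw conversion described above: with the default -pp of 6,
# divide the old p-norm weight by 32 to get the same strength.
old_pw = 5
new_pw = old_pw / 32
print(new_pw)  # 0.15625
```

So -pw 5 on the old scale corresponds to roughly 0.16 on the new scale, still well above the new default of 0.05.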

ProGamerGov commented 7 years ago

> Also, 5 is a really high value for -pw. Earlier today I changed how it worked to make it a consistent strength when the value of -pp changed. With the default -pp of 6, you should divide the old parameter value by 32 to get the same effect. I also changed the default -pw from 1 to 0.05 at the same time, which is a bit stronger.

I was just copying your parameters from the Reddit thread, as I am not sure what the optimal parameters are.

crowsonkb commented 7 years ago

Normally I leave the p-norm weight at the default unless I'm doing something involving Deep Dream. It applies a penalty to pixel values that are too far away from the ImageNet mean value, encouraging them to roughly stay in the interval 0-255. It's not a hard constraint - it tends to push deep blacks and bright whites toward the mean a bit, and pixel values are still allowed outside 0-255 - but it plays much nicer with Deep Dream than a hard constraint, moderating its often-garish contrast and allowing it to converge to a solution rather than diverging. Generally the higher the Deep Dream weight, the more you want to raise the p-norm weight from the default.
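The penalty described above can be sketched roughly like this (an illustrative sketch only, not the actual style_transfer implementation; the normalization factor is an assumption):

```python
import numpy as np

# Illustrative p-norm penalty: penalize pixel values far from the
# ImageNet per-channel mean so the optimizer softly keeps the image
# near the 0-255 range without hard-clamping it.
IMAGENET_MEAN = np.float32([123.68, 116.779, 103.939])  # assumed RGB means

def p_norm_penalty(img, p=6, weight=0.05):
    """img: float array of shape (H, W, 3), values roughly 0-255."""
    diff = (img - IMAGENET_MEAN) / 127.5  # scaling factor is assumed
    return weight * float(np.mean(np.abs(diff) ** p))
```

With a high exponent like the default p of 6, values near the mean contribute almost nothing while extreme values are penalized sharply, which matches the soft-constraint behavior described above.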

ProGamerGov commented 7 years ago

How do I properly use --list-layers?


Edit: I really need some more sleep; these mistakes are embarrassing. I messed up the giant string of command parameters while editing it, and was pasting it in all mangled. I'll have to experiment with the model tomorrow when I have more time available.

crowsonkb commented 7 years ago

I think for this model you have to use the conv_ layers instead of scale_ or bn_. If you look at the output of --list-layers (you can just add that flag to style_transfer.py to print the layers then exit, it's like --help), some of the layers will be 1D (like (1000,)) and some will be 3D (like (512, 28, 28)). Only the 3D layers can be used. This should really be documented and also have an informative error message. The 'out of range' error is due to it trying to read the size of the second dimension when it doesn't exist for that layer.
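The filtering rule can be sketched like so (assumed logic, not the actual style_transfer code; the layer names are from the --list-layers output posted in this thread):

```python
# Only blobs with a 3D (channels, height, width) shape can serve as
# content or style layers. 1D blobs like (1000,) trigger the
# 'out of range' error when the code reads a second dimension that
# doesn't exist.
def usable_layers(shapes):
    """shapes: dict mapping layer name -> blob shape tuple."""
    return [name for name, shape in shapes.items() if len(shape) == 3]

shapes = {'conv_1': (64, 112, 112), 'pool': (1024, 1, 1), 'fc_nsfw': (2,)}
print(usable_layers(shapes))  # ['conv_1', 'pool']
```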

ProGamerGov commented 7 years ago

> Only the 3D layers can be used. This should really be documented and also have an informative error message. The 'out of range' error is due to it trying to read the size of the second dimension when it doesn't exist for that layer.

Ok, I'll try that out tomorrow when I get the chance! I didn't realize I was dealing with a mix of 3D and 1D layers, as I haven't encountered that distinction before (is it a ResNet thing?). Thanks for the help!

crowsonkb commented 7 years ago

Apparently this model has a lot of layers without separately allocated memory, like the scale and bn layers, which perform their operations in-place. These layers are present in deploy.prototxt but won't even show up in --list-layers, since they can't be used at all.

Most models have 1D layers at the end, for instance VGG has fc6 and fc7 IIRC. Anything past pool5 on VGG nets is 1D.

Actually, I can see another problem with this network. style_transfer needs to know the last usable layer for a model and selects it based on the model type. It expects ResNets to have their last usable layer named pool5, but in this model it is simply named pool. This will cause another crash. I'll have to add a flag to specify the last layer name.

crowsonkb commented 7 years ago

I fixed the other problem. style_transfer will now autodetect the last layer. I removed the --net-type option since it was only used for determining the last layer.
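The autodetection idea can be sketched as follows (a hypothetical helper under the same 3D-layer assumption as above; the real implementation may differ):

```python
# Walk the blobs in network order and keep the name of the last 3D one.
# For the nsfw model's layer list this picks 'pool', even though it
# isn't named 'pool5' as a stock ResNet's last pooling layer would be.
def last_usable_layer(ordered_shapes):
    last = None
    for name, shape in ordered_shapes:
        if len(shape) == 3:
            last = name
    return last

layers = [('conv_1', (64, 112, 112)), ('pool', (1024, 1, 1)),
          ('fc_nsfw', (2,)), ('prob', (2,))]
print(last_usable_layer(layers))  # pool
```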

ProGamerGov commented 7 years ago

@crowsonkb Would it be possible to allow Deep Dream to work on the GPU as well as the CPU?


The --list-layers output:

Loading resnet_50_1by2_nsfw.caffemodel.
Layers:
                   conv_1 (64, 112, 112)
                    pool1 (64, 56, 56)
conv_stage0_block0_proj_shortcut (128, 56, 56)
conv_stage0_block0_branch2a (32, 56, 56)
conv_stage0_block0_branch2b (32, 56, 56)
conv_stage0_block0_branch2c (128, 56, 56)
    eltwise_stage0_block0 (128, 56, 56)
conv_stage0_block1_branch2a (32, 56, 56)
conv_stage0_block1_branch2b (32, 56, 56)
conv_stage0_block1_branch2c (128, 56, 56)
    eltwise_stage0_block1 (128, 56, 56)
conv_stage0_block2_branch2a (32, 56, 56)
conv_stage0_block2_branch2b (32, 56, 56)
conv_stage0_block2_branch2c (128, 56, 56)
    eltwise_stage0_block2 (128, 56, 56)
conv_stage1_block0_proj_shortcut (256, 28, 28)
conv_stage1_block0_branch2a (64, 28, 28)
conv_stage1_block0_branch2b (64, 28, 28)
conv_stage1_block0_branch2c (256, 28, 28)
    eltwise_stage1_block0 (256, 28, 28)
conv_stage1_block1_branch2a (64, 28, 28)
conv_stage1_block1_branch2b (64, 28, 28)
conv_stage1_block1_branch2c (256, 28, 28)
    eltwise_stage1_block1 (256, 28, 28)
conv_stage1_block2_branch2a (64, 28, 28)
conv_stage1_block2_branch2b (64, 28, 28)
conv_stage1_block2_branch2c (256, 28, 28)
    eltwise_stage1_block2 (256, 28, 28)
conv_stage1_block3_branch2a (64, 28, 28)
conv_stage1_block3_branch2b (64, 28, 28)
conv_stage1_block3_branch2c (256, 28, 28)
    eltwise_stage1_block3 (256, 28, 28)
conv_stage2_block0_proj_shortcut (512, 14, 14)
conv_stage2_block0_branch2a (128, 14, 14)
conv_stage2_block0_branch2b (128, 14, 14)
conv_stage2_block0_branch2c (512, 14, 14)
    eltwise_stage2_block0 (512, 14, 14)
conv_stage2_block1_branch2a (128, 14, 14)
conv_stage2_block1_branch2b (128, 14, 14)
conv_stage2_block1_branch2c (512, 14, 14)
    eltwise_stage2_block1 (512, 14, 14)
conv_stage2_block2_branch2a (128, 14, 14)
conv_stage2_block2_branch2b (128, 14, 14)
conv_stage2_block2_branch2c (512, 14, 14)
    eltwise_stage2_block2 (512, 14, 14)
conv_stage2_block3_branch2a (128, 14, 14)
conv_stage2_block3_branch2b (128, 14, 14)
conv_stage2_block3_branch2c (512, 14, 14)
    eltwise_stage2_block3 (512, 14, 14)
conv_stage2_block4_branch2a (128, 14, 14)
conv_stage2_block4_branch2b (128, 14, 14)
conv_stage2_block4_branch2c (512, 14, 14)
    eltwise_stage2_block4 (512, 14, 14)
conv_stage2_block5_branch2a (128, 14, 14)
conv_stage2_block5_branch2b (128, 14, 14)
conv_stage2_block5_branch2c (512, 14, 14)
    eltwise_stage2_block5 (512, 14, 14)
conv_stage3_block0_proj_shortcut (1024, 7, 7)
conv_stage3_block0_branch2a (256, 7, 7)
conv_stage3_block0_branch2b (256, 7, 7)
conv_stage3_block0_branch2c (1024, 7, 7)
    eltwise_stage3_block0 (1024, 7, 7)
conv_stage3_block1_branch2a (256, 7, 7)
conv_stage3_block1_branch2b (256, 7, 7)
conv_stage3_block1_branch2c (1024, 7, 7)
    eltwise_stage3_block1 (1024, 7, 7)
conv_stage3_block2_branch2a (256, 7, 7)
conv_stage3_block2_branch2b (256, 7, 7)
conv_stage3_block2_branch2c (1024, 7, 7)
    eltwise_stage3_block2 (1024, 7, 7)
                     pool (1024, 1, 1)
                  fc_nsfw (2,)
                     prob (2,)

I tested various layers from the list above whose shapes have three numbers (the 3D layers) with the following command. All the layers I tried resulted in the same error.


./style_transfer.py inputs/creek.jpg inputs/shipwreck.jpg --no-browser --model deploy.prototxt --weights resnet_50_1by2_nsfw.caffemodel --content-layers conv_stage1_block2_branch2b --style-layers conv_stage1_block2_branch2b --tile-size 2048 -s 200 300 400 600 800 1200 --hidpi -i 200 100 -cw 0.5 -tw 0.5 -tp 1.5 --dd-layers conv_stage1_block2_branch2b -dw 1 2>&1 | tee ~/mylog.log

Loading resnet_50_1by2_nsfw.caffemodel.

Watch the progress at: http://127.0.0.1:8000/

Starting 1 worker process(es).
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1013 22:42:31.182435  1539 blob.cpp:32] Check failed: shape[i] >= 0 (-1 vs. 0) 
*** Check failure stack trace: ***
ProGamerGov commented 7 years ago

In this Stack Overflow question, the same error message appears to stem from an issue with kernel size: https://stackoverflow.com/questions/37494722/caffe-lenet-error-check-failed-shapei-0-1-vs-0

ProGamerGov commented 7 years ago

Not sure if this is related, but it appears that the VGG_SOD_Finetune model does not work with its deploy.prototxt, yet works if you omit --model VGG16_SOD_finetune_deploy.prototxt from the list of parameters. Both the default model and VGG_SOD_Finetune are VGG16 models.

./style_transfer.py inputs/creek.jpg inputs/shipwreck.jpg --no-browser --weights VGG16_SOD_finetune.caffemodel --content-layers conv4_1 conv4_2 conv4_3 conv4_4 --style-layers conv1_1 pool1 pool2 pool3 pool4 --dd-layers conv4_1 conv4_2 --tile-size 2048 -s 200 300 400 600 800 1200 --hidpi -i 200 100 -cw 0.5 -tw 0.5 -tp 1.5 -dw 1 2>&1 | tee ~/mylog.log


Loading VGG16_SOD_finetune.caffemodel.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:569] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 538683157

Watch the progress at: http://127.0.0.1:8000/

Starting 1 worker process(es).
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:569] Reading dangerously large protocol message.  If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons.  To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:81] The total number of bytes read was 538683157
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1013 23:14:46.062955  1899 inner_product_layer.cpp:64] Check failed: K_ == new_K (25088 vs. 17920) Input size incompatible with inner product parameters.
*** Check failure stack trace: ***
crowsonkb commented 7 years ago

Does the ImageNet-trained ResNet in ../deep_dream/resnet work for you? And that's really weird: the default model is VGG19. The VGG16 .caffemodel and .prototxt are included on my AMI though, so you can use the VGG16 prototxt with different VGG16 caffemodels if you have to. I wonder if that'll work for ResNets with renamed layers too.

ProGamerGov commented 7 years ago

@crowsonkb

> Does the ImageNet-trained ResNet in ../deep_dream/resnet work for you?

Yes, it works using:

./style_transfer.py --no-browser --model ../deep_dream/resnet/ResNet-50-deploy.prototxt --weights ../deep_dream/resnet/ResNet-50-model.caffemodel --content-layers res4b --style-layers res2b res3b res4b --tile-size 2048 inputs/creek.jpg inputs/shipwreck.jpg -s 200 300 400 600 800 1200 --hidpi -i 200 100 -cw 0.5 -tw 0.5 -tp 1.5 --dd-layers res4b -dw 1

ProGamerGov commented 7 years ago

This command works with the Yahoo NSFW model: ./style_transfer.py --no-browser --model ../deep_dream/resnet/ResNet-50-deploy.prototxt --weights resnet_50_1by2_nsfw.caffemodel --content-layers res4b --style-layers res2b res3b res4b --tile-size 2048 inputs/creek.jpg inputs/shipwreck.jpg -s 200 300 400 600 800 1200 --hidpi -i 200 100 -cw 0.5 -tw 0.5 -tp 1.5 --dd-layers res4b -dw 1

I am wondering if there is an issue with the deploy.prototxt files from both of these models, or with your code?

Edit: The output is completely grey, and the output file size was 23 KB.

ProGamerGov commented 7 years ago

Trying to use another ResNet_50_1by2 prototxt results in the following error:

./style_transfer.py --list-layers --no-browser --model train.prototxt --weights resnet_50_1by2_nsfw.caffemodel --content-layers pool --style-layers pool --tile-size 2048 inputs/creek.jpg inputs/shipwreck.jpg -s 200 300 400 600 800 1200 --hidpi -i 200 100 -cw 0.5 -tw 0.5 -tp 1.5 --dd-layers pool -dw 1 2>&1 | tee ~/mylog.log

Loading resnet_50_1by2_nsfw.caffemodel.
Process ForkProcess-1:
Traceback (most recent call last):
  File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
    self.run()
  File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "./style_transfer.py", line 910, in init_model
    shapes[layer] = model.data[layer].shape
  File "./style_transfer.py", line 141, in __getitem__
    return getattr(self.net.blobs[key], self.attr)[0]
IndexError: too many indices for array

I also fixed my train_val_nsfw prototxt file: https://gist.github.com/ProGamerGov/2dedda3ad769fbe322e9b5db63db7763

ProGamerGov commented 7 years ago

Just tried the MIT Places365 Hybrid model. It didn't work with its prototxt, but it did work with the default VGG_ILSVRC_16_layers_deploy.prototxt you included with style_transfer.

./style_transfer.py --no-browser --model VGG_ILSVRC_16_layers_deploy.prototxt --weights vgg16_hybrid1365.caffemodel --content-layers pool5 --style-layers pool5 --tile-size 2048 inputs/creek.jpg inputs/shipwreck.jpg -s 200 300 400 600 800 1200 --hidpi -i 200 100 -cw 0.5 -tw 0.5 -tp 1.5 --dd-layers pool5 -dw 1 2>&1 | tee ~/mylog.log

ProGamerGov commented 7 years ago

@crowsonkb It appears there may be an issue with non-default prototxt files.

crowsonkb commented 7 years ago

I remember now that I had to modify the prototxts to get them to work... The ones in deep_dream and style_transfer are all my modified versions. You have to add the force_backward: true line near the top. I also removed fully-connected layers and loss layers from the end, so that the last layer would be a valid one for style_transfer.
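A sketch of the kind of edits described above, in the standard Caffe prototxt text format (the net name and input shape here are placeholders, not taken from any of the actual files):

```
name: "SomeNet"
force_backward: true  # added near the top so gradients reach the input image
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } }
}
# ... convolution/pooling layers left unchanged ...
# InnerProduct (fully-connected) and loss layers at the end deleted,
# so the last remaining layer is one style_transfer can use.
```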

ProGamerGov commented 7 years ago

Adding force_backward: true results in

Loading resnet_50_1by2_nsfw.caffemodel.

Watch the progress at: http://127.0.0.1:8000/

Starting 1 worker process(es).
WARNING: Logging before InitGoogleLogging() is written to STDERR
F1014 06:31:11.061403 28320 memory_data_layer.cpp:110] Check failed: data_ MemoryDataLayer needs to be initialized by calling Reset
*** Check failure stack trace: ***

And the deploy.prototxt already has force_backward: true. Are there any other edits you made to the prototxt files?

ProGamerGov commented 7 years ago

In your Deep Dream project, the default deploy.prototxt files are for ResNet-50, ResNet-101, and ResNet-152. I was curious as to whether the 50-1by2 is a different ResNet architecture.

The model in question that I am attempting to use is a ResNet-50-1by2.


Resnet_50 is a default train_val.prototxt for ResNet-50

ResNet-50 is your modified ResNet-50 deploy.prototxt

ResNet_50_1by2_nsfw is the deploy.prototxt for the Open_NSFW model.


There are 2,319 removals and 3,500 additions between your ResNet-50 deploy.prototxt and the unmodified Resnet_50 train.prototxt.


Comparing the deploy.prototxt for ResNet-50 (your modified version) with the deploy.prototxt for the ResNet-50-1by2 model I am attempting to use, using a diff checker tool, shows the code is vastly different: 3,488 removals and 2,319 additions between them.


The difference between ResNet_50_1by2_nsfw and Resnet_50_1by2 appears to be the use_global_stats: true value. Comparing the two files shows 91 removals and 93 additions between them.

The very top portion also differs between them:

```
layer {
  name: "data"
  type: "MemoryData"
  top: "data"
  top: "label"
  transform_param {
    crop_size: 224
  }
  memory_data_param {
    batch_size: 1
    channels: 3
    height: 224
    width: 224
  }
}
```

versus:

```
name: "ResNet_50_1by2_nsfw"
layer {
  name: "data"
  type: "Input"
  top: "data"
  input_param { shape: { dim: 1 dim: 3 dim: 224 dim: 224 } }
}
```

The differences between Resnet_50 and Resnet_50_1by2 are as follows. Resnet_50 has:

```
mean_value: 104
mean_value: 117
mean_value: 123
```

while Resnet_50_1by2 does not. The num_output: values make up the other difference. In total there are 56 removals and 53 additions between the two.


ResNet_50_1by2_nsfw compared with Resnet_50 has 143 removals and 148 additions.


It seems your modified prototxt files are vastly different from the unmodified prototxt files that Open_NSFW and other ResNet models are derived from. These differences could be at the heart of the issues I am experiencing. The Open_NSFW model also has a highly non-standard layer called "Scale", which has a learnable bias parameter.

crowsonkb commented 7 years ago

Try using this prototxt for ResNet_50_1by2_nsfw: https://gist.github.com/crowsonkb/aa45e710f9169b23947cac9fe493c4e3

I had to set force_backward: true on line 2 and change the final pooling layer from 7x7 average pooling to global pooling (for compatibility with different image sizes). I modified it from the nsfw model's deploy.prototxt (deploy will generally be closest to what style_transfer needs).
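The global-pooling change might look something like this in prototxt form (the bottom blob name is taken from the --list-layers output earlier in this thread; this is a sketch, not a verbatim excerpt of the gist):

```
layer {
  name: "pool"
  type: "Pooling"
  bottom: "eltwise_stage3_block2"
  top: "pool"
  pooling_param {
    pool: AVE
    global_pooling: true  # replaces the fixed kernel_size: 7
  }
}
```

With global_pooling: true, Caffe averages over whatever spatial size reaches the layer, instead of assuming the 7x7 map produced by a 224x224 input.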

crowsonkb commented 7 years ago

Re: the scale layer, from http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1BatchNormLayer.html:

> Note that the original paper also included a per-channel learned bias and scaling factor. To implement this in Caffe, define a ScaleLayer configured with bias_term: true after each BatchNormLayer to handle both the bias and scaling factor.
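In prototxt form, the pattern the Caffe docs describe looks roughly like this (the layer names here are hypothetical, not taken from the nsfw model):

```
layer {
  name: "bn_1"
  type: "BatchNorm"
  bottom: "conv_1"
  top: "conv_1"  # in-place, so it holds no separate blob
  batch_norm_param { use_global_stats: true }
}
layer {
  name: "scale_1"
  type: "Scale"
  bottom: "conv_1"
  top: "conv_1"
  scale_param { bias_term: true }  # per-channel learned scale and bias
}
```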

ProGamerGov commented 7 years ago

@crowsonkb The modified deploy.prototxt works!

Here are the eltwise layers (Potentially NSFW Images): https://imgur.com/a/NtFyx

The Pool and some Conv layers (Potentially NSFW Images): https://imgur.com/a/h3Lgn

The results from the uppermost layers seem to have a grassy, hair-like fibrous look. Layers from the lower middle and upwards seem to be where you find objects and structures. It's also interesting to note that the model is only 22.7 MB in size, unlike many other models used for Deep Dream.