ProGamerGov opened this issue 6 years ago
Using variations (only changing the `-image_size` value) of these two commands, I have noticed that VGG models with their FC layers removed use less memory:

`th neural_style.lua -backend cudnn -model_file models/VGG_ILSVRC_16_layers.caffemodel -proto_file models/VGG_ILSVRC_16_layers_deploy.prototxt`

`th neural_style.lua -backend cudnn -model_file models/vgg16.caffemodel -proto_file models/vgg16.prototxt`

`-image_size 512`:
- With FC layers: 1684MiB
- No FC layers: 1423MiB
- Difference: 261MiB

`-image_size 1024`:

- With FC layers: 4687MiB
- No FC layers: 4512MiB
- Difference: 175MiB

`-image_size 1536`:

- With FC layers: 9937MiB
- No FC layers: 9702MiB
- Difference: 235MiB
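A quick check on those numbers (a minimal sketch, using only the figures reported above) shows the FC-layer savings hover in the 175–261MiB range rather than growing with image size, which suggests they come mostly from the fixed-size FC weights rather than from resolution-dependent activations:

```python
# Reported GPU usage (MiB) with and without FC layers, per -image_size.
usage = {
    512:  (1684, 1423),
    1024: (4687, 4512),
    1536: (9937, 9702),
}

# Savings = usage with FC layers minus usage without them.
savings = {size: with_fc - no_fc for size, (with_fc, no_fc) in usage.items()}

for size, saved in savings.items():
    print(f"-image_size {size}: {saved} MiB saved")
```
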
The VGG models with their FC layers removed come from here:
https://style-transfer.s3-us-west-2.amazonaws.com/vgg16.caffemodel
https://style-transfer.s3-us-west-2.amazonaws.com/vgg19.caffemodel
The prototxt files that I used with these models are from: https://github.com/crowsonkb/style_transfer
The VGG-16 model without the FC layers is 56.1MB in size, while the VGG-16 model with its FC layers is 528MB in size.

Testing the idea with the VGG-16 SOD Fine-tune model:

The full model is 514MB in size. With the FC layers stripped from the model, it's only 56.1MB in size. With both the FC layers and all the ReLU/Conv layers down to relu5_1 stripped from the model, it is 38.1MB in size. Stripping all the layers down to relu4_2 results in a model size of 20.1MB.

VGG layers are hierarchical, so in theory, removing layers above the ones that Neural-Style is using shouldn't negatively affect things. It also means that you can only strip layers down to the highest one that you are using.
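That constraint can be sketched as truncating a sequential layer list at the deepest layer you still use (the layer names below are illustrative, not read from a real prototxt):

```python
def truncate_after(layers, deepest_used):
    """Keep layers up to and including `deepest_used`; drop everything above it.

    `layers` is the network's layer names in forward order.
    Raises ValueError if the requested layer isn't present.
    """
    idx = layers.index(deepest_used)
    return layers[:idx + 1]

# Simplified VGG-16 tail, in forward order (illustrative).
vgg16_tail = ["conv4_2", "relu4_2", "conv4_3", "relu4_3", "pool4",
              "conv5_1", "relu5_1", "fc6", "relu6", "fc7", "relu7", "fc8"]

# relu5_1 itself must stay because it's the highest layer still being read;
# everything above it (fc6..fc8) can go.
print(truncate_after(vgg16_tail, "relu5_1"))
```
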
Control Tests:
Stripped Model Tests:
The GPU usage for the above examples with an `-image_size` of 1024 is shown below:

Model | Model Size (MB) | __Total Usage__ | LuaJIT Usage
---|---|---|---
Control | 514MB | 3922MiB / 11439MiB | 3911MiB
Control Without relu5_1 | 514MB | 3897MiB / 11439MiB | 3886MiB
Shaved Off FC Layers | 56.1MB | 3646MiB / 11439MiB | 3635MiB
Shaved to relu5_1 | 38.1MB | 3610MiB / 11439MiB | 3599MiB
Shaved to relu4_2 | 20.1MB | 3516MiB / 11439MiB | 3505MiB
Comparing total usage:

- Compared to the full model, shaving off the FC layers saves 276MiB of GPU memory.
- Compared to the full model, shaving off all the layers down to relu5_1 saves 312MiB of GPU memory.
- Compared to the full model test without relu5_1, shaving off all the layers down to relu4_2 saves 381MiB of GPU memory.
- Simply omitting the relu5_1 style layer only saves 25MiB of GPU memory.
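As a sanity check on these differences (total-usage figures copied from the table above; note that the 381MiB figure corresponds to shaving down to relu4_2, measured against the control without relu5_1):

```python
# Total GPU usage (MiB) from the table above.
total = {
    "control": 3922,
    "control_no_relu5_1": 3897,
    "no_fc": 3646,
    "to_relu5_1": 3610,
    "to_relu4_2": 3516,
}

print(total["control"] - total["no_fc"])                   # FC layers shaved off
print(total["control"] - total["to_relu5_1"])              # shaved down to relu5_1
print(total["control_no_relu5_1"] - total["to_relu4_2"])   # shaved down to relu4_2
print(total["control"] - total["control_no_relu5_1"])      # only omitting relu5_1
```
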
So it seems that shaving layers off a VGG model saves more GPU memory than simply changing the `-style_layers` or `-content_layers` values.

The impact on style transfer quality from stripping/removing layers seems to resemble messing with the layer activation strengths, like in my experiments here, or simply changing the `-seed` value. But if the layers above the ones that you are using affect the style transfer outputs, then removing them could negatively impact quality. Further testing would probably show whether or not this is the case.
To "shave/strip" the model, I ran this command in Caffe:

`./build/tools/caffe train -solver models/vgg16_finetune/solver.prototxt -weights models/vgg16_finetune/VGG16_SOD_finetune.caffemodel -gpu 0`

In the solver.prototxt, I made sure the learning rate was set to zero, and only one iteration was used before saving the model:
base_lr: 0.000000
max_iter: 1
I then simply deleted the lines for the layers I wanted to remove in the train_val.prototxt file, as per this suggestion: https://github.com/BVLC/caffe/issues/186#issuecomment-37141696
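Deleting those lines by hand can also be scripted. Below is a rough sketch that drops top-level layer blocks from a prototxt by name using brace counting; it is not a real protobuf parser, and the demo prototxt text is illustrative, not taken from an actual VGG deploy file:

```python
def strip_layers(prototxt_text, remove):
    """Drop top-level `layer { ... }` / `layers { ... }` blocks whose
    name is in `remove`. Brace-counting sketch, not a protobuf parser."""
    lines = prototxt_text.splitlines()
    out, i = [], 0
    while i < len(lines):
        if lines[i].strip().startswith(("layer {", "layers {")):
            # Collect the whole block by counting braces.
            depth, block = 0, []
            while i < len(lines):
                depth += lines[i].count("{") - lines[i].count("}")
                block.append(lines[i])
                i += 1
                if depth == 0:
                    break
            name = next((l.split('"')[1] for l in block
                         if "name:" in l and '"' in l), None)
            if name not in remove:
                out.extend(block)
        else:
            out.append(lines[i])
            i += 1
    return "\n".join(out)

# Illustrative miniature prototxt.
demo = """name: "VGG16"
layers {
  name: "fc6"
  type: INNER_PRODUCT
}
layers {
  name: "conv1_1"
  type: CONVOLUTION
}"""

print(strip_layers(demo, {"fc6", "fc7", "fc8"}))
```
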
This way of stripping layers should be possible for NIN models as well, but I have no idea how much it would improve the performance of the model.
NIN-Imagenet Model Tests (`nin_imagenet_conv.caffemodel`):

`-content_layers relu1,relu2,relu3,relu5,relu6,relu7 -style_layers relu1,relu2,relu3,relu5,relu6,relu7`

The GPU usage for the above examples with an `-image_size` of 1024 is shown below:

Model | Model Size (MB) | __Total Usage__ | LuaJIT Usage
---|---|---|---
Control | 28.9MB | 1993MiB / 11439MiB | 1982MiB
Shaved to relu7 | 6.42MB | 1974MiB / 11439MiB | 1963MiB
While the NIN model lost 22.48MB in size (just under 78% of the model), only 19MiB of GPU memory was saved.
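Checking those NIN figures (model sizes in MB and GPU usage in MiB, all taken from the numbers above):

```python
control_mb, shaved_mb = 28.9, 6.42      # model file sizes (MB)
control_mib, shaved_mib = 1993, 1974    # total GPU usage (MiB)

size_lost = control_mb - shaved_mb      # weight bytes removed from the model
pct = size_lost / control_mb * 100      # fraction of the model removed
mem_saved = control_mib - shaved_mib    # GPU memory actually saved

print(f"{size_lost:.2f} MB removed ({pct:.1f}% of the model), {mem_saved} MiB saved")
```
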
Compared to the VGG models, the output didn't change as drastically when layers were removed.
When using `-image_size 2432`, `-image_size 2560`, and `-image_size 2816`, with `-backend cudnn`, `-optimizer adam`, and `-style_scale 0.5`, the loss values seem to remain the same in every iteration. Lower image sizes don't seem to suffer from this issue.

I also used `-gpu 0,1,2,3,4,5,6,7 -multigpu_strategy 2,3,4,6,8,11,12`, which is the most efficient set of parameters for multiple GPUs that I have come across thus far.

What is happening here, and is it possible to fix this?
Here's the `nvidia-smi` output:

Edit: This also happened with a second content/style image combo at the same image size values.