nagadomi / waifu2x

Image Super-Resolution for Anime-Style Art
http://waifu2x.udp.jp/
MIT License

Training model: when I use the "-loss aux_lbp" parameter, it fails with "lib/AuxiliaryLossCriterion.lua:73: 'for' limit must be a number" #357

Closed. CarbonPool closed this issue 4 years ago.

CarbonPool commented 4 years ago

This is the command I am running:

th train.lua -model_dir models/my_model -method noise -noise_level 3 -resume models/my_model/noise3_model.t7 -learning_rate "7.4343563437384e-07" -downsampling_filters "Box,Box,Box,Box,Sinc,Sinc,Sinc,Sinc,Catrom" -test images/test.tif -backend cudnn -thread 16 -style art -nr_rate 1 -crop_size 66 -validation_crops 64 -patches 16 -batch_size 8 -epoch 20 -max_size 512 -loss aux_lbp -update_criterion loss -oracle_rate 0.1

nagadomi commented 4 years ago

AuxiliaryLoss is only supported by the cunet/upcunet models.
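
For illustration, here is a minimal sketch of why the error in the title appears with other models. This is not the actual code at lib/AuxiliaryLossCriterion.lua:73; it only reproduces the failure mode under the assumption that the criterion loops over a table of auxiliary outputs that only cunet/upcunet return, so with any other model the loop bound comes back nil:

```lua
-- Hypothetical sketch, not the code at lib/AuxiliaryLossCriterion.lua:73:
-- the criterion iterates over the table of auxiliary outputs that
-- cunet/upcunet return; with any other model that table is nil, so the
-- numeric 'for' loop gets a nil limit and raises the error from the title.
local aux_outputs = nil -- what a single-output model leaves the criterion with
local ok, err = pcall(function()
   for i = 1, aux_outputs and #aux_outputs do end
end)
print(ok, err) -- false   ...: 'for' limit must be a number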

CarbonPool commented 4 years ago

I tried adding that parameter, but a serious error was thrown:

th train.lua -model cunet -model_dir models/my_cunetmodel -method noise -noise_level 3 -downsampling_filters "Box,Box,Box,Box,Sinc,Sinc,Sinc,Sinc,Catrom" -test images/test.tif -backend cudnn -style art -nr_rate 0.85 -loss aux_lbp -crop_size 66 -validation_crops 64 -patches 16 -batch_size 8 -epoch 20 -max_size 512 -update_criterion loss -oracle_rate 0.1

1

learning rate: 0.00025

resampling

[======================================== 2323/2323 ==================================>] Tot: 1m20s | Step: 35ms

update

/root/torch/install/bin/luajit: /root/torch/install/share/lua/5.1/nn/Container.lua:67:
In 2 module of nn.Sequential:
In 1 module of nn.ConcatTable:
In 2 module of nn.Sequential:
In 1 module of nn.Sequential:
In 1 module of nn.ConcatTable:
In 3 module of nn.Sequential:
In 2 module of nn.Sequential:
In 2 module of nn.Sequential:
/root/torch/install/share/lua/5.1/nn/CAddTable.lua:16: bad argument #2 to 'add' (sizes do not match at /root/torch/extra/cutorch/lib/THC/generated/../generic/THCTensorMathPointwise.cu:217)
stack traceback:
  [C]: in function 'add'
  /root/torch/install/share/lua/5.1/nn/CAddTable.lua:16: in function </root/torch/install/share/lua/5.1/nn/CAddTable.lua:9>
  [C]: in function 'xpcall'
  /root/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
  /root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function </root/torch/install/share/lua/5.1/nn/Sequential.lua:41>
  [C]: in function 'xpcall'
  /root/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
  /root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function </root/torch/install/share/lua/5.1/nn/Sequential.lua:41>
  [C]: in function 'xpcall'
  /root/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
  ...
  /root/torch/install/share/lua/5.1/nn/Container.lua:63: in function 'rethrowErrors'
  /root/torch/install/share/lua/5.1/nn/Sequential.lua:44: in function 'forward'
  lib/minibatch_adam.lua:45: in function 'opfunc'
  /root/torch/install/share/lua/5.1/optim/adam.lua:37: in function 'adam'
  lib/minibatch_adam.lua:61: in function 'minibatch_adam'
  train.lua:640: in function 'train'
  train.lua:718: in main chunk
  [C]: in function 'dofile'
  /root/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
  [C]: at 0x560fb8b92570

nagadomi commented 4 years ago

For the cunet/upcunet models, -crop_size must be a multiple of 4; 66 is not.
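
As a quick sanity check, here is a minimal sketch of the constraint. The modulo test simply restates the rule above; the comment about downsampling is an assumption about the reason, not taken from train.lua:

```lua
-- Minimal sketch of the constraint above, not the validation code in
-- train.lua. A plausible reason (an assumption, not from the source) is
-- that the U-Net halves the feature maps twice, so crops must divide by 4.
local function crop_size_ok(crop_size)
   return crop_size % 4 == 0
end
print(crop_size_ok(66)) -- false: 66 = 4 * 16 + 2
print(crop_size_ok(64)) -- true
print(crop_size_ok(88)) -- true (the value used in the next command)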

CarbonPool commented 4 years ago

I have run into some difficulties, and I don't know whether my training method is appropriate. Because I don't have a high-end compute GPU, I can only train about 10 epochs per day. When a session ends, I note down the learning rate from the last epoch, and some days later I resume with this command:

th train.lua -model cunet -model_dir models/my_cunetmodel -method noise -noise_level 3 -resume models/my_cunetmodel/noise3_model.t7 -learning_rate ${last_learning_rate} -downsampling_filters "Box,Box,Box,Box,Sinc,Sinc,Sinc,Sinc,Catrom" -test "images/test.jpg" -backend cudnn -style art -nr_rate 1 -loss aux_lbp -crop_size 88 -validation_crops 64 -patches 16 -batch_size 12 -epoch 20 -max_size 512 -update_criterion loss -oracle_rate 0.1
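
Here ${last_learning_rate} is just a placeholder for the value I copy by hand from the training log. A rough sketch of that bookkeeping follows, under the purely illustrative assumption of a fixed per-epoch decay factor (train.lua's real schedule may differ):

```lua
-- Illustrative sketch only: estimates the value to pass to -learning_rate
-- when resuming, ASSUMING a fixed per-epoch decay factor. The decay below
-- is hypothetical, not read from the repository; in practice I copy the
-- rate printed for the last completed epoch.
local initial_lr = 0.00025   -- the starting rate printed in the log above
local assumed_decay = 0.9    -- hypothetical per-epoch factor
local function lr_after(epochs)
   return initial_lr * assumed_decay ^ epochs
end
print(string.format("%.10e", lr_after(10))) -- candidate ${last_learning_rate}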

So far I have supplied about 3000 high-definition PNG anime illustrations. Occasionally I add some images, regenerate the converted data, and continue the previous training, but I have found that more and more detail is lost during noise reduction. Am I doing something wrong?