xialeiliu / RankIQA

The repo for the RankIQA paper in ICCV 2017
https://xialeiliu.github.io/RankIQA/
MIT License
453 stars, 117 forks

Question about adding BN layers to the VGG network during Rank training. #34

Open Zzzbang opened 5 years ago

Zzzbang commented 5 years ago

I added BN layers to the train.prototxt used for rank training, but training fails with the following error:

I0515 03:29:45.312124 2845 upgrade_proto.cpp:61] Successfully upgraded file specified using deprecated V1LayerParameter
I0515 03:29:45.328873 2845 upgrade_proto.cpp:67] Attempting to upgrade input file specified using deprecated input fields: ./models/rank_live/VGG_ILSVRC_16_layers.caffemodel
I0515 03:29:45.328899 2845 upgrade_proto.cpp:70] Successfully upgraded file specified using deprecated input fields.
W0515 03:29:45.328905 2845 upgrade_proto.cpp:72] Note that future Caffe releases will only support input layers and not input fields.
I0515 03:29:45.457494 2845 net.cpp:744] Ignoring source layer fc8
I0515 03:29:45.457530 2845 net.cpp:744] Ignoring source layer prob
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:537] Reading dangerously large protocol message. If the message turns out to be larger than 2147483647 bytes, parsing will be halted for security reasons. To increase the limit (or to disable these warnings), see CodedInputStream::SetTotalBytesLimit() in google/protobuf/io/coded_stream.h.
[libprotobuf WARNING google/protobuf/io/coded_stream.cc:78] The total number of bytes read was 553432081
I0515 03:29:46.020968 2845 upgrade_proto.cpp:53] Attempting to upgrade input file specified using deprecated V1LayerParameter: ./models/rank_live/VGG_ILSVRC_16_layers.caffemodel
I0515 03:29:46.906415 2845 upgrade_proto.cpp:61] Successfully upgraded file specified using deprecated V1LayerParameter
I0515 03:29:46.923141 2845 upgrade_proto.cpp:67] Attempting to upgrade input file specified using deprecated input fields: ./models/rank_live/VGG_ILSVRC_16_layers.caffemodel
I0515 03:29:46.923157 2845 upgrade_proto.cpp:70] Successfully upgraded file specified using deprecated input fields.
W0515 03:29:46.923163 2845 upgrade_proto.cpp:72] Note that future Caffe releases will only support input layers and not input fields.
I0515 03:29:47.035661 2845 net.cpp:744] Ignoring source layer fc8
I0515 03:29:47.035687 2845 net.cpp:744] Ignoring source layer prob
I0515 03:29:47.076709 2845 caffe.cpp:248] Starting Optimization
I0515 03:29:47.076766 2845 solver.cpp:272] Solving RankIQA_siamese_train_test
I0515 03:29:47.076776 2845 solver.cpp:273] Learning Rate Policy: step
I0515 03:29:47.105406 2845 solver.cpp:330] Iteration 0, Testing net (#0)
terminate called after throwing an instance of 'boost::python::error_already_set'
*** SETTING UP* Aborted at 1557890987 (unix time) try "date -d @1557890987" if you are using GNU date
PC: @ 0x7fbdedee9428 gsignal
SIGABRT (@0xb1d) received by PID 2845 (TID 0x7fbdf04a5ac0) from PID 2845; stack trace: ***
@ 0x7fbdedee94b0 (unknown)
@ 0x7fbdedee9428 gsignal
@ 0x7fbdedeeb02a abort
@ 0x7fbdee52384d __gnu_cxx::__verbose_terminate_handler()
@ 0x7fbdee5216b6 (unknown)
@ 0x7fbdee521701 std::terminate()
@ 0x7fbdee521919 __cxa_throw
@ 0x7fbdd8a5b4c2 boost::python::throw_error_already_set()
@ 0x7fbd24b6eeb1 caffe::PythonLayer<>::Forward_cpu()
@ 0x7fbdef77b727 caffe::Net<>::ForwardFromTo()
@ 0x7fbdef77bab7 caffe::Net<>::Forward()
@ 0x7fbdef76ebaa caffe::Solver<>::Test()
@ 0x7fbdef76f82e caffe::Solver<>::TestAll()
@ 0x7fbdef7725d2 caffe::Solver<>::Step()
@ 0x7fbdef7733ea caffe::Solver<>::Solve()
@ 0x40ee7a train()
@ 0x40b8a3 main
@ 0x7fbdeded4830 __libc_start_main
@ 0x40c249 _start
@ 0x0 (unknown)
Aborted (core dumped)

I have checked the dataset and found no problems; the error above only appears when the BN layers are added. Is it because BN changes the distribution, so loss.py needs to be rewritten? Thanks for your reply!
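The stack trace in this comment shows `caffe::PythonLayer<>::Forward_cpu()` throwing `boost::python::error_already_set`, which indicates that a Python exception was raised inside the Python layer's `forward`, yet the log does not show the Python-side message. Below is a minimal sketch for surfacing it, assuming the loss layer is a Python class with a `forward(self, bottom, top)` method. `log_python_errors` and `DummyLossLayer` are hypothetical names used only for illustration; they are not part of RankIQA or Caffe.

```python
import functools
import sys
import traceback


def log_python_errors(method):
    """Print the full Python traceback, then re-raise the exception.

    Hypothetical helper: meant to wrap the forward/backward methods of a
    Python layer so the real exception appears in the log before the
    calling side aborts.
    """
    @functools.wraps(method)
    def wrapper(*args, **kwargs):
        try:
            return method(*args, **kwargs)
        except Exception:
            traceback.print_exc(file=sys.stderr)
            sys.stderr.flush()
            raise
    return wrapper


class DummyLossLayer(object):
    """Stand-in for a Python loss layer (not the RankIQA layer)."""

    @log_python_errors
    def forward(self, bottom, top):
        # Illustrative failure mode: an unexpected number of inputs.
        if len(bottom) != 2:
            raise ValueError("expected 2 bottom blobs, got %d" % len(bottom))
        top[0] = bottom[0] - bottom[1]
```

With the traceback visible, it should be easier to tell whether the failure comes from the loss computation itself or from the shape or content of the blobs reaching it.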

xialeiliu commented 5 years ago

The loss is independent of the network.
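To illustrate this point: a ranking loss of this kind looks only at the scores the network outputs, not at the layers that produced them. The following is a minimal sketch of a generic pairwise hinge ranking loss, assuming the scores are ordered so that earlier entries should rank higher than later ones. The function name and the ordering convention are assumptions for illustration, and this is not the code in loss.py; the default margin of 10 mirrors the `self.margin = 10` quoted later in this thread.

```python
def pairwise_hinge_ranking_loss(scores, margin=10.0):
    """Average hinge loss over all pairs (i, j) with i < j.

    Assumes scores[i] should exceed scores[j] by at least `margin`
    whenever i < j. Only the score values are used; nothing about the
    network architecture enters the computation.
    """
    total = 0.0
    num_pairs = 0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            # Penalize pairs whose score gap is smaller than the margin.
            total += max(0.0, margin - (scores[i] - scores[j]))
            num_pairs += 1
    return total / num_pairs if num_pairs else 0.0
```

Adding BN layers changes how the scores are computed, but a function like this sees only the resulting values, so it would need rewriting only if the shape or ordering of its inputs changed.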

Zzzbang commented 5 years ago

> The loss is independent of the network.

Then could the error here be caused by a problem with the input? When I remove the BN layers, though, the network trains normally.

Zzzbang commented 5 years ago

> The loss is independent of the network.

I0515 03:59:13.925062 3424 layer_factory.hpp:77] Creating layer data
I0515 03:59:13.925108 3424 net.cpp:84] Creating Layer data
I0515 03:59:13.925127 3424 net.cpp:380] data -> data
I0515 03:59:13.925137 3424 net.cpp:380] data -> label
SETTING UP
I0515 03:59:14.120211 3424 net.cpp:122] Setting up data

While checking the test part, I noticed the output above, and I found the related definition in the loss function:

    def setup(self, bottom, top):
        self.margin = 10
        print ' SETTING'
        pass

Could you explain why adding BN layers would affect the input at test time?