GYengera closed this issue 7 years ago
My issue is solved now. I had previously included the parameter param { lr_mult: 10 decay_mult: 1 } in my convolutional LSTM cells. Once I removed it, backpropagation started to work. I would still like to be able to scale the learning rate of the parameters in my convolutional LSTM cell, though.
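For reference, here is a minimal sketch of what lr_mult and decay_mult do to a single parameter blob in stock Caffe's SGD solver: the solver-wide learning rate and weight decay are each multiplied by the blob's own multiplier. The base learning rate of 0.0001 is taken from the log in this thread; the weight_decay value and the helper name effective_hyperparams are assumptions made up for illustration, and this does not reproduce or explain the crash itself.

```python
def effective_hyperparams(base_lr, weight_decay, lr_mult=1.0, decay_mult=1.0):
    """Return (local_lr, local_decay) for one parameter blob.

    Mirrors the per-blob scaling in Caffe's SGD solver:
    local_lr = base_lr * lr_mult, local_decay = weight_decay * decay_mult.
    """
    return base_lr * lr_mult, weight_decay * decay_mult


base_lr = 0.0001       # "lr = 0.0001" from the solver log in this thread
weight_decay = 0.0005  # assumed value, not given in the thread

# param { lr_mult: 10 decay_mult: 1 } as used in the convolutional LSTM cells
local_lr, local_decay = effective_hyperparams(
    base_lr, weight_decay, lr_mult=10, decay_mult=1
)
print(local_lr, local_decay)
```

With these numbers the blob would train at ten times the solver's rate while keeping the solver's weight decay unchanged.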
I have also observed that my GPU utilization averages only around 20-30% and that the network trains a bit slower than it does without the convolutional LSTM. Have others observed the convolutional LSTM cell slowing down network training?
Hello,
I was able to successfully build the latest version of Caffe with the convLSTM layer. I copied the src and include files, modified the caffe.proto file and made the required modifications in the Makefile. I am using cuda-8.0.
When I run my model (sort of an AlexNet version of the model described in this paper), the test phase runs okay and so does the forward pass of the first train iteration. During the backward pass I get the following error:
I0915 18:07:56.650460 1749 solver.cpp:397] Test net output #0: accuracy = 0.124847
I0915 18:07:56.650481 1749 solver.cpp:397] Test net output #1: loss = 1.9918 ( 1 = 1.9918 loss)
I0915 18:08:19.280226 1749 solver.cpp:218] Iteration 0 (0 iter/s, 88.3729s/10 iters), loss = 0.555055
I0915 18:08:19.280261 1749 solver.cpp:237] Train net output #0: loss = 0 ( 1 = 0 loss)
I0915 18:08:19.280269 1749 sgd_solver.cpp:105] Iteration 0, lr = 0.0001
F0915 18:08:19.288806 1749 blob.cpp:195] Syncedmem not initialized.
Check failure stack trace:
@ 0x7fafacbbc5cd google::LogMessage::Fail()
@ 0x7fafacbbe433 google::LogMessage::SendToLog()
@ 0x7fafacbbc15b google::LogMessage::Flush()
@ 0x7fafacbbee1e google::LogMessageFatal::~LogMessageFatal()
@ 0x7fafad38212b caffe::Blob<>::Update()
@ 0x7fafad1e7d95 caffe::Net<>::Update()
@ 0x7fafad33feee caffe::SGDSolver<>::ApplyUpdate()
@ 0x7fafad395ed6 caffe::Solver<>::Step()
@ 0x7fafad39694a caffe::Solver<>::Solve()
@ 0x40abf9 train()
@ 0x40743e main
@ 0x7fafac1cc830 __libc_start_main
@ 0x407c99 _start
@ (nil) (unknown)
Any idea on what exactly is causing the problem?