**Closed** · adhara123007 closed this issue 5 years ago
Hi,

I am trying to train the network on multiple GPUs, but I get the following error:

```
F0321 13:50:58.896466   271 parallel.cpp:55] Check failed: total_size == (ptr == buffer ? 1 : ptr - buffer) (118335438 vs. 117426126)
Check failure stack trace:
    @     0x7fb6d90315cd  google::LogMessage::Fail()
    @     0x7fb6d9033433  google::LogMessage::SendToLog()
    @     0x7fb6d903115b  google::LogMessage::Flush()
    @     0x7fb6d9033e1e  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fb6d96cf8fd  caffe::GPUParams<>::configure()
    @     0x7fb6d96cfd4b  caffe::P2PSync<>::P2PSync()
    @     0x7fb6d96d1672  caffe::P2PSync<>::Prepare()
    @     0x7fb6d96d1cde  caffe::P2PSync<>::Run()
    @           0x40a80f  train()
    @           0x4075b8  main
    @     0x7fb6d7ae5830  __libc_start_main
    @           0x407d29  _start
    @              (nil)  (unknown)
```

If I use only one GPU (either of the two GPUs in the system), everything runs fine.
Sorry, we have never used Caffe on multiple GPUs, so I have no experience with this error.

Thanks