hpi-xnor / BMXNet

(New version is out: https://github.com/hpi-xnor/BMXNet-v2) BMXNet: An Open-Source Binary Neural Network Implementation Based on MXNet
Apache License 2.0

Running SSD with binary models #34

Closed (VintonyPadmadiredja closed this issue 5 years ago)

VintonyPadmadiredja commented 6 years ago

Hi,

I'm wondering if it's possible to run SSD within the examples using binary models and if so, how would one do it?

Thank you.

yanghaojin commented 6 years ago

We will try to provide an example; please stay tuned.

jacky4323 commented 6 years ago

Hi @yanghaojin,

there are some problems: the examples in your BMXNet/example folder don't run correctly when I use the GPU, but the official incubator-mxnet package runs correctly on the GPU.

Could you please update some of BMXNet's operators to the newer official versions from incubator-mxnet?

Any help would be appreciated! Thanks a lot!

yanghaojin commented 6 years ago

Hi jacky4323, the underlying MXNet version is v1.0.0, which is the latest release. Could you please offer more detailed information on your problem?
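
For reference, the kind of detail that helps narrow down a build problem like this can be collected with a short script. The following is a minimal sketch and not part of BMXNet: the helper name `collect_env_report` is hypothetical, and it uses only the Python standard library so that it still works when `import mxnet` itself fails. It reports where Python would load the `mxnet` package from, which helps confirm which installation is actually being picked up.

```python
import importlib.util
import platform
import shutil
import sys


def collect_env_report(package="mxnet"):
    """Gather basic environment details without importing the package."""
    # find_spec only locates a top-level package; it does not execute it.
    spec = importlib.util.find_spec(package)
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
        "package": package,
        "package_found": spec is not None,
        # For a regular package this is the path of its __init__.py.
        "package_origin": spec.origin if spec is not None else None,
        # Paths of the CUDA tools on PATH, or None if they are not found.
        "nvidia_smi": shutil.which("nvidia-smi"),
        "nvcc": shutil.which("nvcc"),
    }


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Pasting the printed report together with the GPU model and the exact build options would likely make the problem easier to reproduce.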

jacky4323 commented 6 years ago

Hi @yanghaojin,

Thanks for your response! With the official MXNet installed via pip, I can run the code below. However, BMXNet built from source with CMake cannot run it.

export MXNET_CUDNN_AUTOTUNE_DEFAULT=0
export PYTHONUNBUFFERED=1
python train_end2end.py --gpu

src/operator/contrib/proposal.cu:495: Check failed: error == cudaSuccess (7 vs. 0) too many resources requested for launch

Stack trace returned 10 entries:
[bt] (0) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f7b83783e9c]
[bt] (1) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet2op13ProposalGPUOpIN7mshadow3gpuEE7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS9_EERKS8_INS_9OpReqTypeESaISE_EESDSD+0x12b9) [0x7f7b865822c9]
[bt] (2) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet2op13OperatorState7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS6_EERKS5_INS_9OpReqTypeESaISBEESA+0x36d) [0x7f7b839ef4ed]
[bt] (3) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet4exec23StatefulComputeExecutor3RunENS_10RunContextEb+0x69) [0x7f7b838f6e69]
[bt] (4) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(+0x992210) [0x7f7b838bb210]
[bt] (5) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x93) [0x7f7b837b2a83]
[bt] (6) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0x10b) [0x7f7b837bb89b]
[bt] (7) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_AnydataOS5+0x63) [0x7f7b837bbac3]
[bt] (8) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x4a) [0x7f7b837b522a]
[bt] (9) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7f7bac782c80]

[13:04:31] /home/jacky4323/BMXNet_v3/mxnet/dmlc-core/include/dmlc/logging.h:308: [13:04:31] /home/jacky4323/BMXNet_v3/mxnet/src/engine/./threaded_engine.h:359: [13:04:31] /home/jacky4323/BMXNet_v3/mxnet/src/operator/contrib/proposal.cu:495: Check failed: error == cudaSuccess (7 vs. 0) too many resources requested for launch

Stack trace returned 10 entries:
[bt] (0) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f7b83783e9c]
[bt] (1) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet2op13ProposalGPUOpIN7mshadow3gpuEE7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS9_EERKS8_INS_9OpReqTypeESaISE_EESDSD+0x12b9) [0x7f7b865822c9]
[bt] (2) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet2op13OperatorState7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS6_EERKS5_INS_9OpReqTypeESaISBEESA+0x36d) [0x7f7b839ef4ed]
[bt] (3) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet4exec23StatefulComputeExecutor3RunENS_10RunContextEb+0x69) [0x7f7b838f6e69]
[bt] (4) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(+0x992210) [0x7f7b838bb210]
[bt] (5) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x93) [0x7f7b837b2a83]
[bt] (6) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0x10b) [0x7f7b837bb89b]
[bt] (7) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_AnydataOS5+0x63) [0x7f7b837bbac3]
[bt] (8) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x4a) [0x7f7b837b522a]
[bt] (9) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7f7bac782c80]

A fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.

Stack trace returned 8 entries:
[bt] (0) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f7b83783e9c]
[bt] (1) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x36b) [0x7f7b837b2d5b]
[bt] (2) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0x10b) [0x7f7b837bb89b]
[bt] (3) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_AnydataOS5+0x63) [0x7f7b837bbac3]
[bt] (4) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x4a) [0x7f7b837b522a]
[bt] (5) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7f7bac782c80]
[bt] (6) /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f7bb30886ba]
[bt] (7) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f7bb2dbe3dd]

terminate called after throwing an instance of 'dmlc::Error'
what(): [13:04:31] /home/jacky4323/BMXNet_v3/mxnet/src/engine/./threaded_engine.h:359: [13:04:31] /home/jacky4323/BMXNet_v3/mxnet/src/operator/contrib/proposal.cu:495: Check failed: error == cudaSuccess (7 vs. 0) too many resources requested for launch

Stack trace returned 10 entries:
[bt] (0) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f7b83783e9c]
[bt] (1) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet2op13ProposalGPUOpIN7mshadow3gpuEE7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS9_EERKS8_INS_9OpReqTypeESaISE_EESDSD+0x12b9) [0x7f7b865822c9]
[bt] (2) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet2op13OperatorState7ForwardERKNS_9OpContextERKSt6vectorINS_5TBlobESaIS6_EERKS5_INS_9OpReqTypeESaISBEESA+0x36d) [0x7f7b839ef4ed]
[bt] (3) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet4exec23StatefulComputeExecutor3RunENS_10RunContextEb+0x69) [0x7f7b838f6e69]
[bt] (4) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(+0x992210) [0x7f7b838bb210]
[bt] (5) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x93) [0x7f7b837b2a83]
[bt] (6) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0x10b) [0x7f7b837bb89b]
[bt] (7) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_AnydataOS5+0x63) [0x7f7b837bbac3]
[bt] (8) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x4a) [0x7f7b837b522a]
[bt] (9) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7f7bac782c80]

A fatal error occurred in asynchronous engine operation. If you do not know what caused this error, you can try set environment variable MXNET_ENGINE_TYPE to NaiveEngine and run with debugger (i.e. gdb). This will force all operations to be synchronous and backtrace will give you the series of calls that lead to this error. Remember to set MXNET_ENGINE_TYPE back to empty after debugging.

Stack trace returned 8 entries:
[bt] (0) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f7b83783e9c]
[bt] (1) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine14ThreadedEngine15ExecuteOprBlockENS_10RunContextEPNS0_8OprBlockE+0x36b) [0x7f7b837b2d5b]
[bt] (2) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZN5mxnet6engine23ThreadedEnginePerDevice9GPUWorkerILN4dmlc19ConcurrentQueueTypeE0EEEvNS_7ContextEbPNS1_17ThreadWorkerBlockIXT_EEESt10shared_ptrINS0_10ThreadPool11SimpleEventEE+0x10b) [0x7f7b837bb89b]
[bt] (3) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt17_Function_handlerIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEZZNS2_23ThreadedEnginePerDevice13PushToExecuteEPNS2_8OprBlockEbENKUlvE1_clEvEUlS5_E_E9_M_invokeERKSt9_AnydataOS5+0x63) [0x7f7b837bbac3]
[bt] (4) /home/jacky4323/BMXNet_v3/mxnet/python/mxnet/../../build/Release/libmxnet.so(_ZNSt6thread5_ImplISt12_Bind_simpleIFSt8functionIFvSt10shared_ptrIN5mxnet6engine10ThreadPool11SimpleEventEEEES8_EEE6_M_runEv+0x4a) [0x7f7b837b522a]
[bt] (5) /usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80) [0x7f7bac782c80]
[bt] (6) /lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba) [0x7f7bb30886ba]
[bt] (7) /lib/x86_64-linux-gnu/libc.so.6(clone+0x6d) [0x7f7bb2dbe3dd]

Aborted (core dumped)
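
The log above suggests setting `MXNET_ENGINE_TYPE` to `NaiveEngine` to get a synchronous backtrace. A minimal sketch of doing that for a single run, without touching the shell environment, might look like the following. The helper names `naive_engine_env` and `run_synchronously` are hypothetical; the environment variables are the ones that appear in this thread, and the training command has to be supplied by the caller with the same arguments as the failing run.

```python
import os
import subprocess
import sys


def naive_engine_env(base_env=None):
    """Return a copy of the environment with MXNet's synchronous engine selected."""
    env = dict(os.environ if base_env is None else base_env)
    env["MXNET_ENGINE_TYPE"] = "NaiveEngine"   # as suggested by the error message
    env["MXNET_CUDNN_AUTOTUNE_DEFAULT"] = "0"  # same settings as the failing run
    env["PYTHONUNBUFFERED"] = "1"
    return env


def run_synchronously(command):
    """Run `command` (a list of arguments) with the NaiveEngine environment.

    The variables are set only for the child process, so nothing has to be
    reset in the calling shell afterwards.
    """
    return subprocess.run(command, env=naive_engine_env()).returncode
```

Calling `run_synchronously` with the training command as a list of arguments should then produce a backtrace that follows the actual call order, which is usually easier to read than the asynchronous one above. To step through it in gdb, as the message recommends, the same variables would instead need to be exported in the shell that launches the debugger.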

yanghaojin commented 5 years ago

please check our new version BMXNet v2: https://github.com/hpi-xnor/BMXNet-v2