YYuanAnyVision / mxnet_center_loss

implement center loss operator for mxnet

Unable to run this project #5

Open Ume07 opened 7 years ago

Ume07 commented 7 years ago

I wanted to try out center loss. I cloned this repository, copied test/python/common/get_data.py from the mxnet project into it, and ran train.py, which failed.

System: Ubuntu 16.04 x64, CUDA 8.0, cuDNN 6.0, mxnet 0.10.1 and 0.11.0, OpenCV 3.3

Any help would be much appreciated. @pangyupo @luoyetx

`[14:48:02] src/io/iter_mnist.cc:94: MNISTIter: load 60000 images, shuffle=1, shape=(100,1,28,28)
[14:48:02] src/io/iter_mnist.cc:94: MNISTIter: load 10000 images, shuffle=1, shape=(100,1,28,28)
training model ... dev is [gpu(0)]
/home/mxnet_center_loss/train_model.py:133: DeprecationWarning: mxnet.model.FeedForward has been deprecated. Please use mxnet.mod.Module instead.
*model_args)
/home/mxnet/python/mxnet/initializer.py:353: DeprecationWarning: Calling initializer with init(str, NDArray) has been deprecated.please use init(mx.init.InitDesc(...), NDArray) instead.
init(name, arr)
[14:48:12] /home/mxnet/dmlc-core/include/dmlc/./logging.h:308: [14:48:12] src/pass/gradient.cc:159: Check failed: (*rit)->inputs.size() == input_grads.size() (5 vs. 2) Gradient function not returning enough gradient

Stack trace returned 10 entries: [bt] (0) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f28b1f644bc] [bt] (1) /home/mxnet/python/mxnet/../../lib/libmxnet.so(+0x27b1f40) [0x7f28b4021f40] [bt] (2) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt17_Function_handlerIFN4nnvm5GraphES1_EPS2_E9_M_invokeERKSt9_AnydataOS1+0x111) [0x7f28b2c033f1] [bt] (3) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm11ApplyPassesENS_5GraphERKSt6vectorINSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x32c) [0x7f28b4053b8c] [bt] (4) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm9ApplyPassENS_5GraphERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x3c9) [0x7f28b2f99389] [bt] (5) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm4pass8GradientENS_5GraphESt6vectorINS_9NodeEntryESaIS3_EES5_S5_St8functionIFS3_OS5_EES6_IFiRKNS_4NodeEEES6_IFS3_RKS3_SG_EES2_IPKNS_2OpESaISL_EENSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x60c) [0x7f28b300780c] [bt] (6) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor13InitFullGraphEN4nnvm6SymbolERKSt6vectorINS_9OpReqTypeESaIS5_EE+0x863) [0x7f28b2ff1283] [bt] (7) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor9InitGraphEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSN_INS_9OpReqTypeESaISS_EE+0x82) [0x7f28b2ff1b52] [bt] (8) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorINS_7NDArrayESaISO_EESS_RKSN_INS_9OpReqTypeESaIST_EESS_PNS_8ExecutorERKSt13unordered_mapINS2_9NodeEntryESO_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS11_SO_EEE+0x76f) [0x7f28b2ffc45f] [bt] (9) 
/home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet8Executor4BindEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorINS_7NDArrayESaISN_EESR_RKSM_INS_9OpReqTypeESaISS_EESRPS0+0x1b8) [0x7f28b2ffd7f8]

Traceback (most recent call last):
  File "./train.py", line 97, in <module>
    main()
  File "./train.py", line 94, in main
    train_model.fit(args, net, (train, val), data_shape)
  File "/home/mxnet_center_loss/train_model.py", line 153, in fit
    epoch_end_callback = checkpoint)
  File "/home/mxnet/python/mxnet/model.py", line 830, in fit
    sym_gen=self.sym_gen)
  File "/home/mxnet/python/mxnet/model.py", line 210, in _train_multi_device
    logger=logger)
  File "/home/mxnet/python/mxnet/executor_manager.py", line 326, in __init__
    self.slices, train_data)
  File "/home/mxnet/python/mxnet/executor_manager.py", line 238, in __init__
    input_types=data_types)
  File "/home/mxnet/python/mxnet/executor_manager.py", line 184, in _bind_exec
    grad_req=grad_req, shared_exec=base_exec)
  File "/home/mxnet/python/mxnet/symbol.py", line 1636, in bind
    ctypes.byref(handle)))
  File "/home/mxnet/python/mxnet/base.py", line 102, in check_call
    raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: [14:48:12] src/pass/gradient.cc:159: Check failed: (*rit)->inputs.size() == input_grads.size() (5 vs. 2) Gradient function not returning enough gradient

Stack trace returned 10 entries: [bt] (0) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4dmlc15LogMessageFatalD1Ev+0x3c) [0x7f28b1f644bc] [bt] (1) /home/mxnet/python/mxnet/../../lib/libmxnet.so(+0x27b1f40) [0x7f28b4021f40] [bt] (2) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZNSt17_Function_handlerIFN4nnvm5GraphES1_EPS2_E9_M_invokeERKSt9_AnydataOS1+0x111) [0x7f28b2c033f1] [bt] (3) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm11ApplyPassesENS_5GraphERKSt6vectorINSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEESaIS7_EE+0x32c) [0x7f28b4053b8c] [bt] (4) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm9ApplyPassENS_5GraphERKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x3c9) [0x7f28b2f99389] [bt] (5) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN4nnvm4pass8GradientENS_5GraphESt6vectorINS_9NodeEntryESaIS3_EES5_S5_St8functionIFS3_OS5_EES6_IFiRKNS_4NodeEEES6_IFS3_RKS3_SG_EES2_IPKNS_2OpESaISL_EENSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEE+0x60c) [0x7f28b300780c] [bt] (6) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor13InitFullGraphEN4nnvm6SymbolERKSt6vectorINS_9OpReqTypeESaIS5_EE+0x863) [0x7f28b2ff1283] [bt] (7) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor9InitGraphEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorIS4_SaIS4_EESR_SR_RKSN_INS_9OpReqTypeESaISS_EE+0x82) [0x7f28b2ff1b52] [bt] (8) /home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet4exec13GraphExecutor4InitEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEES4_St4lessISD_ESaISt4pairIKSD_S4_EEERKSt6vectorINS_7NDArrayESaISO_EESS_RKSN_INS_9OpReqTypeESaIST_EESS_PNS_8ExecutorERKSt13unordered_mapINS2_9NodeEntryESO_NS2_13NodeEntryHashENS2_14NodeEntryEqualESaISG_IKS11_SO_EEE+0x76f) [0x7f28b2ffc45f] [bt] (9) 
/home/mxnet/python/mxnet/../../lib/libmxnet.so(_ZN5mxnet8Executor4BindEN4nnvm6SymbolERKNS_7ContextERKSt3mapINSt7cxx1112basic_stringIcSt11char_traitsIcESaIcEEES3_St4lessISC_ESaISt4pairIKSC_S3_EEERKSt6vectorINS_7NDArrayESaISN_EESR_RKSM_INS_9OpReqTypeESaISS_EESRPS0+0x1b8) [0x7f28b2ffd7f8] `

stevenluzheng123456 commented 7 years ago

Switch to mxnet 0.94.
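To see quickly whether an installed build is one of the versions reported as failing in this thread (0.10.1 and 0.11.0), a small guard like the sketch below may help. The helper names `parse_version`, `is_reported_failing`, and `check_installed` are hypothetical and not part of mxnet or this repository. The set of failing versions comes only from the reports here, not from any official compatibility list. The sketch also assumes the installed package exposes `mxnet.__version__`.

```python
# Versions reported in this thread as hitting
# "Gradient function not returning enough gradient".
# Assumption: this set is taken from user reports only.
REPORTED_FAILING = {(0, 10, 1), (0, 11, 0)}


def parse_version(version):
    """Convert a dotted version string into a tuple of ints.

    Components that are not purely numeric (for example "0rc1") are skipped.
    """
    return tuple(int(part) for part in version.split(".") if part.isdigit())


def is_reported_failing(version):
    """Return True if this version was reported as failing in the thread."""
    return parse_version(version) in REPORTED_FAILING


def check_installed():
    """Print a hint about the installed mxnet, if any (hypothetical helper)."""
    try:
        import mxnet as mx  # only needed for the live check
    except ImportError:
        print("mxnet is not installed")
        return
    if is_reported_failing(mx.__version__):
        print("mxnet %s was reported as failing with this project" % mx.__version__)
    else:
        print("mxnet %s was not reported as failing in this thread" % mx.__version__)
```

Calling `check_installed()` before `train.py` would only flag the versions mentioned above. A version outside that set is not thereby confirmed to work.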

taylover-pei commented 6 years ago

How was this problem solved? I ran into a similar situation: [21:05:56] /home/travis/build/dmlc/mxnet-distro/mxnet-build/dmlc-core/include/dmlc/logging.h:308: [21:05:56] src/pass/gradient.cc:159: Check failed: (*rit)->inputs.size() == input_grads.size() (5 vs. 2) Gradient function not returning enough gradient. I am on version 0.11.0. Do I really have to switch to a lower version?