facebookresearch / Detectron

FAIR's research platform for object detection research, implementing popular algorithms like Mask R-CNN and RetinaNet.
Apache License 2.0
26.24k stars 5.45k forks

*** Aborted at 1525746556 (unix time) try "date -d @1525746556" if you are using GNU date *** #578

Open pengyongrong opened 6 years ago

pengyongrong commented 6 years ago

json_stats: {"accuracy_cls": 1.000000, "eta": "1:43:54", "iter": 19980, "loss": 0.023408, "loss_bbox": 0.002010, "loss_cls": 0.000491, "loss_mask": 0.020870, "loss_rpn_bbox_fpn2": 0.000000, "loss_rpn_bbox_fpn3": 0.000202, "loss_rpn_bbox_fpn4": 0.000000, "loss_rpn_bbox_fpn5": 0.000000, "loss_rpn_bbox_fpn6": 0.000000, "loss_rpn_cls_fpn2": 0.000006, "loss_rpn_cls_fpn3": 0.000009, "loss_rpn_cls_fpn4": 0.000001, "loss_rpn_cls_fpn5": 0.000000, "loss_rpn_cls_fpn6": 0.000000, "lr": 0.000100, "mb_qsize": 64, "mem": 3295, "time": 0.155789}
Error in `python': free(): corrupted unsorted chunks: 0x00007f3e9009cdd0
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7f41e4cac7e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7f41e4cb537a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7f41e4cb953c]
/usr/local/lib/python2.7/dist-packages/numpy/core/multiarray.so(+0x10fd2a)[0x7f41dc3a6d2a]
python(PyEval_EvalFrameEx+0x6345)[0x4ca345]
python(PyEval_EvalFrameEx+0x5d8f)[0x4c9d8f]
python(PyEval_EvalCodeEx+0x255)[0x4c2765]
python[0x4de6fe]
python(PyObject_Call+0x43)[0x4b0cb3]
python[0x4f492e]
python(PyObject_Call+0x43)[0x4b0cb3]
python(PyEval_CallObjectWithKeywords+0x30)[0x4ce5d0]
/mnt/data1/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so(+0x83df0)[0x7f41d9658df0]
/mnt/data1/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so(+0x85571)[0x7f41d965a571]
/mnt/data1/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so(+0x4c6cb)[0x7f41d96216cb]
/mnt/data1/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so(+0x98f48)[0x7f41d966df48]
/mnt/data1/pytorch/build/caffe2/python/caffe2_pybind11_state_gpu.so(+0x95375)[0x7f41d966a375]
/mnt/data1/pytorch/build/lib/libcaffe2.so(_ZN6caffe26DAGNet5RunAtEiRKSt6vectorIiSaIiEE+0x114)[0x7f41d8a318d4]
/mnt/data1/pytorch/build/lib/libcaffe2.so(_ZN6caffe210DAGNetBase14WorkerFunctionEv+0x2f5)[0x7f41d8a306a5]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xb8c80)[0x7f41de77fc80]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7f41e50066ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f41e4d3c41d]
======= Memory map: ========
00400000-006ea000 r-xp 00000000 08:02 2360539 /usr/bin/python2.7
008e9000-008eb000 r--p 002e9000 08:02 2360539 /usr/bin/python2.7
008eb000-00962000 rw-p 002eb000 08:02 2360539 /usr/bin/python2.7
00962000-00985000 rw-p 00000000 00:00 0
0295d000-0ffb4000 rw-p 00000000 00:00 0 [heap]
10000000-10001000 rw-s 00000000 00:06 27734 /dev/nvidia1
10001000-10002000 rw-s 00000000 00:06 27734 /dev/nvidia1

*** Aborted at 1525746556 (unix time) try "date -d @1525746556" if you are using GNU date ***
PC: @ 0x7f41e4c6a428 gsignal

*** SIGABRT (@0x3eb00000d4b) received by PID 3403 (TID 0x7f4071bc7700) from PID 3403; stack trace: ***
@ 0x7f41e5010390 (unknown)
@ 0x7f41e4c6a428 gsignal
@ 0x7f41e4c6c02a abort
@ 0x7f41e4cac7ea (unknown)
@ 0x7f41e4cb537a (unknown)
@ 0x7f41e4cb953c cfree
@ 0x7f41dc3a6d2a array_subscript
@ 0x4ca345 PyEval_EvalFrameEx
@ 0x4c9d8f PyEval_EvalFrameEx
@ 0x4c2765 PyEval_EvalCodeEx
@ 0x4de6fe (unknown)
@ 0x4b0cb3 PyObject_Call
@ 0x4f492e (unknown)
@ 0x4b0cb3 PyObject_Call
@ 0x4ce5d0 PyEval_CallObjectWithKeywords
@ 0x7f41d9658df0 pybind11::detail::object_api<>::operator()<>()
@ 0x7f41d965a571 caffe2::python::PythonOpBase<>::RunOnDevice()
@ 0x7f41d96216cb caffe2::Operator<>::Run()
@ 0x7f41d966df48 _ZN6caffe213GPUFallbackOpINS_6python8PythonOpINS_10CPUContextELb0EEENS_11SkipIndicesIJEEEE11RunOnDeviceEv
@ 0x7f41d966a375 caffe2::Operator<>::Run()
@ 0x7f41d8a318d4 caffe2::DAGNet::RunAt()
@ 0x7f41d8a306a5 caffe2::DAGNetBase::WorkerFunction()
@ 0x7f41de77fc80 (unknown)
@ 0x7f41e50066ba start_thread
@ 0x7f41e4d3c41d clone
@ 0x0 (unknown)

pengyongrong commented 6 years ago

Has anybody else run into the same problem? I'm quite worried about it. Please give me some advice. Thank you!

gadcam commented 6 years ago

Looks like a duplicate of/linked with #536, #380, #431, #359, #420, #578 & #314. (and also https://github.com/caffe2/caffe2/issues/1985 & https://github.com/pytorch/pytorch/issues/6542)

@ir413, @rbgirshick it could be worth it to open a new issue to gather all the information in one place to solve this issue for good.

pengyongrong commented 6 years ago

Thanks for your reply. I have tried the suggestions in #536, #380, #431, #359, #420, #578, #314 and so on, but none of them worked. I also saw an article about Detectron that gives a solution (link: http://www.infosec-wiki.com/?p=434345), but I don't know what should be replaced or modified in subprocess.py.

shengtao96 commented 6 years ago

I think pfollmann's answer in #415 can solve your problem. It works for me.

pengyongrong commented 6 years ago

I tried it, but when I follow pfollmann's answer in #415, another error occurs: ImportError: No module named utils.argsort. What should I do next?

shengtao96 commented 6 years ago

Delete 'utils.' and add this line instead: import argsort as argsort

pengyongrong commented 6 years ago

I tried import argsort as argsort. Now the error is ImportError: No module named argsort. Is there a path that should be modified?

shengtao96 commented 6 years ago

You should place argsort.pyx in detectron/utils.

pengyongrong commented 6 years ago

Yes, I have done that and ran make in the detectron folder.

shengtao96 commented 6 years ago

Please check your setup.py and make sure you have actually run 'make'.
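
One way to confirm that 'make' really produced the compiled modules is to look for the shared libraries next to the .pyx sources. The snippet below is only a sketch: find_built_extension is a hypothetical helper (not part of Detectron), and it assumes the usual CPython naming on Linux, where a build leaves either name.so or name.<abi-tag>.so in the package directory.

```python
from __future__ import print_function

import glob
import os


def find_built_extension(utils_dir, name):
    """Return compiled shared libraries for extension `name` in `utils_dir`.

    Hypothetical helper: matches `name.so` (typical for Python 2 builds)
    and `name.<abi-tag>.so` (typical for Python 3 builds).
    """
    found = []
    for pattern in (name + ".so", name + ".*.so"):
        found.extend(glob.glob(os.path.join(utils_dir, pattern)))
    return sorted(found)


if __name__ == "__main__":
    # Run from the Detectron checkout root; module names are taken from this thread.
    for module in ("cython_nms", "argsort"):
        print(module, find_built_extension("detectron/utils", module) or "NOT BUILT")
```

If a module shows up as NOT BUILT, the corresponding entry is probably missing from setup.py, or the build did not run.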

pengyongrong commented 6 years ago

I found the problem: I had used the wrong name in cython_nms.pyx.

pengyongrong commented 6 years ago

Thank you for your patience, but I have another problem now: ImportError: dynamic module does not define init function

shengtao96 commented 6 years ago

Please give me more details.

pengyongrong commented 6 years ago

Traceback (most recent call last):
  File "tools/train_net.py", line 38, in <module>
    from detectron.core.test_engine import run_inference
  File "/home/ubuntu/detectron/detectron/core/test_engine.py", line 35, in <module>
    from detectron.core.rpn_generator import generate_rpn_on_dataset
  File "/home/ubuntu/detectron/detectron/core/rpn_generator.py", line 42, in <module>
    from detectron.datasets import task_evaluation
  File "/home/ubuntu/detectron/detectron/datasets/task_evaluation.py", line 47, in <module>
    import detectron.datasets.json_dataset_evaluator as json_dataset_evaluator
  File "/home/ubuntu/detectron/detectron/datasets/json_dataset_evaluator.py", line 33, in <module>
    import detectron.utils.boxes as box_utils
  File "/home/ubuntu/detectron/detectron/utils/boxes.py", line 52, in <module>
    import detectron.utils.cython_nms as cython_nms
  File "detectron/utils/cython_nms.pyx", line 26, in init detectron.utils.cython_nms
    import detectron.utils.argsort as argsort
ImportError: dynamic module does not define init function (initargsort)
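
For context on the error above: when Python 2 imports a compiled extension, it looks up a C function named init plus the last component of the module name, while Python 3 looks for PyInit_ plus that name. The error therefore typically means the shared library was compiled under a different module name than the one being imported (for example, a mismatch between the .pyx file name and the name given in setup.py), or was built for a different Python major version. The sketch below only illustrates the naming rule; expected_init_symbol is a hypothetical helper, not a Detectron or CPython function.

```python
import sys


def expected_init_symbol(module_name, py_major=None):
    """Return the init symbol CPython looks for when importing an extension.

    Hypothetical helper for illustration. Only the last dotted component
    of the module name is used; non-ASCII module names are not handled.
    """
    if py_major is None:
        py_major = sys.version_info[0]
    short_name = module_name.rsplit(".", 1)[-1]
    prefix = "init" if py_major == 2 else "PyInit_"
    return prefix + short_name


if __name__ == "__main__":
    print(expected_init_symbol("detectron.utils.argsort", py_major=2))
    print(expected_init_symbol("detectron.utils.argsort", py_major=3))
```

Listing the exported symbols of the built .so and comparing them against this name can show which of the two causes applies.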

shengtao96 commented 6 years ago

I'm sorry, I haven't run into that problem myself. You should follow pfollmann's answer and redo the changes to the Detectron code.

pengyongrong commented 6 years ago

OK, thanks for your patience.

shenghsiaowong commented 6 years ago

Hi, have you solved this problem? I'm hitting the same error: import detectron.utils.argsort as argsort ImportError: dynamic module does not define init function (initargsort)

dqzyg commented 5 years ago

Specify an idle (unused) GPU card, for example:

export CUDA_VISIBLE_DEVICES=0
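
The same idea can be sketched in Python: find the GPU with the least memory in use and pin the process to it. This is a sketch under assumptions, not a tested fix for the crash. least_used_gpu, query_gpus and pin_gpu are hypothetical helper names; the nvidia-smi query flags are the standard ones, and the environment variable has to be set before caffe2 (or anything else that initializes CUDA) is imported.

```python
import os
import subprocess


def least_used_gpu(csv_text):
    """Pick the GPU index with the least memory in use.

    `csv_text` is expected to hold one "index, memory_used_mib" row per GPU.
    """
    rows = []
    for line in csv_text.strip().splitlines():
        index, used_mib = (field.strip() for field in line.split(","))
        rows.append((int(used_mib), int(index)))
    if not rows:
        raise ValueError("no GPUs listed")
    return min(rows)[1]


def query_gpus():
    """Ask nvidia-smi for per-GPU memory usage (requires the NVIDIA driver)."""
    return subprocess.check_output(
        ["nvidia-smi", "--query-gpu=index,memory.used",
         "--format=csv,noheader,nounits"],
        universal_newlines=True,
    )


def pin_gpu(index):
    """Restrict this process (and its children) to a single GPU."""
    os.environ["CUDA_VISIBLE_DEVICES"] = str(index)


if __name__ == "__main__":
    pin_gpu(least_used_gpu(query_gpus()))
    # Import caffe2 / start training only after this point.
```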