ctensmeyer / binarization_2017

BSD 3-Clause "New" or "Revised" License

binarization_dibco container fails #3

Open masyagin1998 opened 5 years ago

masyagin1998 commented 5 years ago

Hello! I'm trying to run your binarization_dibco.py tool in an nvidia-docker container and I'm getting the following error:

```
F1021 20:23:03.879026 1 math_functions.cu:81] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
Check failure stack trace:
Loading Image
Computing RD features
Concating inputs
Preprocessing (489, 2625, 4)
Tiling input
Starting predictions for network 1/5
Progress 2% Progress 5% Progress 7% Progress 10% Progress 12% Progress 15% Progress 17% Progress 20% Progress 23% Progress 25% Progress 28% Progress 30% Progress 33% Progress 35% Progress 38% Progress 41% Progress 43% Progress 46% Progress 48% Progress 51% Progress 53% Progress 56% Progress 58% Progress 61% Progress 64% Progress 66% Progress 69% Progress 71% Progress 74% Progress 76% Progress 79% Progress 82% Progress 84% Progress 87% Progress 89% Progress 92% Progress 94% Progress 97% Progress 100%
Reconstructing whole image from binarized tiles
Starting predictions for network 2/5
```

binarization_plm.py doesn't fail and works great :)

ctensmeyer commented 5 years ago

It's possible I have a bug somewhere in my Caffe code that causes the illegal access on some GPUs. What GPU are you using?
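In case it helps narrow this down, a hedged debugging sketch (not part of this repo): CUDA reports illegal memory accesses asynchronously, so the call that logs the failure is often not the kernel that caused it. Forcing synchronous launches before pycaffe initializes CUDA should make the trace point closer to the offending layer:

```python
import os

# CUDA_LAUNCH_BLOCKING=1 makes every kernel launch synchronous, so the
# illegal-access error surfaces at the launch that actually faults.
# It must be set before CUDA is initialized.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import caffe  # pycaffe initializes CUDA lazily, so the variable takes effect

caffe.set_mode_gpu()
caffe.set_device(0)
# ... then run the same prediction code; the "Check failed" message should
# now point at (or near) the layer/kernel that triggers the illegal access.
```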


masyagin1998 commented 5 years ago

My GPU is an NVIDIA GeForce GTX 1050 Ti with 4 GB of memory.

masyagin1998 commented 5 years ago

I tried running without the GPU, but it ran out of memory:

```
I1023 03:51:57.924324 1 net.cpp:259] Network initialization done.
I1023 03:51:57.924329 1 net.cpp:260] Memory required for data: 503054336
F1023 03:52:49.289590 1 syncedmem.hpp:33] Check failed: *ptr host allocation of size 5435817984 failed
Check failure stack trace:
Loading Image
Computing RD features
Concating inputs
Preprocessing (722, 1087, 4)
Tiling input
Starting predictions for network 1/5
```

I have 8 GiB of RAM and an Intel Core i7-7700HQ. My system is Linux Mint 19. When I started the container, I had 6.5 GiB of free RAM.
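For context, a quick back-of-the-envelope check on that failed allocation (assuming the numbers are exactly as logged and that Caffe needs this buffer in one contiguous block on top of its other blobs):

```python
# Size of the single host allocation that failed, taken from the log above.
failed_alloc_bytes = 5435817984
print(failed_alloc_bytes / 2**30)   # ~5.06 GiB in one contiguous buffer

# The log also reports "Memory required for data: 503054336" (~0.47 GiB),
# so one ~5 GiB buffer plus the network's other blobs leaves little headroom
# out of the ~6.5 GiB that was free when the container started.
```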

AndyCheang commented 5 years ago

Hello!

I am using Docker to run on the DIBCO 2017 dataset, but it fails with a similar error:

```
I0410 07:38:22.774055 1 net.cpp:259] Network initialization done.
I0410 07:38:22.774060 1 net.cpp:260] Memory required for data: 503054336
F0410 07:38:22.959889 1 math_functions.cu:81] Check failed: error == cudaSuccess (77 vs. 0) an illegal memory access was encountered
Check failure stack trace:
Loading Image
Computing RD features
Concating inputs
Preprocessing (1350, 2397, 4)
Tiling input
Starting predictions for network 1/5
```

vkolagotla commented 5 years ago

Hello, I have the same problem as mentioned above. I changed the batch size to 8 (see https://github.com/shihenw/convolutional-pose-machines-release/issues/6#issuecomment-218475367), and it now works for most images (although the text comes out washed out), but it still fails for some images.
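For anyone else trying this, a minimal sketch of one way to lower the batch dimension through pycaffe; the file names and the input blob name 'data' are assumptions and may not match this repository's scripts, which might instead hard-code the shape in the prototxt:

```python
import caffe

# Hypothetical file names for illustration; substitute the actual deploy
# prototxt and caffemodel used by binarization_dibco.py.
net = caffe.Net('deploy.prototxt', 'weights.caffemodel', caffe.TEST)

# Keep channels/height/width but shrink the batch dimension, then propagate
# the new shapes through the network before running forward passes.
n, c, h, w = net.blobs['data'].data.shape
net.blobs['data'].reshape(8, c, h, w)
net.reshape()
```

Reducing the batch size mainly trades speed for memory; it shouldn't by itself change the predictions, so the washed-out text probably has another cause.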

Has anyone found a solution?