chenhang98 / BPR

code for `Look Closer to Segment Better: Boundary Patch Refinement for Instance Segmentation`
Apache License 2.0

RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 1289945088 bytes. Error code 12 (Cannot allocate memory) #22

Closed apanand14 closed 2 years ago

apanand14 commented 2 years ago

Hello,

I'm trying to prepare the dataset for running BPR, but I'm facing this error. Can you please help me with this? Thank you in advance.

Traceback (most recent call last):
  File "/home/Anaconda3/envs/lib/python3.7/multiprocessing/pool.py", line 121, in worker
    result = (True, func(*args, **kwds))
  File "./tools/split_patches.py", line 199, in run_inst
    dets = get_dets(maskdt, args.patch_size, args.iou_thresh)
  File "./tools/split_patches.py", line 102, in get_dets
    fbmask = find_float_boundary(maskdt)
  File "./tools/split_patches.py", line 55, in find_float_boundary
    stride=1, padding=width//2).permute(1, 0, 2, 3)
RuntimeError: [enforce fail at CPUAllocator.cpp:64] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 1289945088 bytes. Error code 12 (Cannot allocate memory)

chenhang98 commented 2 years ago

It seems that the CPU memory (RAM) is insufficient. We tested on a machine with 188G of memory.
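For a rough sense of scale, the failed request in the traceback can be converted to GiB and multiplied by the number of pool workers. This is only a back-of-the-envelope sketch: it assumes each worker may hold one allocation of that size at the same time, and the worker counts and helper names below are hypothetical, not taken from `split_patches.py`.

```python
# Size of the single allocation that failed, copied from the error message.
FAILED_ALLOC_BYTES = 1289945088


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (1 GiB = 2**30 bytes)."""
    return num_bytes / 2**30


def peak_bytes(per_worker_bytes: int, num_workers: int) -> int:
    """Upper-bound estimate, assuming every worker holds one such buffer at once."""
    return per_worker_bytes * num_workers


if __name__ == "__main__":
    print(f"single allocation: {to_gib(FAILED_ALLOC_BYTES):.2f} GiB")
    # Hypothetical worker counts, for illustration only.
    for workers in (1, 8, 32):
        total = peak_bytes(FAILED_ALLOC_BYTES, workers)
        print(f"{workers:>2} workers: about {to_gib(total):.1f} GiB")
```

If the estimate exceeds the available RAM, running with fewer worker processes is one thing to try, assuming the script exposes such a setting.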

apanand14 commented 2 years ago

Thank you for your answer! I managed to solve this by creating a new environment and reinstalling everything. It may have been a multiprocessing problem, or the old environment may have been corrupted, which caused this error. But thank you!

apanand14 commented 2 years ago

@tinyalpha I ran into another issue in the last stage of inference (merge patches). I successfully created the .pkl file, but when I run merge patches I get the error below. If possible, please look into it. Thank you in advance.

Traceback (most recent call last):
  File "./tools/merge_patches.py", line 104, in <module>
    start()
  File "./tools/merge_patches.py", line 74, in start
    for r in p.imap_unordered(run_inst, enumerate(dt)):
  File "/home/Anaconda3/envs/closemmlab/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
KeyError: 1300000
(closemmlab) []$ python tools/merge_patches.py refinemask_r50.val.segm.json /Data/annotations/instances_val2017.json refinemask_r50/refined.pkl refinemask_r50/patches/detail_dir/val refinemask_r50/refined.json
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
0%| | 0/679 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/Anaconda3/envs/closemmlab/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "tools/merge_patches.py", line 36, in run_inst
    patch_mask = results[pid]
KeyError: 200000
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "tools/merge_patches.py", line 104, in <module>
    start()
  File "tools/merge_patches.py", line 74, in start
    for r in p.imap_unordered(run_inst, enumerate(dt)):
  File "/home/Anaconda3/envs/closemmlab/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
KeyError: 200000
(closemmlab) []$ python tools/merge_patches.py refinemask_r50.val.segm.json /Data/annotations/instances_val2017.json refinemask_r50/refined.pkl refinemask_r50/patches/detail_dir/val refinemask_r50/refined.json
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
0%| | 0/679 [00:00<?, ?it/s]
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/Anaconda3/envs/closemmlab/lib/python3.8/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "tools/merge_patches.py", line 36, in run_inst
    patch_mask = results[pid]
KeyError: 700000
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "tools/merge_patches.py", line 104, in <module>
    start()
  File "tools/merge_patches.py", line 74, in start
    for r in p.imap_unordered(run_inst, enumerate(dt)):
  File "/home/Anaconda3/envs/closemmlab/lib/python3.8/multiprocessing/pool.py", line 868, in next
    raise value
KeyError: 700000

chenhang98 commented 2 years ago

This may be because the .pkl file and refinemask_r50/patches/detail_dir/val do not match. The results of these patches (indicated by these pids) are missing in the .pkl file.
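One way to confirm such a mismatch before merging is to compare the patch ids the merge step will look up against the keys stored in the .pkl file. The sketch below is a minimal illustration, not code from the repository: it assumes the pickle holds a dict-like object keyed by patch id (inferred from `results[pid]` raising `KeyError`), and the helper names and toy data are hypothetical. How the expected ids are derived from the patch directory is not shown in this thread, so the caller supplies them.

```python
import pickle
from typing import Iterable, List, Mapping


def load_results(pkl_path: str) -> Mapping[int, object]:
    """Load the refined-patch results; assumed to be a dict keyed by patch id."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)


def find_missing_pids(results: Mapping[int, object],
                      expected_pids: Iterable[int]) -> List[int]:
    """Return the expected patch ids that have no entry in `results`, sorted."""
    return sorted(pid for pid in set(expected_pids) if pid not in results)


# Toy data standing in for a real .pkl file and patch directory.
toy_results = {0: "mask_a", 100000: "mask_b"}
toy_expected = [0, 100000, 200000, 700000]

missing = find_missing_pids(toy_results, toy_expected)
print(f"{len(missing)} patch ids missing from results: {missing}")
```

A non-empty list suggests the .pkl file was produced from a different set of patches than the directory passed to the merge step, in which case regenerating one of the two so that they match is the likely fix.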

apanand14 commented 2 years ago

Thank you for the answer. I created the patches again and it worked. Thanks once again.