zllrunning / video-object-removal

Just draw a bounding box and you can remove the object you want to remove.
MIT License
2.66k stars, 474 forks

Out of memory? #19

Open ulatekh opened 4 years ago

ulatekh commented 4 years ago

I've been learning CUDA and pytorch just so that I could run this project. (Doing so has been something of a trial by fire.)

I built my own PyTorch from the repo's v0.4.0 tag and have it running (partially) on two machines, both running Fedora Core 30. The first has a Quadro P2000 with 4 GB of main memory and 5 GB of video memory, using SM 6.0, CUDA 9.1, and gcc 5.1.1. The second has an RTX 2060 with 32 GB of main memory and 6 GB of video memory, using SM 6.0/7.0, CUDA 9.2 (10.1 had terrible build problems with PyTorch 0.4.0), and gcc 6.2.1.

Both machines can run the data/bag.avi test, but when I try to run the data/Human6 test, once it gets to the inpainting part, the RTX 2060 machine gets this:

THCudaCheck FAIL file=$(PYTORCH)/aten/src/THC/generic/THCStorage.cu line=58 error=2 : out of memory
Traceback (most recent call last):
  File "demo.py", line 15, in <module>
    inpaint(args)
  File "$(VOM)/inpaint.py", line 96, in inpaint
    inputs = to_var(inputs)
  File "$(VOM)/inpainting/utils.py", line 170, in to_var
    x = x.cuda()
RuntimeError: cuda runtime error (2) : out of memory at $(PYTORCH)/aten/src/THC/generic/THCStorage.cu:58
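For a rough sense of scale on the `x.cuda()` failure above, the sketch below estimates how many bytes a dense batch occupies before it is copied to the GPU. The helper name `tensor_bytes` and the example shape are hypothetical, chosen only for illustration. The real batch shape built in inpaint.py is not shown in this report and may differ. The sketch assumes 4-byte float32 elements.

```python
from functools import reduce
from operator import mul


def tensor_bytes(shape, bytes_per_elem=4):
    """Bytes needed for a dense tensor of the given shape.

    bytes_per_elem defaults to 4, i.e. float32 (an assumption).
    """
    return reduce(mul, shape, 1) * bytes_per_elem


# Hypothetical shape: 1 clip, 3 channels, 32 frames, 512x512 pixels.
example_shape = (1, 3, 32, 512, 512)
size = tensor_bytes(example_shape)
print(f"{size} bytes = {size / 2**20:.0f} MiB")
```

If an estimate like this comes out small next to the 6 GB of video memory, the input batch alone is probably not what exhausts the card, and memory already held by the model or its intermediate results would be the next thing to check.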

The Quadro P2000 machine fails the inpainting part of the data/Human6 test with:

ERROR: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).
Traceback (most recent call last):
  File "demo.py", line 15, in <module>
    inpaint(args)
  File "$(VOM)/inpaint.py", line 74, in inpaint
    for seq, (inputs, masks, info) in enumerate(DTloader):
  File "$(PYTORCH)/utils/data/dataloader.py", line 280, in __next__
    idx, batch = self._get_batch()
  File "$(PYTORCH)/utils/data/dataloader.py", line 259, in _get_batch
    return self.data_queue.get()
  File "/usr/lib64/python3.7/multiprocessing/queues.py", line 352, in get
    res = self._reader.recv_bytes()
  File "/usr/lib64/python3.7/multiprocessing/connection.py", line 216, in recv_bytes
    buf = self._recv_bytes(maxlength)
  File "/usr/lib64/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/lib64/python3.7/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
  File "$(PYTORCH)/utils/data/dataloader.py", line 178, in handler
    _error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid

Insufficient RAM, I guess? Any insight into what part is so memory-intensive, and what could be done about it?
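One way to check the shared-memory guess on a Linux machine is to look at how much space the shm mount has. The sketch below uses only the Python standard library. The helper name `shm_usage` is hypothetical, and `/dev/shm` is assumed to be the mount point, which is typical on Linux but not guaranteed.

```python
import os
import shutil


def shm_usage(path="/dev/shm"):
    """Return (total, used, free) in bytes for the shared-memory mount.

    Returns None if the path does not exist on this system.
    """
    if not os.path.isdir(path):
        return None
    usage = shutil.disk_usage(path)
    return usage.total, usage.used, usage.free


result = shm_usage()
if result is None:
    print("no /dev/shm on this system")
else:
    total, used, free = result
    print(f"shm total={total / 2**20:.0f} MiB, free={free / 2**20:.0f} MiB")
```

Separately, PyTorch's `DataLoader` accepts `num_workers=0`, which loads batches in the main process instead of receiving them from worker processes. Whether inpaint.py exposes that setting is not known from this report.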

Assuming these problems are surmountable...do you know if the algorithm is amenable to removing something that doesn't appear in the first frame, and that fades in/out? My first intended project is to remove the credit text from this video.

Thank you for any insights into these issues!