natethegreate / hent-AI

Automation of censor bar detection
MIT License

Cannot decensor large video files #12

Closed 13579resu closed 4 years ago

13579resu commented 4 years ago

While testing the decensoring of a large video file (906MB, length: 22:16), I ran into an issue where the program stops working and throws an error that it has reached a memory limit of about 450MB. Previous testing with a much lighter file (7.25MB, length: 0:27) showed that the program does work as intended (provided you use a workaround for issue #11).

This is far below what my machine should be capable of handling; I would have expected it to manage at least a few gigabytes, given that I have 32GB of RAM and 11GB of VRAM.

PC specs:

CPU-Z full PC specs: specs.txt

Here's the relevant commandline output:

2020-06-29 14:05:00.125065: I tensorflow/core/common_runtime/bfc_allocator.cc:816] Sum Total of in-use chunks: 437.27MiB
2020-06-29 14:05:00.127624: I tensorflow/core/common_runtime/bfc_allocator.cc:818] total_region_allocated_bytes_: 458508544 memory_limit_: 458508696 available bytes: 152 curr_region_allocation_bytes_: 917017600
2020-06-29 14:05:00.131488: I tensorflow/core/common_runtime/bfc_allocator.cc:824] Stats:
Limit:                   458508696
InUse:                   458508544
MaxInUse:                458508544
NumAllocs:                     908
MaxAllocSize:             51380224

2020-06-29 14:05:00.135537: W tensorflow/core/common_runtime/bfc_allocator.cc:319] ****************************************************************************************************
2020-06-29 14:05:00.138928: W tensorflow/core/framework/op_kernel.cc:1479] OP_REQUIRES failed at constant_op.cc:77 : Resource exhausted: OOM when allocating tensor of shape [] and type float
2020-06-29 14:05:00.141894: E tensorflow/core/common_runtime/executor.cc:641] Executor failed to create kernel. Resource exhausted: OOM when allocating tensor of shape [] and type float
         [[{{node roi_align_mask_1/truediv/x}}]]

Full output: output.txt

natethegreate commented 4 years ago

Hello,

Try to close out as many processes as possible that are consuming VRAM. I recommend segmenting the video into clips, specifically clips with visible mosaics. The resolution also has an effect, so try lower-resolution videos as well.
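
For reference, here's a minimal sketch of how the splitting could be automated with ffmpeg (assuming ffmpeg is installed and on PATH; the file names and clip length are placeholders, not part of hent-AI):

```python
# Minimal sketch: cut a long video into fixed-length clips with ffmpeg.
# Stream copy (-c copy) avoids re-encoding, so cuts land on keyframes
# and clip lengths are approximate.
import subprocess

def split_video(src, clip_seconds=60, out_pattern="clip_%03d.mp4"):
    subprocess.run([
        "ffmpeg", "-i", src,
        "-c", "copy", "-map", "0",
        "-f", "segment",
        "-segment_time", str(clip_seconds),
        "-reset_timestamps", "1",
        out_pattern,
    ], check=True)

split_video("input.mp4")
```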

13579resu commented 4 years ago

Hi,

Splitting the video into segments works; it's just a pain in the ass to work with. Even then, a limit of only 450MB seems unjustified; there should be an option to increase it.
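
For illustration, a hedged sketch of what such an option might look like with the TF1-style session API the logs point to (where exactly hent-AI builds its session is an assumption on my part):

```python
# Sketch only: standard tf.compat.v1 knobs for the GPU memory cap.
import tensorflow as tf

config = tf.compat.v1.ConfigProto()
config.gpu_options.allow_growth = True                    # grow VRAM use as needed
config.gpu_options.per_process_gpu_memory_fraction = 0.9  # or cap at ~90% of VRAM
session = tf.compat.v1.Session(config=config)
```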

Also, this was the only program consuming a noticeable amount of VRAM at the time of testing.

natethegreate commented 4 years ago

Can you view the Task Manager's Performance tab and see what the dedicated GPU memory usage is? My 2060 Super uses 6.7 to 6.8GB of VRAM.

Keep in mind that loading and propagating any frame or image through a convolutional neural net takes a massive amount of memory; that is not something I can change. Typical research GPUs have up to 12-16GB of VRAM available. This is why it is essential that you close all possible background apps.
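
As a rough back-of-envelope (illustrative numbers, not hent-AI's exact architecture), even a single frame balloons once it hits the early convolutional layers:

```python
# Back-of-envelope activation memory for one frame (float32 = 4 bytes).
frame = 1024 * 1024 * 3 * 4    # 1024x1024 RGB input: ~12 MiB
conv1 = 512 * 512 * 64 * 4     # one 64-channel early feature map: 64 MiB
print(f"input: {frame / 2**20:.0f} MiB, one feature map: {conv1 / 2**20:.0f} MiB")
# Dozens of such maps, plus the model weights and cuDNN workspace
# buffers, are what eat up the VRAM.
```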

Otherwise, if it's not the VRAM allocation, it could be some general memory allocation issue with the long video. Because every frame of the video must be processed (a 22:16 video at, say, 30 fps is roughly 40,000 frames), I highly recommend trimming the video to avoid making the AI waste processing on uncensored clips.

13579resu commented 4 years ago

Idle GPU usage: Screenshot (1319)

GPU usage while decensoring: Screenshot (1320)

Here's what copying the performance tab outputs:

GPU 0

    NVIDIA GeForce RTX 2080 Ti

    Driver version: 26.21.14.4166
    Driver date:    06/12/2019
    DirectX version:    12 (FL 12.1)
    Physical location:  PCI bus 1, device 0, function 0

    Utilisation 4%
    Dedicated GPU memory    1,8/11,0 GB
    Shared GPU memory   0,1/16,0 GB
    GPU Memory  1,8/27,0 GB
13579resu commented 4 years ago

The frustrating thing about this is that the error occurs after several hours, somewhere in the middle of the process. There isn't even a hint about what those 450MB refer to, since neither the GPU nor the RAM is particularly busy with this task.

natethegreate commented 4 years ago

This is why you should trim the clips (Windows has a built-in video trimmer): many of those hours are spent trying to decensor scenes that have nothing to censor. I don't know why your utilization is so low, but there are two phases to decensoring:

The first phase uses ESRGAN and performs resizing to calculate a decensored approximation, then the second phase determines what area needs to be decensored. The first phase might be running on your CPU, as there are some issues between this old ESRGAN architecture, CUDA, and the RTX Turing architecture.

The second phase uses MaskRCNN, which I have optimized better for GPU usage.

In either case, try out the Google Colab notebook for videos; you may get assigned a Tesla P100, which is quite powerful. If the ESRGAN workaround issue is still present, you can still edit the code from the Colab.
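
If you want to see what you got assigned, a quick check (nvidia-smi comes preinstalled on Colab GPU runtimes):

```python
# List the GPU(s) the Colab runtime assigned.
import subprocess

out = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
print(out.stdout)  # e.g. "GPU 0: Tesla P100-PCIE-16GB (UUID: ...)"
```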

13579resu commented 4 years ago

Well, I guess Colab doesn't like me: Screenshot (1321)

13579resu commented 4 years ago

Just to be clear: I have already done the decensoring by splitting the video, with both ESRGAN and DCP. DCP has better results but is more time-consuming to work with. It's just an annoyance to not be able to put the whole thing in and simply wait.

13579resu commented 4 years ago

In case you're interested in the results: https://filetransfer.io/data-package/at4Ir1Kq

13579resu commented 4 years ago

It seems that it is phase 2 where the GPU really has to work: Screenshot (1323)

GPU 0

    NVIDIA GeForce RTX 2080 Ti

    Driver version: 26.21.14.4166
    Driver date:    06/12/2019
    DirectX version:    12 (FL 12.1)
    Physical location:  PCI bus 1, device 0, function 0

    Utilisation 5%
    Dedicated GPU memory    10,6/11,0 GB
    Shared GPU memory   1,0/16,0 GB
    GPU Memory  11,6/27,0 GB

As it seems, this works fine for small files, but not for large ones.

13579resu commented 4 years ago

Apparently I was wrong after all; it did finish the task with a large file: Screenshot (1325)

Still, this raises the question of what the error on my first attempt was about, and whether it occurs randomly. I have no clue what those 450 MB refer to.