Closed zdhernandez closed 2 years ago
What was the error?
@hkchengrex I attached pictures.
Can you try --mem_profile 2? There are also a lot of frames, so maybe you need to set --mem_freq higher, e.g., 30.
@hkchengrex That seems to work. Do you think there is a memory leak? I ask because the video I'm trying to use is only 34MB. I see what it is doing now in:
class InferenceCore:
    """
    images - leave them in original dimension (unpadded), but do normalize them.
            Should be CPU tensors of shape B*T*3*H*W

    mem_profile - How extravagant I can use the GPU memory.
            Usually more memory -> faster speed but I have not drawn the exact relation
            0 - Use the most memory
            1 - Intermediate, larger buffer
            2 - Intermediate, small buffer
            3 - Use the minimal amount of GPU memory
            Note that *none* of the above options will affect the accuracy
            This is a space-time tradeoff, not a space-performance one

    mem_freq - Period at which new memory are put in the bank
            Higher number -> less memory usage
            Unlike the last option, this *is* a space-performance tradeoff
    """
34MB is the size after compression. Raw pixels occupy much more space. The model in this branch https://github.com/hkchengrex/MiVOS/tree/MiVOS-STCN will perform better/use somewhat less memory I believe.
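As a rough illustration of why a 34MB file can still exhaust memory, the sketch below estimates the uncompressed size of a clip. The frame count, the resolutions, and the helper name raw_video_bytes are assumptions made for illustration, not measurements from the tool. Only the T*3*H*W layout is taken from the InferenceCore docstring quoted above.

```python
def raw_video_bytes(num_frames, height, width, channels=3, bytes_per_value=1):
    """Uncompressed size in bytes of a clip stored as a T*C*H*W array."""
    return num_frames * channels * height * width * bytes_per_value


GIB = 1024 ** 3

# Hypothetical clip length: 30 seconds at 30 FPS = 900 frames.
frames = 30 * 30

# Hypothetical resolutions given as (height, width).
resolutions = {"480p": (480, 854), "1080p": (1080, 1920), "4K": (2160, 3840)}

for name, (h, w) in resolutions.items():
    as_uint8 = raw_video_bytes(frames, h, w)                       # 8-bit pixels
    as_float32 = raw_video_bytes(frames, h, w, bytes_per_value=4)  # normalized float tensors
    print(f"{name}: {as_uint8 / GIB:.2f} GiB as uint8, "
          f"{as_float32 / GIB:.2f} GiB as float32")
```

This only counts the input frames. Features and memory-bank entries held on the GPU come on top of it, so treat the numbers as a lower bound rather than a prediction of actual usage.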
@hkchengrex That's the branch I'm using. So what do we need to do to control the memory? I see the YouTube videos, which are long and high resolution. What's different there? How was a high-resolution input video adapted to the tool without exhausting GPU memory? Was much more powerful hardware used, or multiple GPUs? I would like to process 30 seconds of 4K video at 30 FPS. It doesn't have to be 4K; it can be 1080p. Still 30 seconds at 30 FPS.
None of the YouTube videos that we show are that long (<500 frames). We just downsample high-res videos to 480p (probably like what you did). We used a 2080Ti (11GB), and that should be sufficient for all the examples shown.
For your setting, I would suggest a higher mem_freq and using 480p videos to generate the masks (upsample afterward).
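To make the effect of mem_freq concrete, here is a minimal sketch that counts how many frames would end up in the memory bank. It assumes exactly one frame is stored every mem_freq frames, which is my reading of the docstring ("Period at which new memory are put in the bank"); the real implementation may also store other frames, and memory_frame_indices is a hypothetical helper, not part of the tool.

```python
def memory_frame_indices(num_frames, mem_freq):
    """Indices of frames added to the memory bank.

    Assumption: one frame is stored every `mem_freq` frames, starting at 0.
    """
    return list(range(0, num_frames, mem_freq))


# 30 seconds at 30 FPS, the clip length mentioned in this thread.
num_frames = 30 * 30

for mem_freq in (5, 10, 30):
    stored = len(memory_frame_indices(num_frames, mem_freq))
    print(f"mem_freq={mem_freq}: {stored} frames kept in the memory bank")
```

Under this assumption the bank size shrinks in proportion to mem_freq, which is why a higher value lowers memory usage at some cost in segmentation quality.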
@hkchengrex I see. So: a higher mem_freq, with the video downsampled for MiVOS + STCN processing, and once that finishes, the masks are upsampled as post-processing to match the dimensions of the original 1080p or 4K video. Does that sound right?
Yes.
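The downsample-then-upsample workflow confirmed above could be sketched as follows. This is a dependency-free illustration, not the tool's own code: scaled_size and upsample_mask_nearest are hypothetical helpers, and rounding the width to an even number is an assumption made because many video encoders expect even dimensions. Nearest-neighbour is used for the masks so that object IDs are copied rather than blended.

```python
def scaled_size(height, width, target_height=480):
    """Frame size after scaling to `target_height`, keeping the aspect ratio.

    The width is rounded to the nearest even number (an assumption here).
    """
    scale = target_height / height
    new_width = int(round(width * scale / 2)) * 2
    return target_height, new_width


def upsample_mask_nearest(mask, out_height, out_width):
    """Nearest-neighbour upsampling of a 2D label mask given as a list of rows."""
    in_height, in_width = len(mask), len(mask[0])
    rows = []
    for y in range(out_height):
        # Pick the source row that this output row falls into.
        src_row = mask[y * in_height // out_height]
        rows.append([src_row[x * in_width // out_width] for x in range(out_width)])
    return rows


# Size to downsample a 4K frame to before generating masks.
print(scaled_size(2160, 3840))

# Tiny mask with object IDs 1 and 2, upsampled back to a larger frame.
small_mask = [[0, 1],
              [2, 0]]
for row in upsample_mask_nearest(small_mask, 4, 4):
    print(row)
```

In practice a library resize with nearest-neighbour interpolation would do the same job much faster; the pure-Python loop is only meant to show the index mapping.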
@hkchengrex got it! Thank you very much!
I tried MiVOS + STCN on a 1.5-minute 4K video that was downsampled to 480p, and the program crashed.
What are the steps to reformat/resample a 4K video to make it work with this tool?
Also, can this tool run on multiple GPUs?