Hi @r1d1shka
It looks like a race condition or a memory object lifetime issue. May I ask what the reason behind this code is? I'm just curious.
```python
cuda.cudart.cudaSetDeviceFlags(cuda.cudart.cudaDeviceScheduleBlockingSync)
cuda.cudart.cudaDeviceSynchronize()
```
Also, there's a Surface > Tensor conversion which you can follow:
https://github.com/NVIDIA/VideoProcessingFramework/blob/3347e555ed795ba7de98b4e6b9bf7fe441784663/samples/SampleTorchResnet.py#L1129-L1136
Besides that, you can convert your tensor back to an NV12 Surface and feed it to PyNvEncoder
instead of tossing frames between RAM and vRAM and doing CPU color conversion:
```python
process = ffmpeg.input('pipe:',
                       format='rawvideo',
                       pix_fmt='rgb24',
                       s='{}x{}'.format(frames[0].shape[2], frames[0].shape[1]))\
    .output(path, pix_fmt='yuv420p', vcodec='libx264', crf=1)\
    .overwrite_output()\
    .run_async(pipe_stdin=True, quiet=False)
```
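For comparison, here is a rough sketch of what the all-GPU encode side could look like, assuming your frame is already in an RGB_PLANAR Surface on the GPU (like the rgb24_planar you mention further down). This is a sketch only: the sizes and preset are made-up example values, and whether the RGB_PLANAR > YUV420 > NV12 chain is available as direct conversions depends on your VPF build.

```python
import numpy as np
import PyNvCodec as nvc

gpu_id, w, h = 0, 1920, 1080  # example values, match them to your frames

# GPU-side color conversion chain: RGB_PLANAR -> YUV420 -> NV12
# (which direct conversions are supported depends on the VPF version).
to_yuv = nvc.PySurfaceConverter(w, h, nvc.PixelFormat.RGB_PLANAR, nvc.PixelFormat.YUV420, gpu_id)
to_nv12 = nvc.PySurfaceConverter(w, h, nvc.PixelFormat.YUV420, nvc.PixelFormat.NV12, gpu_id)
cc_ctx = nvc.ColorspaceConversionContext(nvc.ColorSpace.BT_601, nvc.ColorRange.MPEG)

nv_enc = nvc.PyNvEncoder({"codec": "h264", "preset": "hq", "s": f"{w}x{h}"}, gpu_id)
packet = np.ndarray(shape=(0), dtype=np.uint8)

def encode_rgb_planar_surface(surf_rgb_pln, out_file) -> bool:
    """Encode one RGB_PLANAR Surface without leaving vRAM."""
    surf_yuv = to_yuv.Execute(surf_rgb_pln, cc_ctx)
    surf_nv12 = to_nv12.Execute(surf_yuv, cc_ctx)
    if nv_enc.EncodeSingleSurface(surf_nv12, packet):
        out_file.write(bytearray(packet))  # raw Annex.B H.264 chunk
        return True
    return False  # the encoder may buffer frames; remember to flush it at the end
```

The resulting elementary stream can be muxed into a container with ffmpeg afterwards, so the per-frame path never touches host memory.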
P.S. May I ask what the use case is? You rarely see a zebra on a web camera )))
Thanks for the reply, Roman. About
```python
cuda.cudart.cudaSetDeviceFlags(cuda.cudart.cudaDeviceScheduleBlockingSync)
cuda.cudart.cudaDeviceSynchronize()
```
-- this is just an attempt to solve the synchronization problem. With or without these lines, the result is broken.
About
```python
surf_plane = rgb24_planar.PlanePtr()
img_tensor = pnvc.makefromDevicePtrUint8(
    surf_plane.GpuMem(),
    surf_plane.Width(),
    surf_plane.Height(),
    surf_plane.Pitch(),
    surf_plane.ElemSize(),
)
```
Thank you, this allows me to make the code more compact and clearer, but the result is still broken.
About
```python
process = ffmpeg.input('pipe:',
                       format='rawvideo',
                       pix_fmt='rgb24',
                       s='{}x{}'.format(frames[0].shape[2], frames[0].shape[1]))\
    .output(path, pix_fmt='yuv420p', vcodec='libx264', crf=1)\
    .overwrite_output()\
    .run_async(pipe_stdin=True, quiet=False)
```
Thank you again, but writing is just for debugging purposes. I can use OpenCV's imwrite for every frame, like this:
```python
import cv2 as cv

t = tensor.permute(1, 2, 0)           # CHW -> HWC
cpu_data = t.detach().cpu().numpy()   # copy to host memory
cv.imwrite(path, cpu_data)
```
but it doesn't help to solve the synchronization problem...
About your question about the zebra -- I took these pictures just for fun :)
Hi @r1d1shka
Coming back to the original topic -- based on your code snippet, it looks like your project does the same thing as https://github.com/NVIDIA/VideoProcessingFramework/blob/master/samples/SamplePyTorch.py. Did you try it?
Yep, the sample works fine. And I finally found the mistake. This code needs to be fixed:
```python
rgb24_planar = self.to_pln.Execute(rgb24_small, self.cc_ctx)
if rgb24_planar.Empty():
    raise RuntimeError("Cannot convert rgb to plain")
```
like this (by adding the magic Clone()):
```python
rgb24_planar = self.to_pln.Execute(rgb24_small, self.cc_ctx)
if rgb24_planar.Empty():
    raise RuntimeError("Cannot convert rgb to plain")
rgb24_planar = rgb24_planar.Clone()
```
Currently, I can't understand why this fix works. Do you have any ideas?)
Hi @r1d1shka
There's nothing magical to it ))
The color converter class instance self.to_pln allocates memory for just one output frame to reduce the vRAM footprint.
If you want a deep copy you need to clone it; otherwise your variable will simply reference one and the same Surface, which actually belongs to the color converter. Hence the shifts in the areas with movement.
To the best of my knowledge, there's no way to work around this, because pybind11 relies on shared_ptr or unique_ptr to the actual C++ class instances for memory management. That's the reason behind the Clone method, which gives you a deep copy managed by the Python interpreter, not by the underlying C++ libraries.
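A minimal illustration of the aliasing, using the converter and color conversion context from your snippet (frame_a and frame_b here are just placeholders for two consecutive decoded RGB surfaces, and the sizes are example values):

```python
import PyNvCodec as nvc

gpu_id, w, h = 0, 1920, 1080  # example values
to_pln = nvc.PySurfaceConverter(w, h, nvc.PixelFormat.RGB, nvc.PixelFormat.RGB_PLANAR, gpu_id)
cc_ctx = nvc.ColorspaceConversionContext(nvc.ColorSpace.BT_601, nvc.ColorRange.MPEG)

# frame_a, frame_b: two consecutive RGB Surfaces coming out of the decode chain.
ref_a = to_pln.Execute(frame_a, cc_ctx)   # reference to the converter's own output Surface
ref_b = to_pln.Execute(frame_b, cc_ctx)   # the second call overwrites that same Surface
# ref_a and ref_b now alias the same vRAM, so frame_a's pixels are gone.

copy_a = to_pln.Execute(frame_a, cc_ctx).Clone()  # deep copy owned by the Python side
copy_b = to_pln.Execute(frame_b, cc_ctx).Clone()  # independent allocation
# copy_a still holds frame_a's pixels after the next Execute() call.
```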
If you have any ideas on how to improve this memory management behavior, please LMK -- I'd be happy to improve the codebase.
Thank you very much, I think the issue can be closed.
Hi @r1d1shka I can't close it because I don't have moderator access, but I assume you can do it as the originator.
Hi!
I'm trying to implement a Python tool for reading/writing video, but I've run into the fact that, when reading, there is a color shift at the edges of moving objects. For example:
Should be:
Full input/output videos link: https://drive.google.com/drive/folders/14Ja-pkASKReuD_OVh2ARR0FhGB0lTT8x?usp=sharing
Full test code:
Interestingly, if you add a pause when reading frames, the problem disappears (like this):
I'm working on Ubuntu 20.04 with a Conda environment. Some GPU information: