elsampsa / valkka-core

Valkka - Create video surveillance, management and analysis programs with PyQt
GNU Lesser General Public License v3.0

Latency using SharedMemRingBufferRGB on Livefeed #4

Closed bobbych closed 5 years ago

bobbych commented 5 years ago

@elsampsa we are experiencing latency issues with a live RTSP feed (1080p, 2 fps) when trying to pull frames on the client side using SharedMemRingBufferRGB

Filtergraph

(LiveThread:livethread) --> {InfoFrameFilter:live_out_filter} -->> (AVThread:avthread) --> {SwsScaleFrameFilter:sws_scale_filter} --> {RGBShmemFrameFilter:shmem_filter}
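For reference, the server side for a chain like this is set up roughly as follows (a sketch following the Valkka tutorial; the RTSP address, shmem name and buffer count below are placeholders, not our actual values):

from valkka.core import *

width, height = 1920, 1080
shmem_name = "shmem_example"   # placeholder shmem id, must match the client
shmem_buffers = 10             # placeholder ring-buffer cell count, must match the client

# construct the filterchain from end to beginning
shmem_filter = RGBShmemFrameFilter(shmem_name, shmem_buffers, width, height, 1000)
sws_scale_filter = SwsScaleFrameFilter("sws_scale_filter", width, height, shmem_filter)
avthread = AVThread("avthread", sws_scale_filter)
live_out_filter = InfoFrameFilter("live_out_filter", avthread.getFrameFilter())
livethread = LiveThread("livethread")

ctx = LiveConnectionContext(LiveConnectionType_rtsp, "rtsp://user:password@camera-ip", 1, live_out_filter)

livethread.startCall()
avthread.startCall()
avthread.decodingOnCall()
livethread.registerStreamCall(ctx)
livethread.playStreamCall(ctx)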

Client using SharedMemRingBufferRGB and version 0.12.0

import time
from valkka.core import *  # SharedMemRingBufferRGB, RGB24Meta, new_intp, intp_value

index_p = new_intp()
rgb_meta = RGB24Meta()
rb = SharedMemRingBufferRGB(name, buffers, width, height, 1000, False) # shmem id, buffers, w, h, timeout, False=this is a client

print("Getting shmem buffers")
shmem_list = rb.getBufferListPy()

print("reading shmem buffers")
while True:
    start = time.time()
    resp = rb.clientPullFrame(index_p, rgb_meta)
    print("time to pull frame: {}".format(time.time() - start))
    if resp:
        index = intp_value(index_p)
        timestamp = rgb_meta.mstimestamp
        isize = rgb_meta.size
        data = shmem_list[index][0:isize]

output (about 500 ms latency pulling a frame):

time to pull frame: 0.49724626541137695
time to pull frame: 0.503028154373169
time to pull frame: 0.4986841678619385

When we change the above client to use SharedMemRingBuffer and version 0.11.0, just like in the example here:

rb = SharedMemRingBuffer(self.name, self.buffers, 8 * 1024 * 1024, 1000, False)  # name, cells, bytes per cell, timeout, False=this is a client

output (about 3 ms latency pulling a frame):

time to pull frame: 0.0029745101928710938
time to pull frame: 0.0029268264770507812
time to pull frame: 0.0030117034912109375
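For completeness, with the RGB client above the pulled bytes are turned into an image array roughly like this (a sketch assuming numpy, reusing the rgb_meta and data variables from the first snippet):

import numpy as np

# reinterpret the raw RGB24 bytes as a height x width x 3 image
img = np.frombuffer(data, dtype=np.uint8).reshape(rgb_meta.height, rgb_meta.width, 3)
img = img.copy()  # copy if the frame must outlive the ring-buffer cell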
elsampsa commented 5 years ago

Just cross-checking:

Are your image width & height the same in every case? (Large images might slow things down a bit, although 0.5 secs seems excessive.)

bobbych commented 5 years ago

Image width and height are 1920x1080, and they are the same in every case.

elsampsa commented 5 years ago

Thanks for the info.

My next question is a bit off-topic: why do you need images this big? Usually for machine vision (especially for neural nets) a much smaller resolution is sufficient.

I hope you are not using these big images, interpolated from YUV => RGB with SwsScaleFrameFilter (on the CPU), for displaying them on the screen (that should be done with OpenGLThread, which does the interpolation on the GPU).
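For reference, the display case would look roughly like this (a sketch following the tutorial; the RTSP address and slot number are placeholders, and window handling details may differ between versions):

from valkka.core import *

# decoded YUV frames go straight to OpenGLThread; YUV => RGB interpolation happens on the GPU
glthread = OpenGLThread("glthread")
avthread = AVThread("avthread", glthread.getFrameFilter())
live_out_filter = InfoFrameFilter("live_out_filter", avthread.getFrameFilter())
livethread = LiveThread("livethread")

ctx = LiveConnectionContext(LiveConnectionType_rtsp, "rtsp://user:password@camera-ip", 1, live_out_filter)

glthread.startCall()
livethread.startCall()
avthread.startCall()
avthread.decodingOnCall()
livethread.registerStreamCall(ctx)
livethread.playStreamCall(ctx)

# create an X window and map slot 1 into it
window_id = glthread.createWindow()
glthread.newRenderGroupCall(window_id)
context_id = glthread.newRenderContextCall(1, window_id, 0)  # slot, window id, z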

In any case, I'll try to look at that latency issue asap.

bobbych commented 5 years ago

It is indeed for machine vision; currently our algorithms need higher-resolution images. Thank you for looking into this.

elsampsa commented 5 years ago

Good news. I'm unable to reproduce your bug. I don't get any latency.

I get "latency" of 0.02 - 0.06. Here is a typical output:

time to pull frame: 0.03588247299194336
data   :  [161 202 255 161 202 255 160 201 255 160]
width  :  1920
height :  1080
slot   :  1
time   :  1560444432847
size required :  6220800
size copied   :  6220800

If you need any further help, provide complete code samples that demonstrate the bug. They should be very short. Drag'n'drop them into this GitHub chat-chain.

(for more instructions, see here: https://github.com/elsampsa/valkka-core/issues/5 )

bobbych commented 5 years ago

Thanks for looking into this. I guess the only thing I am doing differently is that I don't have the fork on my filter on the server side. I am assuming the fork will create a new process from the live thread to the scale filter, so that the client is not accessing frames from the same process as the live filter; I guess that makes sense. However, we realized that having the client and server in the same process works as well: using asyncio, we get 30 ms latency without forking into a separate process. Thanks for your help, and for the awesome library!
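A rough sketch of how the same-process asyncio variant might look (hypothetical code, not our actual implementation; it reuses rb, index_p, rgb_meta and shmem_list from the first snippet):

import asyncio

async def pull_frames():
    loop = asyncio.get_running_loop()
    while True:
        # clientPullFrame blocks up to the ring-buffer timeout,
        # so run it in the default thread-pool executor
        resp = await loop.run_in_executor(None, rb.clientPullFrame, index_p, rgb_meta)
        if resp:
            index = intp_value(index_p)
            data = shmem_list[index][0:rgb_meta.size]
            # ... hand the frame to the analysis code here

asyncio.run(pull_frames())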

elsampsa commented 5 years ago

You're welcome. I can't comment on your issue any further, as I'm not even sure what you mean by "fork" here. ;) A system multiprocessing fork? A fork in the filterchain? It's better to draw these things with ASCII art than to explain them in words.

Said that, great to hear that it now works. :)