Open m1k1o opened 4 years ago
ffmpeg log:
vlc log:
I thought it was TCP. It certainly seems to be, going by my netstat output:
# sudo netstat -nlp | grep neolink
tcp 0 0 127.0.0.1:8554 0.0.0.0:* LISTEN 312/neolink
Are you transcoding? ffmpeg may be dropping frames because the transcoding is taking too long and the buffer maxes out. Perhaps you can try with -c:v copy.
I'm not transcoding; I'm using -c:v copy. If it is TCP, as you say, then it's weird that --rtsp-tcp and -rtsp_transport tcp cause a segmentation fault.
Yes, I think it's odd. I use -rtsp_transport tcp in my ffmpeg command without issue, and netstat says it's using TCP, not UDP. Perhaps you could check your own netstat for TCP or UDP; maybe your setup is somewhat different? Still, perhaps we should wait for thirtythreeforty :)
Do you get any picture issues using the lower-resolution subStream, rtsp://<IP>/<NAME>/subStream? (It's a new feature, so you may need to compile from master.)
@QuantumEntangledAndy it shows TCP. In that case, ffmpeg must use TCP by default. That is weird enough that I have no idea what could cause this behavior. subStream is lagging much more than mainStream, which is, again, weird. I suspect insufficient power to my camera could cause these encoding lags.
ffmpeg's default is UDP first, then TCP after a timeout. By adding -rtsp_transport tcp you are just skipping the UDP step and going straight to TCP.
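When ffmpeg is launched from a script, the transport choice can be made explicit. A minimal Python sketch, assuming a hypothetical helper name (`build_ffmpeg_cmd`) and a placeholder URL; the ffmpeg options themselves are the ones used in this thread, plus `-y` to skip the overwrite prompt:

```python
def build_ffmpeg_cmd(url, force_tcp=True, output="/dev/null"):
    """Return an ffmpeg argument list that copies the video stream untouched.

    `build_ffmpeg_cmd` is a hypothetical helper for illustration only.
    """
    cmd = ["ffmpeg", "-y"]  # -y: do not prompt if the output already exists
    if force_tcp:
        # Input option: it must appear before -i to apply to the RTSP input.
        cmd += ["-rtsp_transport", "tcp"]
    # Copy video without re-encoding, drop audio, write MPEG-TS.
    cmd += ["-i", url, "-c:v", "copy", "-an", "-f", "mpegts", output]
    return cmd


if __name__ == "__main__":
    # Placeholder URL; substitute the real camera path.
    print(" ".join(build_ffmpeg_cmd("rtsp://127.0.0.1:8554/cam/mainStream")))
```

The resulting list can be passed to `subprocess.run` as-is; leaving `force_tcp` off falls back to whatever transport ffmpeg negotiates by itself.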
An even slower subStream is very odd. What is your ffmpeg command? Could you try something simple that pipes to /dev/null so we can exclude file write speed? Like this:
ffmpeg -i rtsp://... -c:v copy -an -f mpegts /dev/null
When I'm recording 24/7 and splitting to segments, those frame corruptions are not so frequent, but not gone. Since my camera is connected directly and no transcoding is taking place, this is not acceptable.
ffmpeg -i rtsp://... -c:v copy -f segment -segment_wrap 24 -strftime 1 -segment_time 1800 -segment_format mp4 "/var/dvr/%Y-%m-%d_%H-%M-%S.mp4"
Using VLC just watching on PC (command I posted above) gives frame corruptions every 5-10sec.
Adding the camera to the Motion project does not work at all; I cannot see a single frame from the camera. I had to explicitly turn off TCP with netcam_use_tcp off, so that neolink would not crash every time it tries to connect.
Yes I can understand how this would not help with motion detection. Just thinking about possible sources since we think it's TCP.
I'm curious as to why you've gotten 807 duplicated frames in ffmpeg too.
Your encoding speed is 0.622, which means that it is not keeping up with the input (it should be 1.0 if you can keep up with the live input), and that is why you get max delay.
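To put a number on that: at a speed below 1.0 the unprocessed input grows linearly with time. A small Python sketch of the arithmetic, assuming the speed stays constant (`backlog_seconds` is a hypothetical helper, not part of ffmpeg):

```python
def backlog_seconds(speed, wall_seconds):
    """Seconds of live input left unprocessed after `wall_seconds` of real time.

    Assumes a constant processing speed, where 1.0 means real time.
    """
    # In wall_seconds of real time only speed * wall_seconds of stream is handled.
    return max(0.0, (1.0 - speed) * wall_seconds)


if __name__ == "__main__":
    # With the 0.622 reported above, one minute of running leaves
    # roughly 22.7 seconds of stream waiting in buffers.
    print(round(backlog_seconds(0.622, 60), 1))
```

Any fixed-size buffer fills eventually at that rate, which is consistent with the max delay warnings.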
I'll have a look for corruptions on my little E1 camera when I get home using your command line.
So ssh-ing in while on the bus home and trying your command.
With subStream all is well: no artifacts in the picture and no errors in the ffmpeg log.
With mainStream I start to get errors from reaching the max buffer, and then artifacts in the output.
Oh, more progress. Thinking about the TCP/UDP question, it occurred to me that the server may listen on TCP but use UDP during streaming. So I ran netstat during a stream recording (with your options) and got:
netstat -nlp | grep neolink
tcp 0 0 127.0.0.1:8554 0.0.0.0:* LISTEN 29413/./neolink
udp 704 0 0.0.0.0:43158 0.0.0.0:* 29413/./neolink
udp 0 0 0.0.0.0:43159 0.0.0.0:* 29413/./neolink
So next I added -rtsp_transport tcp to your command, and in this case neolink used TCP only. It seems that GStreamer (which is used to handle RTSP in neolink) supplies either UDP or TCP depending on what you ask for.
Also, when I used -rtsp_transport tcp I got no buffer loss and no artifacts in the output. So it seems we can use either TCP or UDP, but most clients default to UDP first, and that gets you your issue.
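The netstat check above can be scripted so it is easy to repeat while a client is connected. A minimal Python sketch that classifies the sockets owned by a process from `netstat -nlp` style output; the function name is hypothetical and the sample text is the output quoted in this thread:

```python
def transports_in_use(netstat_output, process="neolink"):
    """Return the set of protocols ('tcp', 'udp') with sockets owned by `process`."""
    protocols = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # The last column is "PID/program name"; skip lines for other programs.
        if not fields or process not in fields[-1]:
            continue
        proto = fields[0].lower()
        if proto.startswith(("tcp", "udp")):
            protocols.add(proto[:3])  # fold tcp6/udp6 into tcp/udp
    return protocols


SAMPLE = """\
tcp 0 0 127.0.0.1:8554 0.0.0.0:* LISTEN 29413/./neolink
udp 704 0 0.0.0.0:43158 0.0.0.0:* 29413/./neolink
udp 0 0 0.0.0.0:43159 0.0.0.0:* 29413/./neolink
"""

if __name__ == "__main__":
    print(sorted(transports_in_use(SAMPLE)))  # ['tcp', 'udp']
```

Seeing `udp` in the result while a stream is playing suggests the client negotiated UDP for the media even though the RTSP control port is TCP.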
Can you add -rtsp_transport tcp? Do you still get the segfault? If so, this may be something upstream in your GStreamer lib, perhaps an old version of GStreamer?
For reference, my version is
libgstrtspserver-1.0-0 (= 1.14.4-1)
If you're not compiling from source, it may be worth trying that too.
Adding -rtsp_transport tcp causes an immediate segmentation fault. I'm compiling from master using the Dockerfile.
Gst versions seem to be up-to-date:
How very odd. TCP works on my E1 but not on your 800.
Perhaps it's the 4K over the TCP but I don't know enough on this to be sure.
Should probably wait for thirtythreeforty. Sorry I couldn't help more. But at least we know that neolink has TCP. So the open question is why you're getting the segfault.
Anyway, thanks for your input and feedback. So we definitely found out that TCP is present. Maybe it's a special case of my B800 with 4K. Or maybe only H.265 with weird encoding.
Found this bug fix that should address a similar problem, where some frames are mistakenly dropped. Since it's from late 2018, it should already be present in the ffmpeg being used.
Did you ever try TCP and subStream? I'm wondering if there's some sort of limit on the server's buffer size for TCP.
Good idea. I tried that now and, surprisingly, no segfault! With TCP I can confirm that no packet loss is taking place, and recording works as expected at 7 fps (can be adjusted in camera settings). Live streaming using VLC, however, assumes that the stream is 29 fps, and while playing at this rate it soon runs out of buffer and stops.
So it seems like it's just overflowing some fixed buffer or similar.
Could you also try running the subStream through ffprobe and telling us if it's h265 or h264? We have some cameras that are 265 on the main and 264 on the sub so that may also be an issue.
This is exactly my case.
mainStream
Input #0, rtsp, from 'rtsp://neolink:8554/cam/mainStream':
Metadata:
title : Session streamed with GStreamer
comment : rtsp-server
Duration: N/A, start: 0.127133, bitrate: N/A
Stream #0:0: Video: hevc (Main), yuvj420p(pc, bt709), 3840x2160, 20 fps, 20 tbr, 90k tbn, 20 tbc
subStream
Input #0, rtsp, from 'rtsp://neolink:8554/cam/subStream':
Metadata:
title : Session streamed with GStreamer
comment : rtsp-server
Duration: N/A, start: 0.244389, bitrate: N/A
Stream #0:0: Video: h264 (High), yuv420p(progressive), 640x352, 90k tbr, 90k tbn, 180k tbc
The subStream is clearly missing fps.
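Comparing the two streams by eye gets tedious across several cameras. A rough Python sketch that pulls codec, resolution and fps (if present) out of an ffprobe "Stream" line; the regular expressions are assumptions fitted to the two lines quoted above, not a general ffprobe parser:

```python
import re

# Fitted to lines such as "Stream #0:0: Video: hevc (Main), ..., 3840x2160, 20 fps, ..."
STREAM_RE = re.compile(r"Video:\s*(?P<codec>\w+).*?(?P<width>\d{2,5})x(?P<height>\d{2,5})")
FPS_RE = re.compile(r"(?P<fps>\d+(?:\.\d+)?)\s*fps")


def parse_video_line(line):
    """Return codec, width, height and fps (None when ffprobe reports no fps)."""
    match = STREAM_RE.search(line)
    if match is None:
        return None
    fps = FPS_RE.search(line)
    return {
        "codec": match.group("codec"),
        "width": int(match.group("width")),
        "height": int(match.group("height")),
        "fps": float(fps.group("fps")) if fps else None,
    }


MAIN = "Stream #0:0: Video: hevc (Main), yuvj420p(pc, bt709), 3840x2160, 20 fps, 20 tbr, 90k tbn, 20 tbc"
SUB = "Stream #0:0: Video: h264 (High), yuv420p(progressive), 640x352, 90k tbr, 90k tbn, 180k tbc"

if __name__ == "__main__":
    print(parse_video_line(MAIN))
    print(parse_video_line(SUB))
```

For the sub stream the `fps` field comes back as `None`, matching the missing value in the ffprobe output.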
You might be able to get the FPS by increasing the probe size: -probesize 10M
Hmm, this may not work as I hope, but could you add this option to your [[cameras]] config and then try TCP on the main stream (sub will probably not work while using this option):
format = "! h265parse ! queue leaky=true ! rtph265pay name=pay0"
It will add a leaky queue that will drop data when you reach over 10485760 bytes
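To illustrate what a leaky queue does conceptually, here is a small Python model of a byte-limited queue that discards data instead of growing. It is a sketch only, not GStreamer's implementation: the class name is hypothetical, and dropping the oldest buffers first is an assumption (GStreamer's queue element can leak on either the upstream or the downstream side).

```python
from collections import deque


class LeakyByteQueue:
    """Toy model of a size-limited queue that drops old data when full."""

    def __init__(self, max_bytes=10485760):
        self.max_bytes = max_bytes
        self.buffers = deque()
        self.size = 0       # bytes currently queued
        self.dropped = 0    # number of buffers discarded so far

    def push(self, buf):
        self.buffers.append(buf)
        self.size += len(buf)
        # Leak: discard the oldest buffers until we are back under the limit.
        # The newest buffer is always kept, even if it alone exceeds the limit.
        while self.size > self.max_bytes and len(self.buffers) > 1:
            old = self.buffers.popleft()
            self.size -= len(old)
            self.dropped += 1

    def pop(self):
        buf = self.buffers.popleft()
        self.size -= len(buf)
        return buf


if __name__ == "__main__":
    q = LeakyByteQueue(max_bytes=10)
    for chunk in (b"aaaa", b"bbbb", b"cccc"):
        q.push(chunk)
    print(q.dropped, q.size, q.pop())  # 1 8 b'bbbb'
```

The trade-off is the same as in the pipeline above: memory stays bounded, but a slow consumer sees gaps in the stream rather than a growing delay.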
I think thirtythreeforty uses a B800 too, so perhaps he could try TCP and see if it segfaults when he reads this...
Even setting -probesize 10M or 30M did not get me any fps. The format you gave does not help; I'm still getting the segfault.
I see, sorry the queue didn't help. I'm also wondering whether this also fails outside of Docker. There may be a limit, active by default, on the amount of memory a thread can consume to protect against memory leaks. Not sure though, as I'm not a Docker expert.
Just saw this. I have had no issues streaming the H.265 main stream via RTSP-over-TCP in the past using exactly your setup.
Since Neolink is actually crashing, feels like Gstreamer has a bug; safe Rust is very hard to segfault. That said, Alpine has all the latest Gstreamer versions. Let me try when I get back to my workstation.
It may not be showing an FPS because, strictly speaking, it doesn't have one. Reolink seems to just be forwarding H264 frames without timestamps. I think (but am not sure) the app just shows the frame it last saw. We add a timestamp as each frame arrives so that it behaves a bit better in ffmpeg and VLC, but it's not at a specific rate. ffmpeg seems to calculate a variable FPS based on the timestamps, though; I'm surprised VLC isn't doing that.
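As an illustration of how a rate can be derived from arrival timestamps alone, here is a minimal Python sketch. It is not how ffmpeg or VLC do it internally; `estimate_fps` is a hypothetical helper that simply averages over the observed window:

```python
def estimate_fps(arrival_times):
    """Average frames per second over a list of arrival timestamps (in seconds).

    Returns None when fewer than two frames, or no elapsed time, are available.
    """
    if len(arrival_times) < 2:
        return None
    span = arrival_times[-1] - arrival_times[0]
    if span <= 0:
        return None
    # N timestamps delimit N - 1 frame intervals.
    return (len(arrival_times) - 1) / span


if __name__ == "__main__":
    # Five frames arriving half a second apart.
    print(estimate_fps([0.0, 0.5, 1.0, 1.5, 2.0]))  # 2.0
```

Because the timestamps are taken on arrival, network jitter shows up directly in the estimate, which is one reason the reported rate can vary.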
Hmmm in your docker image you could run ulimit -a as the same user that runs neolink to see if there's any mem limits set up. Unless you run it as root in which case no need to check as there's no limit.
Memory size does not seem to be limited.
You have a stack size of 8192 KB. If it tries to allocate an array bigger than this, it may segfault.
It all depends on how it's programmed in GStreamer: whether it's a single allocation on the stack or done on the heap. Like this explains
You could test bumping it up with ulimit -s 65532, but whether you can depends on your permissions (also, this setting will be lost on shell change).
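The same limit can be read from inside the container without an interactive shell. A minimal Python sketch using the standard `resource` module (Unix only); the helper name is hypothetical, and it reports the process-wide limit, which may differ from the stack size actually given to individual threads:

```python
import resource


def stack_limit_kib():
    """Return the (soft, hard) stack limits in KiB; None means unlimited."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def to_kib(value):
        # RLIM_INFINITY is how the OS reports "no limit".
        return None if value == resource.RLIM_INFINITY else value // 1024

    return to_kib(soft), to_kib(hard)


if __name__ == "__main__":
    soft, hard = stack_limit_kib()
    print("soft:", soft, "KiB  hard:", hard, "KiB")
```

A soft limit of 8192 KiB corresponds to the `ulimit -s` value quoted above.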
Yes, you guessed correctly! I ran it through valgrind now and this is what I came across:
==18== Can't extend stack to 0x7cd1fe8 during signal delivery for thread 18:
==18== too small or bad protection modes
==18==
==18== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==18== Access not within mapped region at address 0x7CD1FE8
==18== at 0x5045CFF: gst_rtsp_watch_send_messages (in /usr/lib/libgstrtsp-1.0.so.0.1602.0)
==18== If you believe this happened as a result of a stack
==18== overflow in your program's main thread (unlikely but
==18== possible), you can try to increase the size of the
==18== main thread stack using the --main-stacksize= flag.
==18== The main thread stack size used in this run was 8388608.
==18== Thread 18 queue6:src:
==18== Invalid write of size 8
==18== at 0x489713F: _vgnU_freeres (vg_preloaded.c:83)
==18== Address 0x7cd2e80 is on thread 18's stack
==18==
==18==
==18== Process terminating with default action of signal 11 (SIGSEGV)
==18== Bad permissions for mapped region at address 0x7CD2E80
==18== at 0x489713F: _vgnU_freeres (vg_preloaded.c:83)
==18==
==18== HEAP SUMMARY:
==18== in use at exit: 4,339,957 bytes in 23,735 blocks
==18== total heap usage: 103,815 allocs, 80,080 frees, 157,144,079 bytes allocated
==18==
==18== LEAK SUMMARY:
==18== definitely lost: 16,384 bytes in 1 blocks
==18== indirectly lost: 0 bytes in 0 blocks
==18== possibly lost: 5,446 bytes in 68 blocks
==18== still reachable: 4,121,391 bytes in 22,794 blocks
==18== of which reachable via heuristic:
==18== length64 : 792 bytes in 18 blocks
==18== newarray : 1,664 bytes in 24 blocks
==18== suppressed: 0 bytes in 0 blocks
==18== Rerun with --leak-check=full to see details of leaked memory
==18==
==18== For lists of detected and suppressed errors, rerun with: -s
==18== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
After setting unlimited stack size in docker-compose using this:
ulimits:
stack: -1
I got different error:
==10==
==10== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==10== Bad permissions for mapped region at address 0x56BC7D0
==10== at 0x5042784: ??? (in /usr/lib/libgstrtsp-1.0.so.0.1602.0)
==10==
==10== HEAP SUMMARY:
==10== in use at exit: 4,302,518 bytes in 22,864 blocks
==10== total heap usage: 140,521 allocs, 117,657 frees, 381,035,279 bytes allocated
==10==
==10== LEAK SUMMARY:
==10== definitely lost: 16,384 bytes in 1 blocks
==10== indirectly lost: 0 bytes in 0 blocks
==10== possibly lost: 5,446 bytes in 68 blocks
==10== still reachable: 4,083,952 bytes in 21,923 blocks
==10== of which reachable via heuristic:
==10== length64 : 792 bytes in 18 blocks
==10== newarray : 1,664 bytes in 24 blocks
==10== suppressed: 0 bytes in 0 blocks
==10== Rerun with --leak-check=full to see details of leaked memory
==10==
==10== Use --track-origins=yes to see where uninitialised values come from
==10== For lists of detected and suppressed errors, rerun with: -s
==10== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
Nicely spotted @QuantumEntangledAndy. For the new error, Bad permissions for mapped region means "modifying a const." It's possible that Neolink is passing a const which should not be const. Is it happy if you run it without Valgrind?
Nope, Segmentation fault (core dumped).
Detailed trace from valgrind:
==10== Thread 11 queue2:src:
==10== Invalid write of size 8
==10== at 0x5042784: ??? (in /usr/lib/libgstrtsp-1.0.so.0.1602.0)
==10== by 0x497254F: ???
==10== by 0xD3E6284C00000000: ???
==10== by 0x49725DF: ???
==10== by 0xB: ???
==10== by 0x12: ???
==10== by 0xB6E2AAAC0BB73603: ???
==10== by 0x829792024BE333DF: ???
==10== by 0xB6D78A908A81B36B: ???
==10== by 0xCBE0B3E9610555F4: ???
==10== by 0xCE3759897C8E5725: ???
==10== by 0xAE239180C7770BB5: ???
==10== Address 0x7fc5570 is on thread 11's stack
==10==
==10== Invalid write of size 8
==10== at 0x5042792: ??? (in /usr/lib/libgstrtsp-1.0.so.0.1602.0)
==10== by 0x497254F: ???
==10== by 0xD3E6284C00000000: ???
==10== by 0x49725DF: ???
==10== by 0xB: ???
==10== by 0x12: ???
==10== by 0xB6E2AAAC0BB73603: ???
==10== by 0x829792024BE333DF: ???
==10== by 0xB6D78A908A81B36B: ???
==10== by 0xCBE0B3E9610555F4: ???
==10== by 0xCE3759897C8E5725: ???
==10== by 0xAE239180C7770BB5: ???
==10== Address 0x7fc5578 is on thread 11's stack
==10==
==10== Invalid write of size 8
==10== at 0x5042877: ??? (in /usr/lib/libgstrtsp-1.0.so.0.1602.0)
==10== by 0x497254F: ???
==10== by 0xD3E6284C00000000: ???
==10== by 0x49725DF: ???
==10== by 0xB: ???
==10== by 0x12: ???
==10== by 0xB6E2AAAC0BB73603: ???
==10== by 0x829792024BE333DF: ???
==10== by 0xB6D78A908A81B36B: ???
==10== by 0xCBE0B3E9610555F4: ???
==10== by 0xCE3759897C8E5725: ???
==10== by 0xAE239180C7770BB5: ???
==10== Address 0x7fc5580 is on thread 11's stack
==10==
==10== Invalid write of size 8
==10== at 0x5042885: ??? (in /usr/lib/libgstrtsp-1.0.so.0.1602.0)
==10== by 0x497254F: ???
==10== by 0xD3E6284C00000000: ???
==10== by 0x49725DF: ???
==10== by 0xB: ???
==10== by 0x12: ???
==10== by 0xB6E2AAAC0BB73603: ???
==10== by 0x829792024BE333DF: ???
==10== by 0xB6D78A908A81B36B: ???
==10== by 0xCBE0B3E9610555F4: ???
==10== by 0xCE3759897C8E5725: ???
==10== by 0xAE239180C7770BB5: ???
==10== Address 0x7fc5588 is on thread 11's stack
==10==
==10==
==10== Process terminating with default action of signal 11 (SIGSEGV): dumping core
==10== Bad permissions for mapped region at address 0x7FA4028
==10== at 0x4D71D48: gst_memory_map (in /usr/lib/libgstreamer-1.0.so.0.1602.0)
==10== by 0x504284F: ??? (in /usr/lib/libgstrtsp-1.0.so.0.1602.0)
==10== by 0x497254F: ???
==10== by 0xD3E6284C00000000: ???
==10== by 0x49725DF: ???
==10== by 0xB: ???
==10== by 0x12: ???
==10== by 0xB6E2AAAC0BB73603: ???
==10== by 0x829792024BE333DF: ???
==10== by 0xB6D78A908A81B36B: ???
==10== by 0xCBE0B3E9610555F4: ???
==10== by 0xCE3759897C8E5725: ???
==10==
==10== HEAP SUMMARY:
==10== in use at exit: 6,644,756 bytes in 23,723 blocks
==10== total heap usage: 61,289 allocs, 37,566 frees, 98,423,051 bytes allocated
==10==
==10== LEAK SUMMARY:
==10== definitely lost: 16,384 bytes in 1 blocks
==10== indirectly lost: 0 bytes in 0 blocks
==10== possibly lost: 5,438 bytes in 68 blocks
==10== still reachable: 6,419,798 bytes in 22,773 blocks
==10== of which reachable via heuristic:
==10== length64 : 712 bytes in 16 blocks
==10== newarray : 1,664 bytes in 24 blocks
==10== suppressed: 0 bytes in 0 blocks
==10== Rerun with --leak-check=full to see details of leaked memory
==10==
==10== For lists of detected and suppressed errors, rerun with: -s
==10== ERROR SUMMARY: 125 errors from 7 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
I'm just thinking out loud here, but perhaps we should limit the maximum amount we push into the appsrc? Rather than pushing it all at once with app_src.push_buffer(gst_buf), we would split gst_buf into chunks of fixed size and push those in a loop.
I'm not sure if gstreamer expects a whole frame per input though.
Although the BC format is in 40 KB chunks. If neolink just passes those on, then the problem is not here. Hmmm, why is GStreamer failing with large frames? Surely it can handle large 4K video.
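The splitting idea itself is simple; whether appsrc tolerates partial frames is exactly the open question here. A language-neutral illustration in Python (neolink itself is Rust, and `split_into_chunks` is a hypothetical helper, not neolink code):

```python
def split_into_chunks(data, chunk_size):
    """Split a bytes object into consecutive pieces of at most `chunk_size` bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


if __name__ == "__main__":
    frame = bytes(100)  # stand-in for one large frame
    chunks = split_into_chunks(frame, 40)
    print([len(c) for c in chunks])  # [40, 40, 20]
```

In the real code each chunk would be pushed in a loop in place of the single push of the whole buffer; concatenating the chunks in order reproduces the original data exactly.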
Hmmm, push_buffer should take ownership of the buffer, and that gets sent to GStreamer's C functions. Perhaps this isn't being moved or released correctly. Sorry, just reading through the docs and other issues people have had with GStreamer and segfaults.
It might be helpful to have the log with GST_DEBUG=9 set.
So I tried setting my stack limit to smaller sizes to try to replicate your issue. I had to go right down to 64 KB before getting issues (any lower and the Rust neolink program, not the C GStreamer code, would crash in a different place from your error). I know my E1 isn't 4K, but surely I would hit a problem before this point... hmmm
I have a similar issue with the D800. I put it down to the client I was using at the time. I have tried with VLC and get the same issues, although with the subStream it runs fine. So would it make sense that the client isn't set up to stream 4K properly? The setup runs perfectly fine in Blue Iris, though, and natively in the Reolink Client. The client I am currently using is Synology Surveillance Station, which doesn't have a buffer option, so the stream drops frames.
Not sure what has changed? But after updating the docker image. Everything is appearing smooth with no issues. Thanks!
Not for me. :(
@rygrass are you running it in Docker or on the host? What is your configuration?
@m1k1o. Just running the latest version they have on the docker hub page? Also running it on Docker ( Which is on my Synology Nas DS1515+ )
I have set up an auto-restart though as mine does crash every now and again ( every 3-5 days )
I have noticed though. If I run two streams at the same time... it crashes out. ( If I run Synology Client and load up VLC )
Otherwise been smooth sailing for a few days so far.
[2020-08-12T09:09:20Z INFO neolink] Neolink 0.3.0 (unknown commit) release
[2020-08-12T09:09:28Z INFO neolink] Leftfront: Connecting to camera at 192.168.0.91:9000
[2020-08-12T09:09:28Z INFO neolink] Backdoor: Connecting to camera at 192.168.0.93:9000
[2020-08-12T09:09:28Z INFO neolink] Garage: Connecting to camera at 192.168.0.94:9000
[2020-08-12T09:09:28Z INFO neolink] Leftfront: Connecting to camera at 192.168.0.91:9000
[2020-08-12T09:09:28Z INFO neolink] Garage: Connecting to camera at 192.168.0.94:9000
[2020-08-12T09:09:28Z INFO neolink] FrontDoor: Connecting to camera at 192.168.0.92:9000
[2020-08-12T09:09:28Z INFO neolink] FrontDoor: Connecting to camera at 192.168.0.92:9000
[2020-08-12T09:09:28Z INFO neolink] Backdoor: Connecting to camera at 192.168.0.93:9000
[2020-08-12T09:09:28Z INFO neolink] Garage: Connected and logged in
[2020-08-12T09:09:28Z INFO neolink] Garage: Connected and logged in
[2020-08-12T09:09:28Z INFO neolink] FrontDoor: Connected and logged in
[2020-08-12T09:09:28Z INFO neolink] FrontDoor: Connected and logged in
[2020-08-12T09:09:28Z INFO neolink] Backdoor: Connected and logged in
[2020-08-12T09:09:28Z INFO neolink] Backdoor: Connected and logged in
[2020-08-12T09:09:28Z INFO neolink] Backdoor: Camera time is already set: 2020-08-12 17:09:04 +8
[2020-08-12T09:09:28Z INFO neolink] Backdoor: Starting video stream subStream
[2020-08-12T09:09:28Z INFO neolink] Leftfront: Connected and logged in
[2020-08-12T09:09:28Z INFO neolink] Leftfront: Connected and logged in
[2020-08-12T09:09:28Z INFO neolink] Leftfront: Camera time is already set: 2020-08-12 17:09:03 +8
[2020-08-12T09:09:28Z INFO neolink] Leftfront: Starting video stream mainStream
[2020-08-12T09:09:28Z INFO neolink] Leftfront: Camera time is already set: 2020-08-12 17:09:03 +8
[2020-08-12T09:09:28Z INFO neolink] Leftfront: Starting video stream subStream
[2020-08-12T09:09:28Z INFO neolink] Garage: Camera time is already set: 2020-08-12 17:09:03 +8
[2020-08-12T09:09:28Z INFO neolink] Garage: Starting video stream mainStream
[2020-08-12T09:09:28Z INFO neolink] Garage: Camera time is already set: 2020-08-12 17:09:03 +8
[2020-08-12T09:09:28Z INFO neolink] Garage: Starting video stream subStream
[2020-08-12T09:09:28Z INFO neolink] Backdoor: Camera time is already set: 2020-08-12 17:09:04 +8
[2020-08-12T09:09:28Z INFO neolink] Backdoor: Starting video stream mainStream
[2020-08-12T09:09:28Z INFO neolink] FrontDoor: Camera time is already set: 2020-08-12 17:09:03 +8
[2020-08-12T09:09:28Z INFO neolink] FrontDoor: Starting video stream mainStream
[2020-08-12T09:09:28Z INFO neolink] FrontDoor: Camera time is already set: 2020-08-12 17:09:03 +8
[2020-08-12T09:09:28Z INFO neolink] FrontDoor: Starting video stream subStream
[[cameras]]
name = "Leftfront"
username = ""
password = ""
address = "192.168.0.91:9000"
stream = "both"

[[cameras]]
name = "FrontDoor"
username = ""
password = ""
address = "192.168.0.92:9000"
stream = "both"

[[cameras]]
name = "Backdoor"
username = ""
password = ""
address = "192.168.0.93:9000"
stream = "both"

[[cameras]]
name = "Garage"
username = ""
password = ""
address = "192.168.0.94:9000"
stream = "both"
I see the grey frames too. D800
Only on mainStream, not subStream.
Tested with VLC on Linux mint 20.
Same with "IPCams" app on iPhone, which is more grey than not. Dunno if this is tcp/udp or other issue.
@osos We think the gray screens are packet loss over UDP. You can usually improve them by increasing the UDP buffer size (in your client). Could you try using a TCP connection (most clients default to UDP)? We have a potential issue at the moment, though, where TCP inside Docker causes it to crash. We are not sure why.
TCP seems better; however, not all clients have options to choose the transport or buffers.
On a related note, other projects suffer the same with UDP streams, with pointers in the direction of ffmpeg / gstreamer issues: https://github.com/aler9/rtsp-simple-server/issues/38
There is nothing we can do about the client not having TCP or buffer options. The issue comes from the size of the data: 4K video needs larger buffers than the lower-resolution streams. Do you have a specific client in mind?
Switching to TCP (e.g. vlc --rtsp-tcp) causes a segmentation fault. Using only UDP causes packet loss and damaged frames, even when my B800 camera is directly connected to the Ethernet port of my PC. I often see such an image. Since my picture is in 4K, it causes huge traffic. Maybe other models with only an HD stream work well.
I was able to mitigate flickering by adjusting some parameters in VLC:
But it's definitely not gone. At night, when the image switches to grayscale only, everything works well, so it must be a bandwidth problem. I guess TCP would solve this problem, or am I missing something? How could I solve this problem?