QuantumEntangledAndy / neolink

An RTSP bridge to Reolink IP cameras
GNU Affero General Public License v3.0

Memory usage 0.6.2 #157

Closed jamesahendry closed 9 months ago

jamesahendry commented 9 months ago

Can I get you to check the memory growth in the scenario where you access both the substream and the mainstream? I use this in Frigate, BTW. Some of my cams detect on the mainstream only because of the field of view, and I've cut the cam up into different zones within Frigate. On other cams I use the substream to detect and the mainstream to record. I am seeing vastly different memory growth when I access both the sub and mainstreams compared to the mainstream only (which is 4K). Memory for the 4K-only stream is very static and very consistent on 0.6.2. The sub/main combination grows consistently, and I reboot the Bullseye container when it gets close to 1GB. 1 cam per container.

Here's the difference yesterday. I have 3 cams running mainstream only and 3 cams running sub/main, and the results match the scenarios described every time. There's also an auto reboot at circa 4AM due to the memory issues. I could isolate this by spinning up a new LXC and splitting the sub/main to test, if you want? In this scenario the driveway cam is mainstream only and courtyard is sub/main. As you can see, the mainstream-only cam is static at about 240MB of memory, while the cams that use both the sub and mainstream are consistently growing.

image image

QuantumEntangledAndy commented 9 months ago

The biggest memory issue left after 0.6.2 is in gstreamer. Gstreamer has its own internal set of buffers, and those grow the more streams you have. This build here (if I can get Windows to build) contains some stricter limits on the gstreamer buffer that should fix this.

jamesahendry commented 9 months ago

Thanks so much @QuantumEntangledAndy - I only need the Debian Bullseye build! The LXCs have all had an apt update/upgrade run.

QuantumEntangledAndy commented 9 months ago

Well, it is built now, so please try it. (It just needed a retry, since Windows will randomly fail to download the dependencies sometimes.)

jamesahendry commented 9 months ago

All good - I'll sort this out when I get home later tonight. I'm in Melbourne / AU.

jamesahendry commented 9 months ago

Just got home and installed the version linked in this thread. Memory usage is hard to comment on yet; however, I'm not getting a clear feed. This is replicated in VLC. If you look at things like the time, which should consistently count up second by second, it seems to skip frames and then produce "snow in the picture". I haven't upgraded every cam, just Frontdoor and Courtyard.

Frontdoor is mainstream only and Courtyard is substream and mainstream.

image

image

QuantumEntangledAndy commented 9 months ago

Hard to say for sure without me looking deeper into it. It may have something to do with the memory limits on the buffers filling and it dropping frames.

Can you estimate the bitrate you are getting? I'll try to work out what size buffers we need for about 15s of stream.

QuantumEntangledAndy commented 9 months ago

P.s. my time zone is +7 so not too much difference.

jamesahendry commented 9 months ago

If I throw the stream into VLC, this is what I get for a 4K stream, but it fluctuates between 2000 kb/s and 6000 kb/s, and on a substream between 500 kb/s and 700 kb/s. I would imagine this would change in daylight.

image

jamesahendry commented 9 months ago

Interestingly, here are two cameras running sub and main. Backgrass is on 0.6.2. Courtyard is on the last version published from https://github.com/thirtythreeforty/neolink.

I flipped over Courtyard last night before I went to bed, and they both rebooted just before 5 AM. Courtyard, which is on https://github.com/thirtythreeforty/neolink, seems to have stabilised.

image image

QuantumEntangledAndy commented 9 months ago

The 4K seems really large. 6000 kb/s is 6 mb/s, so 15s of that is 90 mb per client. The stream is currently set to 10 mb per client, so gstreamer can currently buffer about 1.6s of this.

Could you run with debug log on (the env variable RUST_LOG="neolink=debug")? It might show us how often the stream is being kicked.

I expect some visual artifacts whenever a stream is restarted, because there's no good way to join the end of the old stream with the beginning of the new one without a re-encode (which we want to avoid because of the computational expense).
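
The buffer arithmetic in this comment can be sketched in shell. This is only an illustration, not neolink's code: the helper names are made up for the example, and the units are taken exactly as written above (rate in kb/s, sizes in "mb" as 1000 kb, without distinguishing bits from bytes).

```shell
#!/bin/sh
# Illustrative only: helper names are hypothetical, not part of neolink.
# Units follow the thread as written: rate in kb/s, sizes in "mb" (1000 kb).

# buffer_needed RATE_KBPS SECONDS
# Prints the buffer size (in mb) needed to hold SECONDS of stream.
buffer_needed() {
    echo $(( $1 * $2 / 1000 ))
}

# buffered_tenths LIMIT_MB RATE_KBPS
# Prints how long a LIMIT_MB buffer lasts, in tenths of a second
# (integer maths, rounded down).
buffered_tenths() {
    echo $(( $1 * 1000 * 10 / $2 ))
}

echo "15s at 6000 kb/s needs $(buffer_needed 6000 15) mb"
t=$(buffered_tenths 10 6000)
echo "a 10 mb limit holds about $(( t / 10 )).$(( t % 10 ))s at 6000 kb/s"
```

Plugging in the other bitrates reported in this thread (for example `buffer_needed 700 15` for the substream) gives a rough feel for how far the 15s target is from the current limit; results are rounded down because the sketch uses integer maths.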

jamesahendry commented 9 months ago

No worries - it's the 4K-only stream that is fine, though. On 0.6.2 it's not growing in memory. It's only when I have the sub+main that I have the constant memory leak. I'll run the log and see what I can produce for you.

QuantumEntangledAndy commented 9 months ago

This is what I get with both substream and mainstream over 15min

Screenshot 2023-09-28 at 09 43 24

Seems to be fairly constant

MicheleCardamone commented 9 months ago

> This is what I get with both substream and mainstream over 15min
>
> Screenshot 2023-09-28 at 09 43 24
>
> Seems to be fairly constant

Hi! @QuantumEntangledAndy for your memory investigation, perhaps it would be useful to know that after many hours the dreaded "reaching limit of channel" error reappeared on 0.6.2 master as well. Look at the related topic; I published some information there yesterday.

QuantumEntangledAndy commented 9 months ago

@MicheleCardamone could you try to replicate on the build posted here? I've already added something to try to mitigate that in the release for after 0.6.2

QuantumEntangledAndy commented 9 months ago

My memory profiler, though, is limited to about 20mins before it has trouble. It's not just reading the RAM usage but tracking every allocation and memory destruction down to the actual code level, so there's a limit to how long I can profile for.

Might be able to profile for longer on a dedicated Linux box and valgrind though.

MicheleCardamone commented 9 months ago

> My memory profiler, though, is limited to about 20mins before it has trouble. It's not just reading the RAM usage but tracking every allocation and memory destruction down to the actual code level, so there's a limit to how long I can profile for.
>
> Might be able to profile for longer on a dedicated Linux box and valgrind though.

Hi! @QuantumEntangledAndy I tested the build you put here. It seems to give problems on the substream (0 fps, as you can see in the photo). Currently 0.6.2 master seems to be the best version, except that after many hours of use the "reaching limit of channel" error returns and RAM usage keeps climbing. Even 0.6.2 master still seems to always give the "stream not ready" message at startup.

image image image

QuantumEntangledAndy commented 9 months ago

You'll get the "stream not ready" if you connect before the "available at " message. If you do get stream not ready, then the fps will be zero (since this temp stream stops after so many frames). If your BI software doesn't reconnect in this situation, set use_splash to false in the config. You will get an error message instead until the stream is ready.

jamesahendry commented 9 months ago

@MicheleCardamone what are you using there to troubleshoot the bitrates?

QuantumEntangledAndy commented 9 months ago

I'm wondering if Frigate/BI is opening multiple connections (and not closing them).

If you have time, could you run this build with debug log on and report the numbers shown in the message "Number of rtsp clients:"?
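
One way to pull those numbers out of a captured log is sketched below. It assumes the debug output has been saved to a file (neolink.log is a placeholder name) and relies only on the message text quoted above; nothing else about the log line format is assumed, and the helper names are hypothetical.

```shell
#!/bin/sh
# Sketch only: the file name and helper names are placeholders.

# rtsp_client_lines LOGFILE
# Prints every line that contains the "Number of rtsp clients:" message.
rtsp_client_lines() {
    grep -F "Number of rtsp clients:" "$1"
}

# rtsp_client_count LOGFILE
# Prints how many such lines the log contains.
rtsp_client_count() {
    grep -cF "Number of rtsp clients:" "$1"
}

if [ -f neolink.log ]; then
    # Show the most recent matches, then the total number of matches.
    rtsp_client_lines neolink.log | tail -n 20
    echo "matching lines: $(rtsp_client_count neolink.log)"
fi
```

If the reported client number keeps climbing while only one viewer is configured, that would point towards connections being opened and not closed.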

jamesahendry commented 9 months ago

Will do! Long weekend here so might be early next week. Thanks so much for your efforts

MicheleCardamone commented 9 months ago

> what are you using there to troubleshoot the bitrates?

I didn't read the problem you have. What do you mean by bitrate problem?

PS: my only problem is RAM saturation with 0.6.2 master, which after many hours leads to the "reaching limit of channel" error. I tested the build recommended here, but it doesn't work on the substream. I have not yet had the opportunity to try with use_splash = false as recommended. I think I'll try it today.

MicheleCardamone commented 9 months ago

@QuantumEntangledAndy

Hi! Tried with use_splash=false and it seems to give the same problems. Unusable unfortunately 🙁

MicheleCardamone commented 9 months ago

Hi! @QuantumEntangledAndy Any news about memory? Thx Mc

QuantumEntangledAndy commented 9 months ago

Do you have a debug log for me, from the last build posted above? I want to see if Frigate/BI is creating multiple streams.

jamesahendry commented 9 months ago

I am just about to load the build in the post again and send the debug logs

jamesahendry commented 9 months ago

Hi @QuantumEntangledAndy - I've loaded the build that you've referenced in this post. This is the build that gives me a lot of broken frames. I don't really mind because I've got multiple cams. Anyway, just confirming: should this be the syntax in the .toml? Should I see a debug file in the neolink folder?

image

image

MicheleCardamone commented 9 months ago

> Do you have a debug log for me, from the last build posted above? I want to see if Frigate/BI is creating multiple streams.

I don't know how to enable debugging on Windows. Can you help me?

QuantumEntangledAndy commented 9 months ago

Are you using CMD, PowerShell or Docker? You just need to set an environment variable called RUST_LOG to "neolink=debug".

jamesahendry commented 9 months ago

No Docker, just straight command line in Debian 12, which is an LXC container on Proxmox. Clearly this isn't meant to go in the .toml - sorry for my ignorance - I'll sort it out when I get home tonight. Good news is that since I changed to the build linked in this discussion, memory is very stable. You can see where I changed it just before 8pm last night. I do get a lot of snow and interference on the stream.

image

image

QuantumEntangledAndy commented 9 months ago

@jamesahendry RUST_LOG is not a config option but an environmental variable. You'll need to set it in the options of your proxmox server.

If you run it from a bash command line, you can do this on the line before running neolink:

export RUST_LOG="neolink=debug"
neolink .....your.options....
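
In a POSIX shell the variable can also be set for a single command by prefixing it, which avoids leaving it exported for the rest of the session. The sketch below shows the difference with a stand-in: print_rust_log is a hypothetical helper that only prints the value a child process would see, and neolink itself is not run.

```shell
#!/bin/sh
# print_rust_log is a stand-in for neolink: it starts a child shell and
# prints the RUST_LOG value that child process sees.
print_rust_log() {
    sh -c 'echo "RUST_LOG=${RUST_LOG:-unset}"'
}

unset RUST_LOG

# Form 1: export, then run. Everything started afterwards in the same
# shell inherits the variable (a subshell keeps the export contained here).
( export RUST_LOG="neolink=debug"; print_rust_log )

# Form 2: prefix a single command. Only that command sees the variable.
RUST_LOG="neolink=debug" sh -c 'echo "RUST_LOG=${RUST_LOG:-unset}"'

# Outside both forms the variable is still unset.
print_rust_log
```

When neolink is started by a process manager instead of an interactive shell, neither form applies directly. Under supervisord, for instance, environment variables are normally given with the `environment=` setting in the program's section of the .conf file, and the program's stderr goes to whatever `stderr_logfile` points at.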

jamesahendry commented 9 months ago

Hi @QuantumEntangledAndy - it's run under supervisord on each Debian LXC. I've now edited the supervisor .conf file as follows, and the LXC takes your export command. Where would it drop the log within the Debian OS?

image

QuantumEntangledAndy commented 9 months ago

That doesn't look like, or seem to be working like, a normal shell. I think you should Google how to set an environment variable on Proxmox.

QuantumEntangledAndy commented 9 months ago

Looks more like you're using nano to edit a config file in toml format.

MicheleCardamone commented 9 months ago

> @jamesahendry RUST_LOG is not a config option but an environmental variable. You'll need to set it in the options of your proxmox server.
>
> If you have a bash command line you run it from you can do this on the line before running neolink
>
> export RUST_LOG="neolink=debug"
> neolink .....your.options....

I tried to do this, but cmd doesn't recognize this command. The latest build has some errors, with the substream not working as a symptom.

Screenshot 2023-10-02 172100 Screenshot 2023-10-02 172037

MicheleCardamone commented 9 months ago

@QuantumEntangledAndy Update: after some Googling I managed to enable debugging (correctly, I think) with "set RUST_LOG=neolink=debug". I don't know what to send you, because the build posted here generates lots of error messages. With 0.6.2 master, on the other hand, it doesn't seem to give errors. I will keep 0.6.2 master running with debugging active until the RAM saturates, to see what errors it causes.

Screenshot 2023-10-02 173335 Screenshot 2023-10-02 173252

jamesahendry commented 9 months ago

@MicheleCardamone - are you running both a subStream and mainStream? I'm at the point where I'm just running a single 4K stream and the memory usage is awesome on 0.6.2. It's as soon as I introduce both streams that Neolink leaks memory. @QuantumEntangledAndy - I'll get the "set RUST_LOG=neolink=debug" sorted for you shortly and I'll install the version in this thread

6 cams, all on mainStream only: memory usage is as stable as can be. In fact, on some of these charts you can see where I moved them to mainStream only at midnight last night. I haven't pasted all the charts, as they all say the same thing. Frigate logs are clear.

image image image

QuantumEntangledAndy commented 9 months ago

After your neolink command, can you pipe the result into a file and send that?

In CMD it should be:

set RUST_LOG=neolink=debug
neolink rtsp --config=file.toml > neolink.log

You won't see a log on screen because it will be sent into the file, but please try your connections and then send me the log. (Only about 10mins is fine.)

QuantumEntangledAndy commented 9 months ago

I think I have identified the source of the artifacts. It was the change to send messages on via a task so as to reduce blocking, but it causes out of order frames.

QuantumEntangledAndy commented 9 months ago

Ok, so hopefully this build is good. If we can't find any major issues with it, I think it would be good to release, since it contains quite a few bug fixes.

MicheleCardamone commented 9 months ago

> Ok, so hopefully this build is good. If we can't find any major issues with it, I think it would be good to release, since it contains quite a few bug fixes.

I think I will carry out the tests today, and if necessary I will export the log. Thank you!

jamesahendry commented 9 months ago

> Ok, so hopefully this build is good. If we can't find any major issues with it, I think it would be good to release, since it contains quite a few bug fixes.

I will load onto a cam tonight and test.

MicheleCardamone commented 9 months ago

> Ok, so hopefully this build is good. If we can't find any major issues with it, I think it would be good to release, since it contains quite a few bug fixes.

@QuantumEntangledAndy Started this build. The problem of the substream not working seems to have been corrected, but unfortunately another one has reappeared: an unstable stream. The fps of both the main and the sub fluctuate a lot, making the image less fluid, whereas on 0.6.2 master this seems to be resolved.

File config:

[[cameras]]
name = "cam1"
username = "admin"
address = "192.168.3.30:9000"
stream = "both"
use_splash = false

jamesahendry commented 9 months ago

I am just going to load now and report back

MicheleCardamone commented 9 months ago

@QuantumEntangledAndy If it helps, this is the log from about 10 minutes of running the latest version you put here (although when I open it I can't read anything in it).

neolink.log

QuantumEntangledAndy commented 9 months ago

You cannot read it because the file is zero bytes. I forgot that logs are written to stderr, not to stdout, so the command to write both stderr and stdout to a file is as follows:

set RUST_LOG=neolink=debug
neolink rtsp --config=file.toml > neolink.log 2>&1

The 2>&1 means put stderr into stdout and the > neolink.log means put stdout into neolink.log
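
The same redirection syntax works in a POSIX shell, so the zero-byte file is easy to reproduce without neolink. The sketch below uses a stand-in function, emit, that writes only to stderr (as the logs are described to do above); the function and file names are made up for the illustration.

```shell
#!/bin/sh
# emit is a stand-in for a program that writes its log to stderr only.
emit() {
    echo "log line" >&2
}

dir=$(mktemp -d)

# stdout only: the log line still reaches the terminal and the file
# stays empty, which matches the zero-byte neolink.log.
emit > "$dir/stdout_only.log"

# Both streams: stderr is pointed at wherever stdout already goes,
# which by then is the file.
emit > "$dir/both.log" 2>&1

# Order matters: here stderr is pointed at the terminal first and only
# stdout is then sent to the file, so the file is empty again.
emit 2>&1 > "$dir/wrong_order.log"

# Line counts per file show which redirections captured the log.
wc -l "$dir"/*.log
```

Only both.log ends up containing the line, which is why the `2>&1` has to come after the `> neolink.log` part.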

QuantumEntangledAndy commented 9 months ago

> The fps of both the main and the sub fluctuate a lot, making the image less fluid, which however on 0.6.2 master

I just went through the complete diff of all changes from 0.6.2. 75% of it is just bug fixes and the new push notification stuff. There appear to be only three significant changes to the stream data.

  1. The buffer size was changed and is now calculated from the reported bitrate
  2. Frames are not put into the gstreamer pipeline until their frametime matches the runtime
  3. We timeout and reconnect a stream after 15s instead of 4

I can make a build that disables these things, but are you willing to test?

jamesahendry commented 9 months ago

> > The fps of both the main and the sub fluctuate a lot, making the image less fluid, which however on 0.6.2 master
>
> I just went through the complete diff of all changes from 0.6.2. 75% of it is just bug fixes and the new push notification stuff. There appear to be only three significant changes to the stream data.
>
>   1. The buffer size was changed and is now calculated from the reported bitrate
>   2. Frames are not put into the gstreamer pipeline until their frametime matches the runtime
>   3. We timeout and reconnect a stream after 15s instead of 4
>
> I can make a build that disables these things, but are you willing to test?

Of course - I've got 6 cams on Neolink.

jamesahendry commented 9 months ago

I'm really struggling with this latest build in Frigate - it moves between "stream not ready" and not receiving any frames. Can't even get it going in VLC - it just sits there. Wound back to 0.6.2 and it came straight back up in Frigate and VLC.

image image image

MicheleCardamone commented 9 months ago

> > The fps of both the main and the sub fluctuate a lot, making the image less fluid, which however on 0.6.2 master
>
> I just went through the complete diff of all changes from 0.6.2. 75% of it is just bug fixes and the new push notification stuff. There appear to be only three significant changes to the stream data.
>
>   1. The buffer size was changed and is now calculated from the reported bitrate
>   2. Frames are not put into the gstreamer pipeline until their frametime matches the runtime
>   3. We timeout and reconnect a stream after 15s instead of 4
>
> I can make a build that disables these things, but are you willing to test?

Sure, no problem carrying out tests. This is the log:

neolink.log

MicheleCardamone commented 9 months ago

> I'm really struggling with this latest build in Frigate - it moves between "stream not ready" and not receiving any frames. Can't even get it going in VLC - it just sits there. Wound back to 0.6.2 and it came straight back up in Frigate and VLC.
>
> image image image

I may be wrong, but I had a similar problem in BI. Try setting use_splash = false.