bitfocus / companion-module-ptzoptics-visca


BUG : Module going non responsive, requires restart of module. Memory leak? #20

Open BigJinge opened 2 years ago

BigJinge commented 2 years ago

Hi,

Not sure if Håkon Nessjøen still maintains the module but I'll post here in case anyone else has this issue.

Setup: Windows 10 21H1; Streamdeck XL running latest firmware 5.2.1; Companion 2.1.4 Stable; 2x PTZOptics class cameras (30x and 20x) using VISCA; OBS.

Issue: Companion buttons that are configured to trigger camera presets become unresponsive after a variable period of time.

Trigger: It appears that if the cameras are turned on after the PC running Companion, the module, whilst constantly listening for the camera to respond, uses up some memory and does not release it. Once the resources for a specific module reach a certain level, the module stops responding.

Tests: We've seen this happen individually with both cameras. We started both cameras then the PC and the module didn't hang. Started Camera 1 and not Camera 2, then the PC, then Camera 2. The module of the Camera 2 hung around 45 mins later. Ditto when starting Camera 2 first and not Camera 1.

The cameras are fine. We can control them with other third party devices over VISCA whilst the Companion module has hung. This is not ideal as the whole purpose of the Streamdeck was having these preset buttons working on it.

Temporary Workaround: Going into the Companion Connections screen then disabling / re-enabling the hung PTZOptics VISCA module. Companion buttons using the module then come back on line. The workaround isn't a solution given you're in the middle of a live performance and having to leave the video mixer to restart the Companion module.

Additional info. We upgraded the system to Companion 2.2 which didn't solve the issue. Given the latest v.1.1.6 module release date of 5th April 2021, it would be included in Companion 2.1.3 / 2.1.4 and 2.2

Severity 1. If there is a memory leak in the module, we don't want to be on tenterhooks not knowing if the module is going to hang, especially during a long live service.

Thanks

Gartom commented 2 years ago

This is a very interesting observation.

We have been using two PTZOptics cameras for live streaming every Sunday from our church for ~1.5 years and have seen this once in a while, but really not often. As it has been so seldom, we haven't looked into trying to find a reason, as the workaround of restarting the Companion instance of the module works. I also nowadays have the routine to always start up both cameras before booting the PC that runs Companion and I haven't seen this issue since I started to do that.

Until a solution has been found for the issue, I would recommend dedicating a Stream Deck button to disabling and enabling the PTZOptics module, using the "internal: Enable or disable instance" function. I have made the corresponding function available by long-pressing the "home/page" button on our control page for the ATEM TVS HD video mixer, as we sometimes have had the same issue with that module, but not consistently enough to go into troubleshooting.

BigJinge commented 2 years ago

It's good to know that it isn't only happening to me and I'm not going mad...

The longest service we've had, I think, would be a carol concert of around 1hr 45mins.

A thought then for a test. I turn on the cameras first then the PC running Companion. Leave them recording for a few hours in OBS whilst I do other things, then see if the Companion buttons have hung by then.

It would be nice if any resource leak would be fixed as it's just one less thing to worry about.

Re the restart of the module(s). I'll add this as a restart button.


Thanks

Gartom commented 2 years ago

This issue may be related: https://github.com/bitfocus/companion-module-bmd-atem/issues/168

EnergeticStick commented 1 year ago

I have this same issue of Companion going unresponsive after being idle for just a few hours. We use it 3 times a week and leave everything running 24/7.

Rittman405 commented 1 year ago

Having the same problem with one of two PTZOptics camera connections. Thanks for the idea to create a button to disable and enable the instance.

BigJinge commented 1 year ago

> Having the same problem with one of two PTZOptics camera connections. Thanks for the idea to create a button to disable and enable the instance.

Glad the button idea helped. I probably use it once every few weeks when the second PTZ camera is plugged in after the server (running Companion) has been turned on for a while.

cyberblaststudios commented 11 months ago

You could give it a try on Companion v3 stable and see if it still occurs.

jswalden commented 2 months ago

I run Companion an hour-plus every Sunday (on a computer I suspend/resume each time, without closing and reopening Companion when I do so) and haven't seen anything like this in the last half-year, with the module code during that time. (And, post-v3 conversion, in case the issue was Companion-side rather than in the module.) Not saying this isn't somehow real, but it's not something I can reproduce.

And knowing the module code, there's not really any long-running code in the module. The connection to the camera just sort of sits there without spawning ongoing work. (Except TCP keepalive, but if that's responsible it'd be an extremely small leak in Node and would affect way more than just Companion, so that's highly unlikely.) The module doesn't do anything except through action callbacks.
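To illustrate the keepalive point, here is a minimal Python sketch (not the module's code, which runs on Node) of enabling TCP keepalive on a socket. The helper names are hypothetical. Once the option is set, the operating system sends the probes itself, so the application has no recurring work of its own to do.

```python
import socket


def make_keepalive_socket() -> socket.socket:
    """Create a TCP socket with OS-level keepalive probes enabled (hypothetical helper)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # SO_KEEPALIVE asks the kernel to probe an idle connection;
    # no timer or loop runs in the application for this.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    return sock


def keepalive_enabled(sock: socket.socket) -> bool:
    """Report whether keepalive is switched on for this socket."""
    # Some platforms return a nonzero value other than 1, so compare against 0.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
```

Node exposes the same OS-level option through `socket.setKeepAlive()`.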

If I had to guess at anything responsible, it'd be some kind of leak in Companion code, or in companion-module-base that this module depends upon. Maybe something on the Companion side related to IPC to the module process, or on the module process side related to IPC to Companion. But that's spitballing as long as I can't actually reproduce the problem.

If anyone can trigger this with an actually-recent build of this module, I'm happy to investigate. But for now with no way to trigger it for investigation, I'm going to close this.

BigJinge commented 2 months ago

Dear Jeff,

The situation still exists in Companion 3.x

We had a Synod last weekend which lasted all day and the module had to be restarted a couple of times during it.

Whilst I’m glad you can’t replicate it with your setup, this is still an issue. You’re very welcome to liaise with me for logs to see where the issue lies.

This issue isn’t closed.


tonypiper commented 2 months ago

I've run into this occasionally, though it seems to happen within the first 10-15 mins of powering everything up, not after a long period of running, and we've then gone on to run some streams for 6 hours with no loss of control.

I've not been able to narrow it down either, but perhaps, given it happens so soon for me, it's not a memory leak but something in the network stack.

For me, the module stays responsive - I don't see red dots on the preset recalls

[Screenshot: CleanShot 2024-05-03 at 18 39 41]

but the cameras fail to respond to preset recall or ptz commands. Perhaps the module thinks it's connected, but for whatever reason the commands aren't sent, or received.

It's frustrating, but I also have an easily-accessible button that disables and re-enables the camera connections and that seems to do the trick. Perhaps the next debugging step is to run a packet capture on the PC and see if the commands are being sent. I've not felt the urge to, as of yet.
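For anyone trying a capture, it helps to know what a preset recall looks like on the wire. The Python sketch below builds the standard VISCA memory recall command (`8x 01 04 3F 02 pp FF`) and labels the usual reply types. It assumes the camera takes raw VISCA bytes over TCP with no extra framing and that presets 0 to 127 are valid; both should be checked against the camera's documentation. The function names are hypothetical.

```python
def preset_recall(preset: int, camera_id: int = 1) -> bytes:
    """Build a VISCA memory (preset) recall command: 8x 01 04 3F 02 pp FF."""
    if not 1 <= camera_id <= 7:
        raise ValueError("camera_id must be 1..7")
    # Assumed preset range; the real limit depends on the camera model.
    if not 0 <= preset <= 0x7F:
        raise ValueError("preset must be 0..127")
    return bytes([0x80 | camera_id, 0x01, 0x04, 0x3F, 0x02, preset, 0xFF])


def describe_reply(reply: bytes) -> str:
    """Classify a VISCA reply by the high nibble of its second byte."""
    if len(reply) < 3 or reply[-1] != 0xFF:
        return "incomplete"
    kinds = {0x40: "ack", 0x50: "completion", 0x60: "error"}
    return kinds.get(reply[1] & 0xF0, "unknown")


# Example: the bytes to look for when preset 5 is recalled on camera 1.
print(preset_recall(5).hex(" "))
```

If the command bytes show up in the capture but no ACK or completion follows, the camera side is the place to look; if the command bytes never appear, the problem is upstream of the network.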

jswalden commented 2 months ago

Packet capturing sounds like a good place to start, yes. Wireshark filtering for traffic to/from the specific TCP port 5678 (assuming you haven't changed it) should be able to run pretty much indefinitely, given how minimal in size VISCA traffic is.

I'm happy to look at any captures either of you can run -- and yes, Companion logs with timestamps in them to correlate against would help as well. Like I said, get me some info and we can actually do something. :-)

BigJinge commented 2 months ago

Thanks for reopening the case.

When I mention module below, I refer to the ptzoptics VISCA module.

To comment on something tonypiper said, that for me also, I don’t get a red dot on the modules’ Streamdeck buttons when a module has hung. They just don’t do anything. It appears it can “hear” the cameras, hence why Wireshark wasn’t used already.

From tests I’ve done previously, the issue can be accelerated if a camera that is specified within the module isn’t “there”. We have both a fixed wall PTZ and a mobile PTZ. The mobile PTZ sometimes doesn’t get connected to the network until maybe twenty minutes after the main one. The mobile PTZ module would still be hunting for the camera until the camera is on.

If one instance of the module has hung, it doesn’t appear to hang ALL the instances of modules, which pointed to the resources allocated to an individual module.

With the Synod last week, I had to use the module reset button maybe an hour in, as the mobile PTZ module had hung. Both cameras were active from that moment and I still had to use the module reset button again maybe a few hours after that.

As a test, you could add a “phantom” camera module, for an IP address which isn’t being used. This constant checking for the camera is the scenario that causes the module to hang, even if it isn’t a memory leak in the module itself.

Yes, we do have a workaround reset button, but it’s not elegant to use when someone decides to move outside of the camera view, you can’t use one of the hung module’s preset buttons, and you then have to wait a few seconds as the modules reset.

I’m sure between us all, we will narrow down what is happening.
