Closed morph166955 closed 3 years ago
For the record, now at the 24 hour mark the standalone OH instance which only has the Sonos binding loaded has not had a single device flap. I just installed the SamsungTV binding on that instance, waiting to see what happens.
48 hours and 2 morning OH routines that normally causes Sonos to go into communications-errors has worked perfectly both times. I would say with 99% certainty that the Samsung binding has been causing this for me. Lets see if @morph166955 has the same results with this stand alone OH instance.
I'm running this version of it.
` openhab> list -s | grep samsung
234 │ Active │ 80 │ 2.5.0.201908041228 │ org.openhab.binding.samsungtv `
Best, Jay
Hi all Just stumbled on this issue a few days back. I'm running pretty much the standard OH 2.5 release, with the Sonos and Samsung TV binding.
I disabled the Samsung TV binding today and so far didn't have any offline Sonos devices either. If everything is still running by tomorrow, I can give you an additional confirmation that it's the Samsung TV binding.
I have rolled back my Samsung TV Binding to 2.4RC1 now. I have also re-enabled all my Samsung TV Things via PaperUI.
Here's the version I grabbed - this one will NOT have as many functionalities based on the year of the TV your using. I'm connecting to the TV's on the default port=55000 via Thing definitions which newer TV's don't respond to.
282 │ Active │ 80 │ 2.4.0.RC1 │ org.openhab.binding.samsungtv
Best, Jay
Even if the code of the Samsung TV binding looks complex with the creation of multiple services, I see only UPnP actions invocations and reading of the JUPnP registry, nothing obvious that could corrupt internal JUPnP data.
Little idea: maybe the SamsungTV binding is considering UPnP Sonos devices as SamsungTV UPnP devices when going amongst all UPnP devices returned by the JUPnP library ? https://github.com/openhab/openhab-addons/blob/2.5.x/bundles/org.openhab.binding.samsungtv/src/main/java/org/openhab/binding/samsungtv/internal/handler/SamsungTvHandler.java#L226 I don't understand in this test why the UDN is not used ? It would have been more secure than the host name in the description URL. I assume everyone is setting an IP address in the host name configuration setting of their SamsungTV thing ?
Enabling DEBUG logs in the SamsungTV binding could help eliminating the case I mentioned jus above. The interesting logs to catch with the current binding version are:
One additional question: have all of you several SamsungTV things setup ?
One additional question: have all of you several SamsungTV things setup ?
I only have one Samsung TV so only one thing set up. Just realized: I uninstalled the binding, but the thing is still visible in the Paper UI because I defined it in a file.
I don't understand in this test why the UDN is not used ? It would have been more secure than the host name in the description URL.
Reason is the following (UDN is unique per UPnP service): https://github.com/openhab/openhab-addons/blob/d70f2795acb613fbd9ce347931c4eef932c67ad7/bundles/org.openhab.binding.samsungtv/src/main/java/org/openhab/binding/samsungtv/internal/handler/SamsungTvHandler.java#L208-L212
SamsungTV's should have a static Ip address or DHCP should always give a same IP to TV.
everyone is setting an IP address in the host name configuration
My original thing file looked like this; IP, Port and MAC.
Thing samsungtv:tv:livingroom "Samsung Living Room" [ hostName="192.168.0.169", port=55000, refreshInterval=15, macAddress="0c:89:XXXXX" ]
Thing samsungtv:tv:parker "Samsung Parker" [ hostName="192.168.0.142", port=55000, refreshInterval=15, macAddress="40:16:XXXXX" ]
Thing samsungtv:tv:ryan "Samsung Ryan" [ hostName="192.168.0.184", port=55000, refreshInterval=15, macAddress="40:16:XXXXX" ]
Thing samsungtv:tv:basement "Samsung Basement WiFi" [ hostName="192.168.0.180", port=8002, refreshInterval=15, macAddress="84:c0:XXXXX" ]
Thing samsungtv:tv:basementwired "Samsung Basement Wired" [ hostName="192.168.0.128", port=8002, refreshInterval=15, macAddress="9c:8c:XXXXX" ]
Thing samsungtv:tv:bedroom "Samsung Master Bedroom" [ hostName="192.168.0.183", port=55000, refreshInterval=15, macAddress="54:88:XXXXX" ]
Thing samsungtv:tv:gym "Samsung Gym" [ hostName="192.168.0.176", port=8002, refreshInterval=15, macAddress="64:1c:XXXXX" ]
Best, Jay
defined it in a file.
All I did was "disable" the Samsung TV things and my problems went away. The binding was still running but nothing to run against.
Best, Jay
This code looks strange/unusual to me too: https://github.com/openhab/openhab-addons/blob/2.5.x/bundles/org.openhab.binding.samsungtv/src/main/java/org/openhab/binding/samsungtv/internal/handler/SamsungTvHandler.java#L302 It looks like the thing configuration is updated based on a discovery result. A thing handler can edit thing properties through an API but is a binding allowed to update the configuration of its thing in its thing handler ?
I don't understand in this test why the UDN is not used ? It would have been more secure than the host name in the description URL.
Reason is the following (UDN is unique per UPnP service):
Ok, I understand.
This code looks strange/unusual to me too: https://github.com/openhab/openhab-addons/blob/2.5.x/bundles/org.openhab.binding.samsungtv/src/main/java/org/openhab/binding/samsungtv/internal/handler/SamsungTvHandler.java#L302 It looks like the thing configuration is updated based on a discovery result. A thing handler can edit thing properties through an API but is a binding allowed to update the configuration of its thing in its thing handler ?
The code is in fact setting two configuration parameters: port and protocol. My feeling is that it should be the discovery service that sets these properties.
@lolodomo, those changes seems to be done when websocket support has been added.
Websocket tokens seems to be stored also in binding configuration. Not sure what happens when thing is introduced via thing files.
I have a doubt if this is normal (and without any danger) for a thing handler to update its thing configuration, with a total bypass of the thing manager in this case.
If this is done there, I understand it is to update any thing that was created, even those not created from the inbox (so those created manually in Paper UI or coming from a configuration file).
It would have been probably cleaner to store these values (port and protocol) in the thing handler but outside the thing configuration. And the idea was to update these settings as soon as the device is discovered through UPnP and so available.
By the way, I doubt it is the origin of the problem with things going to OFFLINE.
which newer TV's don't respond to.
Here's an example of the issue with the older SamsungTV 2.4RC1 binding.
2020-02-18 17:08:13.420 [WARN ] [tv.internal.handler.SamsungTvHandler] - Channel 'samsungtv:tv:livingroom:power' not supported
2020-02-18 17:08:18.078 [WARN ] [tv.internal.handler.SamsungTvHandler] - Channel 'samsungtv:tv:livingroom:power' not supported
Best, Jay
So I'm not going to say that it's much to go on, but I did have two sonos speakers flap at the same time today on the test setup. Like normal they came back after 10 minutes. This happened almost exactly (it was one minute off) 27 hours after OH started. This is the entirety of what I have running on that test VM right now:
more /var/lib/openhab2/config/org/openhab/addons.config :org.apache.felix.configadmin.revision:=L"4" binding="sonos,samsungtv" package="expert" service.pid="org.openhab.addons" transformation="map,regex,xslt,exec,javascript,scale,xpath,jsonpath" ui="classic,basic,paper,habpanel,habmin,restdocs"
At least, you all know the workaround now: disable your SamsungTV things or uninstall this binding.
Can confirm, all my Sonos things are still online after disabling the Samsung TV binding yesterday
I can try updating the SamsungTV binding but is there at least one of you running the official 2.5.1 distribution?
is there at least one of you running the official 2.5.1 distribution?
I'm on the openHAB 2.5.0 release build and have installed the binding directly through Paper UI so that's the 2.5.1 version, yes.
Can someone enable DEBUG logs and provide the log entries I requested yesterday?
did have two sonos speakers flap
Even with the 2.4RC1 SamsungTV binding; my Sonos speakers "just once a while" flap (10 minute interval). I believe this flap is in the Sonos binding; not in the SamsungTV Binding.
The reason I believe this is because it happened a couple times with the 2.5.x SamsungTV binding when I had the SamsungTV Things disabled.
I just find it strange it takes exactly 10 minutes for it to correct itself consistently. No big deal - just happy all the Sonos devices are staying online now with 2.4RC1 Samsung binding.
Best, Jay
If the bug was in the Sonos binding, how do you explain that the binding is working perfectly except when you have the SamsungTV binding setup ?
In the Sonos binding, one reason to go OFFLINE would be because the JUPnP declares the device as not registered. To confirm that, I am still waiting for the message in front of your Sonos thing status (in Paper UI) when the thing goes OFFLINE with a communication error. To be clear, I try to confirm that it is the Sonos binding itself that sets the thing OFFLINE and if it is what is the code part that sets it to OFFLINE. If the thing is set to OFFLINE outside the binding, that would be interesting to know it.
For me, all leads to a problem with the SamsungTV binding, even more when you see the uncommon things this binding is doing (even if after a first look, I don't understand how it could corrupt JUPnP or set a Sonos thing OFFLINE).
@openhab/add-ons-maintainers : can you please confirm if there is a danger or not for a thing handler to update its thing configuration (updating the value of a configuration setting). Doing that, it bypasses the thing lifecycle. Could this lead to "corruption" in the thing registry ? PS: the Samsung TV binding is doing such a thing.
exactly 10 minutes for it to correct itself consistently
Which binding (SamsungTV, Sonos, jUpNp, io.transport.upnp, config.discovery.upnp ) has the 10 minute timer in it?
Best, Jay
For me, all leads to a problem with the SamsungTV binding, even more when you see the uncommon things this binding is doing (even if after a first look, I don't understand how it could corrupt JUPnP or set a Sonos thing OFFLINE)
Most of the Samsung TV have faulty/ragged UPnP implementation (at least older models), which could cause JUPnP crash.
The only refresh timer in the Sonos binding is set by default to 60 seconds. But anyone can update it as it is a thing configuration setting. It checks at this time if the device is registered in JUPnP.
For me, all leads to a problem with the SamsungTV binding, even more when you see the uncommon things this binding is doing (even if after a first look, I don't understand how it could corrupt JUPnP or set a Sonos thing OFFLINE)
Most of the Samsung TV have faulty/ragged UPnP implementation (at least older models), which could cause JUPnP crash.
That's one of the most probable explanation. But in this case why doesn't it happen when the SamsungTV binding is not running ?
On its side, the SamsungTV binding is refreshing every second and checks at this time if the device is registered in JUPnP . This default refresh interval looks very short.
But in this case why doesn't it happen when the SamsungTV binding is not running ?
Maybe it happens only when binding fetch data via UPnP.
If I understand correctly from @jaywiseman1971, the problem occurs only with 2.5 binding version? Which could mean that functionality or change which is introduced to 2.5 binding cause the issue to JUPnP.
Maybe it happens only when binding fetch data via UPnP.
Yes, you're right. Enabling the logs in the core IO transport UPnP bundle should allow showing what UPnP action is called in the minute beofre the Sonos things are going OFFLINE. We can also add logs in the Samsung TV binding.
problem occurs only with 2.5 binding version?
Correct, what I don't know is what version of the SamsungTV 2.5.x is when it started? Below is the screen shot of the actual binding I was using (Snapshot) around Aug 04, 2019.
Best, Jay
I understand why the refresh interval is so short. There is no subscription mechanism like in the Sonos binding.
Please enable DEBUG logs for org.eclipse.smarthome.core.io.transport.upnp and provide the logs in the last minute just before the Sonos things gone OFFLINE. It should show the UPnP actions called by any bindings in this frame and potential errors.
And you could even check more generally if there are errors when UPnP actions are invoked.
Please enable DEBUG logs
Hey @morph166955, can you help @lolodomo with this? This is the first time I've had a 100% stable OH system in over a year now. (wife is happy!) You having an isolated dev. stand alone Sonos / Samsung setup which is a great setup to gather these isolated UPnP logs.
With over 214 Things; it would not be a good thing to enable this on my production side.
Best, Jay
Maybe someone else. By the way, I would prefer someone using recent versions of core and bindings,
@openhab/add-ons-maintainers : can you please confirm if there is a danger or not for a thing handler to update its thing configuration (updating the value of a configuration setting). Doing that, it bypasses the thing lifecycle. Could this lead to "corruption" in the thing registry ? PS: the Samsung TV binding is doing such a thing.
IMO the configuration should not be altered by the thing handler and the configuration of existing things should not be altered by the discovery service (no matter if the thing was auto-discovered or manually added).
I didn’t check the binding, but: if the thing handler is able to automatically determine the correct settings, those should not be configuration parameters but thing properties and those can be altered easily and in a safe way via methods in the BaseThingHandler
. In the case of manually configured things this leads to re-setting of the properties each time the thing is created, because those properties are not stored in the database.
I can try to run the debugs later. I don't have access to the system right now.
As far as the 10 mins, it's JuPnP. I get this in debugs: 2020-02-02 13:40:34.657 [DEBUG] [org.jupnp.transport.spi.StreamClient] - Will not attempt request because it failed in the last 600 seconds: (OutgoingActionRequestMessage) POST http://REDACTED:1400/DeviceProperties/Control
Thank you @J-N-K for your answer. In fact, there are not properties but configuration settings. But when they are not set, the binding tries to identify them. That is the case for the MAC address and the kind of protocol. I think I can easily fix that. I will propose a fix in few minutes.
I can test easily if someone drops me a test binding.
Please look at PR #7036 I included a jar for testing. This PR fixes the "problem" with thing configuration updates and adds few new DEBUG logs that could help us. So please enable DEBUG logs for the binding too, at least until the problem appears.
Sorry for butting in on this but I have a quick question about what @J-N-K said...
the configuration of existing things should not be altered by the discovery service
I've always assumed the use case is where ip addresses or ports change - the discovery 'discovers' the new ip address/port, puts out a new discovery result which then updates existing thing configuration given the thing ID (so the thing goes offline, then back online with the new ipaddress/port)?
I forgot to mention this - maybe an additional piece to test.
I noticed when I did a clean startup of OH with the bad SamsungTV binding that it could take up to 24 hours for the Sonos Things to go into Communication-Error state to fail across the board. A faster way to make Sonos Things fail was to push a bunch of rule changes so the system refreshed the rules base and the Sonos Communication-Error state failed across the board much faster then.
Just something to do . . . Keep in mind I'm running OH 2.4 when I was doing this.
Best, Jay
Sorry to insist so much but can we know the detailed status of the Sonos thing when it goes OFFLINE. This is something easy to get, it is displayed by Paper UI. You just have to open your thing.
Sorry for butting in on this but I have a quick question about what @J-N-K said...
the configuration of existing things should not be altered by the discovery service
I've always assumed the use case is where ip addresses or ports change - the discovery 'discovers' the new ip address/port, puts out a new discovery result which then updates existing thing configuration given the thing ID (so the thing goes offline, then back online with the new ipaddress/port)?
By the way, this is not what is doing the Samsung TV binding. This is not the discovery result which was updated but the thiong configuration itself.
More generally, keep in mind that the things created in config files have a read-only config and their config cannoit be updated.
Of course, with my change, the rediscovery of MAC address and protocol/port will be triggered at each reinit of the thing (in case the MAC address is not set or the protocol is set to none in the thing configuration).
I loaded the jar and I get the following error:
2020-02-23 08:10:28.641 [WARN ] [tv.internal.handler.SamsungTvHandler] - Catching all exceptions because otherwise the thread would silently fail java.lang.ClassCastException: java.math.BigDecimal cannot be cast to java.lang.Integer at org.openhab.binding.samsungtv.internal.handler.SamsungTvHandler.checkCreateManualConnection(SamsungTvHandler.java:286) [bundleFile:?] at org.openhab.binding.samsungtv.internal.handler.SamsungTvHandler.checkAndCreateServices(SamsungTvHandler.java:231) [bundleFile:?] at org.openhab.binding.samsungtv.internal.handler.SamsungTvHandler.initialize(SamsungTvHandler.java:164) [bundleFile:?] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_242] at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_242] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_242] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_242] at org.eclipse.smarthome.core.internal.common.AbstractInvocationHandler.invokeDirect(AbstractInvocationHandler.java:152) [bundleFile:?] at org.eclipse.smarthome.core.internal.common.Invocation.call(Invocation.java:52) [bundleFile:?] at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_242] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_242] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_242] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_242]
I updated the OH install to 2.5.2-1
210 │ Active │ 80 │ 2.5.2.202002192222 │ org.openhab.binding.samsungtv 226 │ Active │ 80 │ 2.5.2 │ org.jupnp 228 │ Active │ 80 │ 2.5.0 │ org.openhab.core.config.discovery.upnp 229 │ Active │ 80 │ 2.5.0 │ org.openhab.core.io.transport.upnp
This Is IMHO fully out of the original issue. You just make in evidence another issue in that binding that could be due to the use of int rather than Integer in the config class. I will fix my PR.
Refer to https://community.openhab.org/t/too-much-time-before-a-sonos-thing-becomes-definitively-online/62214
For no observed reason, JuPnP seems to fail randomly and cause things to go offline. Sonos speakers seem to be the biggest culprit. Once this condition is reached, all things that rely on JuPnP seem to fall offline quickly. I've seen this happen between 3 days and 2 weeks, no specific time length causes the failure.
This can be mitigated currently by having a rule monitor getThingStatusInfo(mything).getStatus().toString() and when it goes offline it executes "executeCommandLine(”/usr/bin/ssh -p 8101 -i /home/openhab/karaf_keys/openhab.id_rsa openhab@localhost bundle:restart org.jupnp", 120000)"