Closed david-pace closed 9 months ago
Here is some additional debug information:
I enhanced the logging to print the IP address (which is always the same in the following log entries, an IP of a printer), the thread's toString()
representation, the thread ID and the exception. Here are the relevant log entries grouped by concurrency (log entries of different threads happening in the same second are grouped together):
16:28:00.754 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:28:00.754 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:28:00.754 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:28:00.757 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:28:00.758 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:28:00.757 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:28:00.760 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:28:00.760 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:28:00.760 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:28:22.681 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:28:22.681 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:28:22.681 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[OH-discovery-2,5,main] (thread ID 93) failed with exception java.net.ConnectException: Connection refused
16:28:22.681 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:28:22.681 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[OH-discovery-1,5,main] (thread ID 92) failed with exception java.net.ConnectException: Connection refused
16:28:22.686 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[OH-discovery-2,5,main] (thread ID 93) failed with exception java.net.ConnectException: Connection refused
16:28:22.686 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[OH-discovery-1,5,main] (thread ID 92) failed with exception java.net.ConnectException: Connection refused
16:28:22.691 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[OH-discovery-1,5,main] (thread ID 92) failed with exception java.net.ConnectException: Connection refused
16:28:22.691 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[OH-discovery-2,5,main] (thread ID 93) failed with exception java.net.ConnectException: Connection refused
16:29:08.864 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.SocketTimeoutException: Connect Timeout
16:29:08.864 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.SocketTimeoutException: Connect Timeout
16:29:08.864 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.SocketTimeoutException: Connect Timeout
16:29:52.350 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:29:52.350 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:29:52.350 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:29:52.587 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:29:52.676 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:29:52.676 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:30:02.349 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:30:02.349 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:30:02.349 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:30:07.597 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:30:07.597 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:30:07.598 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:30:13.358 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:30:13.358 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:30:13.358 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:30:58.889 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:30:58.889 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:30:58.889 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:31:31.391 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:31:31.391 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:31:31.391 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:31:32.178 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:31:32.178 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
16:31:32.178 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:31:42.396 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-6-thread-1,5,main] (thread ID 114) failed with exception java.net.ConnectException: Connection refused
16:31:42.396 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-8-thread-1,5,main] (thread ID 115) failed with exception java.net.ConnectException: Connection refused
16:31:42.396 [DEBUG] [discovery.BridgeDiscoveryParticipant] - Discovery for IP xxx.xxx.xxx.48 in thread Thread[JmDNS pool-7-thread-1,5,main] (thread ID 116) failed with exception java.net.ConnectException: Connection refused
Here we can gather some interesting information:
toString()
and thread IDs is mostly 3JmDNS pool-
and OH-discovery-
). Here we see three different threads with the JmDNS pool-
prefix and two different threads with the prefix OH-discovery
). That's 5 unique threads in this block.@jlaur do you know who implemented the discovery infrastructure and could help us to explain some of these observations?
Hey, I used other simple "_http._tcp.local." mDNS Discovery code as examples for my pull request. To be honest - I not made up my mind regard threading, because the BridgeDiscoveryParticipant instance is called by the openhab framework and the framework is a back box.
It makes no sense to me to do so many discovery tries in a short amount of time - every 30 seconds is already very fast from my point of view. I agree that many requests will put a lot of stress on the smart home controller, because it has not a huge processing power. If necessary the SHC code could "cache" the last discovery result and return the cache value for a couple of discovery calls from the framework.
I do not know the details of the framework, but could imagine that bindings could register themselves to the mDNS discovery process. On every call they would need to check how to handle the call and what to action. In case the device is already registered/known, why not simply discarding the call? This would effectively minimize the number of arbitration and registration processes. Anything else would not make sense to me as the device has already been identified and properly processed. Why to repeat the same over and over again?
Just my 2cents from an uneducated user.
Am 03.01.2024 um 20:01 schrieb Gerd Zanker @.***>:
Hey, I used other simple "_http._tcp.local." mDNS Discovery code as examples for my pull request. To be honest - I not made up my mind regard threading, because the BridgeDiscoveryParticipant instance is called by the openhab framework and the framework is a back box.
It makes no sense to me to do so many discovery tries in a short amount of time - every 30 seconds is already very fast from my point of view. I agree that many requests will put a lot of stress on the smart home controller, because it has not a huge processing power. If necessary the SHC code could "cache" the last discovery result and return the cache value for a couple of discovery calls from the framework.
— Reply to this email directly, view it on GitHub https://github.com/openhab/openhab-addons/issues/16174#issuecomment-1875828004, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHVH7GHTUEGEVWAKYIJEVATYMWTIBAVCNFSM6AAAAABBJMOROGVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQNZVHAZDQMBQGQ. You are receiving this because you were mentioned.
Possible relation to https://github.com/openhab/openhab-core/pull/3906
I will also test the latest SNAPSHOT
state because there was another change in the ThingHandlerService
which also affected our boschshc
ThingDiscoveryService
: https://github.com/openhab/openhab-core/pull/3957
The latest main
state also shows the same issues.
I found out that openHAB indeed receives quite many events from the javax.jmdns
service. All of these events are forwarded to the discovery participants and this is done is individual threads.
I have an experimental implementation running on my system which caches the discovery results using the IP address as key. This seems to work fine so far. In the whole log I only see the Discovery failed with exception java.net.ConnectException: Connection refused
four times, which seems adequate. After that the result is cached and no physical connection attempts are made afterwards.
I also removed the participant-specific object bridgeInformation
which was shared among all the different threads, so this might also have been a problem.
@GerdZanker do you think this approach is viable in general or do you have a better idea? Are IP addresses ok as keys for the cache? Do we need advanced features like evicting entries from the cache after n minutes or something like that?
However, I have just checked my PI2 test instance with the new test version of the binding and it is running fine for 11 hours now. This looks very promising and indicates that the issues on the PI2 were caused by the massive mDNS events.
Interestingly the many events do not harm BSHC on my PI4. Maybe because the PI4 has sufficient processing power to cope with the many requests. Nevertheless this looks like a design flaw - either a regression or by purpose and we do not understand the rationale. I’d suggest raising an issue for javax.jmdns though I assume this is part of the underlying framework and not openHAB specific and causing a broader impact. Independent on the framework issue, a good strategy of the individual binding would be to minimize load and traffic supporting a defensive strategy.
If you don’t mind, I’d like to share my 2cents on a caching strategy:
In my PoV the (caching) strategy depends on: What information comes with the discovery event? Could it be e.g. a changed service indicator? What information is returned as discovery results from the Bosch binding and what is the purpose? What are the benefits to initiate another discovery for a device which has already been successfully discovered and registered -> i.e. the binding has already created a thing what would another discovery add? a device which has been identified as not matching -> like the printer: why bothering the printer with BSHC specific requests all the time? It is unlikely, that it morphs into a Bosch bridge What are the use cases to re-discovery on an existing device?
An IP address as a key seems to be easy to implement. However, we then need to cater for IP address changes. In theory a Printer and the Bosch bridge could swap addresses e.g. after a (local) power failure and restart or a user manually re-arranges IP addresses… These may be rare cases which are also not high frequency. Why not following a pragmatic approach? add a user adjustable parameter to the binding setting a TTL in seconds use IP address as key and store the time stamp - if on next request current time > timestamp + TTL, discard the entry and initiate rediscover clear cache on restart of a thing - may not 100% reliable but hey how many users are having more than a handful of bosch bridges? cache should be transient anyway and cleared on binding restart
Thanks for all your good support and effort!
Am 04.01.2024 um 18:20 schrieb David Pace @.***>:
The latest main state also shows the same issues. I found out that openHAB indeed receives quite many events from the javax.jmdns service. All of these events are forwarded to the discovery participants and this is done is individual threads. I have an experimental implementation running on my system which caches the discovery results using the IP address as key. This seems to work fine so far. In the whole log I only see the Discovery failed with exception java.net.ConnectException: Connection refused four times, which seems adequate. I also removed the participant-specific object bridgeInformation which was shared among all the different threads, so this might also have been a problem. @GerdZanker do you think this approach is viable in general or do you have a better idea? Are IP addresses ok as keys for the cache? Do we need advanced features like evicting entries from the cache after n minutes or something like that? — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you were mentioned.Message ID: @.***>
@GerdZanker do you think this (cache) approach is viable in general or do you have a better idea? Are IP addresses ok as keys for the cache? Do we need advanced features like evicting entries from the cache after n minutes or something like that?
Yes, caching or other way to finish the discovery as early and as cheap as possible is good. Have you evaluate to use the complete ServiceInfo instance as a key? Using the IP address is also fine, because I like the documented trade-off from @mike-bike for a pragmatic approach.
I'm still wonder why we run now into the discovery trouble:
MDNSDiscoveryParticipant
code, no code uses caching.getPropertyString(...)
or getName()
return values, which is currently not done in the Bosch SHC code.Thank you for your input :+1: Let's discuss further ideas for enhancements in the pull request I just created.
Expected Behavior
Current Behavior
When setting the log level to
DEBUG
for theboschshc
binding, we see a lot of errors like the following in the log:The exceptions as such are to be expected, since the discovery tries to call the URL
https://[IP]:8443/smarthome/public/information
on every device that advertises a service with the mDNS pattern_http._tcp.local.
. Since the pattern is very generic, it can be expected that some devices will not accept a connection with that URL on port8443
and thus the discovery gets aconnection refused
response or runs into a timeout.However, the following things are not expected:
For analysis, I enhanced the logging on my system and I see "blocks" of ConnectExceptions and SocketTimeoutExceptions like this:
When setting a breakpoint in the debugger, I see that multiple threads try to query the same IP address concurrently. In this case the device is concurrently queried in 9 threads. I don't know if this is relevant but the device with the
.48
address is a printer in my case.Issues with the discovery were also reported by our users @xxthunder and @mike-bike in https://github.com/openhab/openhab-addons/pull/16002 and https://github.com/openhab/openhab-addons/issues/15912.
Possible Solution
logger.trace("Discovering ipAddress {}", url);
is misleading: URL is not an IP addresslogger.debug("Discovering failed with status code: {}", contentResponse.getStatus());
andlogger.debug("Discovering failed with exception {}", e.getMessage());
:Discovery
instead ofDiscovering
, discuss if "failed" is the right wording here because this is expected for non-SHC devices, discuss log leveldebug
debug
level (right now the IP can only be seen ontrace
level, but it can't be correlated with the exceptions because of the concurrency / multiple threads)Maybe @GerdZanker could take a look at this because he initially implemented the discovery and might know some important details?
Steps to Reproduce (for Bugs)
Prerequisites: at least one device in the network apart from the Bosch Smart Home Controller that advertises a HTTP server (e.g. a printer).
boschshc
binding toDEBUG
orTRACE
openhab.log
fileContext
Discovery should run regularly in a stable manner and provide appropriate log entries
Your Environment
4.1.0