eclipse-archived / smarthome

Eclipse SmartHome™ project
https://www.eclipse.org/smarthome/
Eclipse Public License 2.0
865 stars 782 forks source link

[LIFX] Bulb offline issues #2193

Closed wborn closed 8 years ago

wborn commented 8 years ago

The LIFX binding is still pretty unstable for me when using it with my 34 bulbs.

Bulbs go offline a lot. I've been trying to get it stable. Bulbs go offline less when I use MAXIMUM_POLLING_RETRIES = 4; in the LifxLightHandler. So maybe the value can be increased or made configurable e.g. via Paper UI?

Also when bulbs go offline often the following error is logged:

22:09:51.993 [INFO ] [me.event.ThingStatusInfoChangedEvent] - 'lifx:colorlight:XXXXXXXXXXXX' changed from OFFLINE (COMMUNICATION_ERROR) to ONLINE
22:09:39.663 [ERROR] [inding.lifx.handler.LifxLightHandler] - An exception orccurred while communicating with the bulb : 'null'

The orccurred typo is actually very helpful because it helps to pin point which of the two similar log statement needs to be adjusted to also log a the stack trace.

This is the exception that gets logged with the modified log statement.

22:45:46.013 [ERROR] [inding.lifx.handler.LifxLightHandler] - An exception orccurred while communicating with the bulb
java.util.ConcurrentModificationException
    at java.util.HashMap$HashIterator.nextNode(HashMap.java:1437)[:1.8.0_101]
    at java.util.HashMap$KeyIterator.next(HashMap.java:1461)[:1.8.0_101]
    at org.eclipse.smarthome.binding.lifx.handler.LifxLightHandler$1.run(LifxLightHandler.java:430)[214:org.eclipse.smarthome.binding.lifx:0.9.0.201609192042]
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)[:1.8.0_101]
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)[:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)[:1.8.0_101]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)[:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)[:1.8.0_101]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)[:1.8.0_101]
    at java.lang.Thread.run(Thread.java:745)[:1.8.0_101]

Maybe this exception occurs when the Selector is closed in LifxLightHandler.dispose() as is suggested on StackOverflow?

Also I have a constant stream of warnings:

22:07:39.973 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '206' has already been sent to the bulb. Is it missing in action? 
22:07:40.926 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '103' has already been sent to the bulb. Is it missing in action? 
22:07:42.982 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '208' has already been sent to the bulb. Is it missing in action? 
22:07:45.991 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '210' has already been sent to the bulb. Is it missing in action? 
22:07:48.197 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '100' has already been sent to the bulb. Is it missing in action? 
22:07:48.197 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '101' has already been sent to the bulb. Is it missing in action? 
22:07:48.999 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '211' has already been sent to the bulb. Is it missing in action? 
22:07:48.999 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '212' has already been sent to the bulb. Is it missing in action? 
22:07:52.009 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '214' has already been sent to the bulb. Is it missing in action? 
22:07:55.017 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '216' has already been sent to the bulb. Is it missing in action? 
22:07:55.017 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '217' has already been sent to the bulb. Is it missing in action? 
22:07:58.026 [WARN ] [inding.lifx.handler.LifxLightHandler] - A messages with sequence number '219' has already been sent to the bulb. Is it missing in action? 

I think it is partly caused due to the bulbs going offline. That would also explain the sequential nature of all these numbers.

At line 530 of the LifxLightHandler I see that the sentPackets is cleared when a bulb is set OFFLINE. However this is not done when the bulb is set to OFFLINE at other locations, i.e. lines 219, 711, 736 . So maybe it should also be added at these locations?

When these issues are resolved, I do think most of the reliability issues are solved for this binding! :smile:

Do you have any thoughts on this @kgoderis?

kgoderis commented 8 years ago

@wborn You are definitely in a position to this kind of testing - I only have 3 bulbs, poor me ;-)

I will implement the StackOverflow solution, and you are right about clearing sendPackts.

wborn commented 8 years ago

@kgoderis thanks for looking into my LIFX issues. When the LIFX binding is stable, I will be a very happy person, because I can finally retire my own coded HA solution with ESH/OH.

My own LIFX implementation is rock solid, but not as versatile as yours (only static IPs). I kept things very simple and stupid by just sending packets 3 times to a bulb. That way they always arrive and I didn't have to bother with acknowledgements. That is of course not really in line with LIFX API specs. :-)