home-assistant / core

:house_with_garden: Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0
72.67k stars 30.42k forks source link

HA looses link with Kodi after restart of Kodi #26664

Closed plapointe6 closed 4 years ago

plapointe6 commented 5 years ago

Home Assistant release with the issue: 0.98.5

Last working Home Assistant release (if known): Never ? Not sure ...

Operating environment (Hass.io/Docker/Windows/etc.): Docker, but seem to happen on all environments (if I take a look to other related issues)

Component/platform: Kodi : https://www.home-assistant.io/components/kodi/

Problem-relevant configuration.yaml entries and (fill out even if it seems unimportant):

media_player:
  - platform: kodi
    name: Kodi TV
    host: 192.168.1.102

Description of problem: I run into the same issue since day 1 of Home Assistant usage : When Kodi is restarted, the connection from HA to Kodi doesn't re-establish. The only way to get HA to receive status changes is by restarting HA.

I seen many informations about that while I was searching into closed issues and pull request. The last issue was closed weeks ago, so I decided to open a recent one that make a summary of what I found revelent about the issue. I hope it will help the Home assistant community the get into this and maybe solve the problem :)

17265 : The last issue that was opened about that.

You got it, I stoled the title and the description of the issue from this old issue because I run into the exactly same problem. This one was opened last year by @Xaldimo and @tadly managed to put many hours discussing and trying to resolve it. In fact, it seems that his work get us really close to a working solution.

So, let dive into was it has been made by @tadly to solve this issue:

My goal was to open an up-to-date issue about this ever-going issue and to summarise what it has been done in the most recent discussins. I hope that this time it will lead to a fix that will be applied to HA and I will help the best I can to make this happen.

dshokouhi commented 5 years ago

Thats odd, I am able to restart Kodi and the device I run Kodi on just fine and HA is able to recover it without needing a HA restart. There must be something else causing this to happen for some and not all. You should most definitely list what platforms you use on Kodi and what settings you have enabled. Kodi on my Nvidia Shield, Fire Sticks, and Windows devices do not have this issue. I have the webserver setup with all necessary options enabled as well.

My HA is installed in a venv as well.

plapointe6 commented 5 years ago

Hi @dshokouhi, like you said, its odd ! You are the first one I see that do not have the problem.

First, I ommited to say that these two things must also happens in order to reproduce the issue : 1- enable_websocket must be set to True (the default value) in you config.yaml 2- Kodi must be online when you start HA.

If these two conditions are met, putting Kodi offline and starting it again after few seconds lead to a broken link between Kodi and HA, because HA will not attempt to reconnect to Kodi after the connection is lost.

If it matters, I run kodi on a Raspberry PI with the LibreELEC image. Not sure about the webserver settings, I haven't configured that.

dshokouhi commented 5 years ago

@plapointe6 you need to configure the webserver per the docs. Although it does say its only required for discovery since you are supplying the username and password that is for the webserver. Try following these steps: https://kodi.wiki/view/Web_interface and see if it makes things better.

in my setup I did not turn off websockets so its on by default and for me it doesnt matter whether Kodi is on or not when I restart HA its always working properly.

tadly commented 5 years ago

@dshokouhi you will have this issue if you:

  1. really use websockets (maybe you disabled them in kodi, the port isn't accessible, ... in which case the component defaults to http polling i think)
  2. (re)start HA while Kodi is running
  3. restart Kodi => Now kodi is offline within HA until you manually force the component to update (send a play/pause event or something else)

why? Because of this. As outlined in OP, should_poll shouldn't and - as a result of the discussions in one of the previous issues - isn't allowed to be dynamic. However it still is meaning you will have this issue.

plapointe6 commented 5 years ago

@dshokouhi I took a look to the link you sent and the "Allow remote control via HTTP" was already activated in my Kodi installation. (I activated this to allow remote control trough my smartphone with Yatse, I remember now).

Now, I agree that there is flaws in the code and @tadly highligted it. However, things get odd. I run into this issue frequently. Just before opening this issue yesterday, everything was fine and linked, I restarted Kodi and boom, the link was lost, another time.

Today, without changing anything in the Kodi/HA config, I restarted HA while Kodi was online and I restarted Kodi multiple time after that. The issue won't happen today, Kodi reconnect automatically. I will not get fooled by this, the issue will arise again, and again, and again if it's not adressed.

I haven't played with Python very much, and I don't know how HA works, but maybe it's time for me to learn how HA and his components are made to better understand what's going on.

dshokouhi commented 5 years ago

@plapointe6 the only issue i see from time to time that I have not been able to reproduce (hence have not filed a bug) is occasionally the state gets stuck in idle when Kodi has been on for some time and all I do is send a service call like pressing the play button in the UI and it starts showing the correct state. Not sure when this happens but I see it maybe once a month if not more than that.

plapointe6 commented 5 years ago

Interesting. Not 100% sure, but I think that in my case, the state go to Off. Also, the Kodi component will not get updated. The automations to turn on/off the lights when I start/stop listening trough Kodi stop working and so.

Most of the time (if not always), It happens after a restart of Kodi. (I will keep you updated, because I should be easily able to reproduce the issue ...)

plapointe6 commented 5 years ago

Ok, @dshokouhi, I managed to reproduce your issue. And maybe the issue you encounter is the same as mine.

To reproduce it : 1- Make sure you use websockets. 2- Make sure you started Home Assitant while all your Kodi instance was online. 3- Brutally close the websocket connection by disconnecting the power of the computer that run Kodi. 4- Re-connect the power and start Kodi again.

The state of Kodi within HA stay to idle and it won't be updated anymore. The link seem broken. The only way to get Kodi and HA in sync is to either restart HA or send a command to Kodi (for me, if I send a volume up/down command, it does the thrick).

Now somethime it also arise when you just restart Kodi instead of brutally disconnecting the power, but if you brutally disconnect the power while Kodi is running, it seems always arise.

Maybe you could give it a try ?

lolorc commented 5 years ago

confirmed here with 0.98.2 and websockets. it happened to me after using sysrq to restart the host. in that case the socket wasn't cleanly closed.

tadly commented 5 years ago

It's not about the socket not getting closed properly. The issue is that after restarting Kodi, HA never polls again to see if it is back online.

plapointe6 commented 5 years ago

@tadly, I am pretty sure that's linked to the same issue (the websocket disconnects and HA won't poll Kodi to detect if it's back online). However, I will take a look to the code to better understand what's going on instead of exposing my theories here without knowing how the Kodi component work.

probot-home-assistant[bot] commented 5 years ago

Hey there @armills, mind taking a look at this issue as its been labeled with a integration (kodi) you are listed as a codeowner for? Thanks!

plapointe6 commented 5 years ago

I made more testing and I took a look to the Kodi component code.

I discovered that should_poll can change trough the lifespan of the Kodi component.

In fact, Should_poll will become True if we cleanly loose the websocket connection. This happen when async_on_quit is called by one of the three disconnection event of the web socket (see line 312 to 314 of components/kodi/media_player.py).

Following the async_on_quit actions, self._ws_server.connected (used in should_poll) will now return False and should_poll will now return True. In this case, HA will poll Kodi the retablish the websocket connection.

To demonstrate that, I added debug message into should_poll, async_ws_connect, async_on_quit and async_update. Here is the result of my testing :

2019-09-17 19:30:34 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-17 19:30:34 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_ws_connect
2019-09-17 19:30:34 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True
2019-09-17 19:30:45 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-17 19:30:56 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-17 19:31:07 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-17 19:31:18 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
HERE I RESTARTED KODI
2019-09-17 19:31:18 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_on_quit
2019-09-17 19:31:29 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True <-----
2019-09-17 19:31:34 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-17 19:31:40 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True <-----
2019-09-17 19:31:40 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-17 19:31:51 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True <-----
2019-09-17 19:31:51 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-17 19:31:51 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_ws_connect
KODI WAS BACK ONLINE AND HA RECONNECTED TO IT
2019-09-17 19:32:02 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False <-----
2019-09-17 19:32:13 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-17 19:32:24 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-17 19:32:35 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-17 19:32:46 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-17 19:32:57 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False

Now, if Kodi and HA abrutly loose the websocket connection, HA never know that Kodi was put offline and will never attempt a reconnection. async_on_quit will never be called. I think that this is the major issue.

amelchio commented 5 years ago

As I originally commented in the PR, I believe the only thing missing from #17474 is to return early in async_update() if self._enable_websocket is true.

All the emphasized words below my comment relate to my second paragraph which was just an optional suggestion.

tadly commented 5 years ago

@amelchio in your followup comment you said:

That proposal included moving the reconnect logic out of update() so polling is not needed.

This entails a lot of work as now the component itself has to monitor/poll kodi in case it goes offline. That's no easy feat (I tried and ran into a very specific issue which would have required me to rewrite half the component) and the PR I initially created is no real basis for that.

amelchio commented 5 years ago

Again ... that was an optional suggestion, not something blocking a merge.

tadly commented 5 years ago

Always polling Kodi however was blocking a merge and never polling Kodi will break the component. => your suggestion is required

plapointe6 commented 5 years ago

What if we find a way to detect a WebSocket connection lost and manually trigger async_on_quit ? That will close the websocket accordigly and will make should_poll return True until Kodi reconnects.

The question is : do the websocket lib used by Kodi provide a way to recognize an unclean websocket connection lost ?

tadly commented 5 years ago

@amelchio I read through the comments of the linked PR again and just to clear up any possible misunderstandings.

If you say the previous PR is fine with the addition of...

return early in async_update() if self._enable_websocket is true.

... does that mean it would get merged? Or are we going to end up in another endless discussion

@plapointe6 again, should_poll can not changed at runtime (while HA is running). If it was False at the time HA started, it will ALWAYS be False until HA restarts again. Same applies if it was initially True

plapointe6 commented 5 years ago

@tadly, did you see the result of my testing ? I proved that should_pollis called at each ten seconds or so and if it become True after a clean websocket disconnection, async_update is called again until the websocket reconnect.

Edit: here if you missed it.

tadly commented 5 years ago

@plapointe6 now I did. Might explain why I couldn't reproduce the usual issue earlier on... This would suggest that core changed and this statement by @amelchio is no longer valid.

Fine by me.

Given those circumstances your suggestion seems like a easy and acceptable fix.

tadly commented 5 years ago

@plapointe6 this is what you're looking for

Instead of pass you would want something like self.async_on_quit(None, None)

Maybe it's enough to self.async_schedule_update_ha_state(True) leaving the pass as is.

amelchio commented 5 years ago

The original issue of not calling should_poll required that no other polling media player exists, analysis here.

What @plapointe6 is talking about now sounds like a slightly different situation where should_poll is called, but it does not return True.

@tadly Yes, I would be okay with merging it like that.

plapointe6 commented 5 years ago

@tadly, I tryed exactly what you propose yesterday, but it seem that it wont enter into the except even if the websocket end abruptly. Maybe I made a mistake, I think I have not removed the pass, I only added the call to self.async_on_quit(None, None) before the pass. It's one of the first time I mess with Python (I am more a C++ guy).

I will try it again tonight, this time removing the pass. I think it can also be a good idea to add the finally clause, like in the last exemple from https://github.com/armills/jsonrpc-websocket

I am not sure that calling self.async_schedule_update_ha_state(True) and letting the pass as is will be enouph. I dont think that doing that will lead the component to know that he is not online anymore ?

tadly commented 5 years ago

The pass in python is a null operator (not sure what it's officially called) and does literally nothing. Because python doesn't use brackets (and relies on indentation instead) you have to have something in the except clause, that's why python has pass (alongside other reasons) :)

In other words, no more testing needed if it didn't work the way you tested it.

For an easier test you could add a debug output before this line and test again. If you see the debug, great. If you don't -> bigger issue, more digging

plapointe6 commented 5 years ago

Thanks, I will test that and keep you updated.

plapointe6 commented 5 years ago

I modified the ws_loop_wrapper function like this to print debug messages :

        async def ws_loop_wrapper():
            """Catch exceptions from the websocket loop task."""
            try:
                await ws_loop_future
            except jsonrpc_base.TransportError:
                # Kodi abruptly ends ws connection when exiting. 
                # Trigger the async_on_quit event to cleanly close the ws connection
                _LOGGER.debug("ws_loop_wrapper - jsonrpc_base.TransportError")
                self.async_on_quit(None, None)
            finally:
                _LOGGER.debug("ws_loop_wrapper - finally")

            _LOGGER.debug("ws_loop_wrapper - end")

Here is the testing I made and the revelent logs during my testing :

2019-09-18 18:39:50 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-18 18:39:50 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update - shedule async_ws_connect
2019-09-18 18:39:50 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_ws_connect
2019-09-18 18:39:50 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True
...
2019-09-18 18:43:41 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-18 18:43:52 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-18 18:44:03 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
HERE I RESTARTED KODI NORMALLY
2019-09-18 18:44:12 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_on_quit
2019-09-18 18:44:12 DEBUG (MainThread) [custom_components.kodi_custom.media_player] ws_loop_wrapper - finally
2019-09-18 18:44:12 DEBUG (MainThread) [custom_components.kodi_custom.media_player] ws_loop_wrapper - end
2019-09-18 18:44:14 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True
2019-09-18 18:44:14 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-18 18:44:25 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True
2019-09-18 18:44:28 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-18 18:44:36 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True
2019-09-18 18:44:36 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-18 18:44:47 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True
2019-09-18 18:44:47 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-18 18:44:47 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_ws_connect
2019-09-18 18:44:58 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
HERE KODI WAS ONLINE AND HA RECONNECTED TO IT
2019-09-18 18:45:09 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
...
2019-09-18 18:47:43 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-18 18:47:54 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
HERE I ABRUBTLY SHUTTED DOWN KODI BY DISCONNECTING THE POWER
2019-09-18 18:48:05 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-18 18:48:16 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-18 18:48:27 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-18 18:48:38 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
HERE, KODI WAS BACK ONLINE BUT THE LINK WITH HA WAS BROKEN, I PRESSED THE VOILUME DOWN COMMAND TROUGH HA
2019-09-18 18:48:40 DEBUG (MainThread) [custom_components.kodi_custom.media_player] ws_loop_wrapper - finally
2019-09-18 18:48:40 DEBUG (MainThread) [custom_components.kodi_custom.media_player] ws_loop_wrapper - end
2019-09-18 18:48:45 ERROR (MainThread) [custom_components.kodi_custom.media_player] Error calling async_volume_down on entity media_player.kodi_television: TransportError("Error calling method 'Input.ExecuteAction': Transport Error", TimeoutError())
2019-09-18 18:48:45 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is True
2019-09-18 18:48:45 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update
2019-09-18 18:48:45 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_ws_connect
HERE KODI AND HA WAS BACK IN SYNC
2019-09-18 18:48:49 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-18 18:49:00 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False
2019-09-18 18:49:11 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Should Poll is False

Conclusion : The except is not triggered if an "unclean" WebSocket connection lost occur.

tadly commented 5 years ago

Thanks for the extensive testing.

Doing some more code reading and googeling I think I finally found something.

The library used by the kodi component uses aiohttp which had an issue for abruptly closing connection.

The solution outlined in the issue is using a heartbeat (which by jsonrpc spec is a simple ping/pong).

To test if this actually is working, do the following: Change https://github.com/home-assistant/home-assistant/blob/8a39924b37486bf806467eafefaa654c33ada45b/homeassistant/components/kodi/media_player.py#L280 to (reverting all previous changes):

kwargs = {"timeout": timeout, "heartbeat": 3, "session": async_get_clientsession(hass)}

This should at least clear up the failed/broken connection. Not sure if we need more than this for a finalized fix but we shall see :) (please add debug logs again)

@amelchio what would be a acceptable heartbeat interval? The 3 sec. are just for testing purposes. Is scan_interval exposed and should we use that doing it all in the background? Or is a custom option specific to Kodi more desirable allowing the user to configure the interval, possibly turning it of by setting it to <=0?

amelchio commented 5 years ago

@tadly I would use a minute or two but anything above 10 seconds is probably acceptable. Too short keep-alive intervals have been known to cause undesired flip-flops in other situations though I am not sure how websockets work in that regard.

plapointe6 commented 5 years ago

I get an error after adding the heartbeat argument to kwargs in media_player.py.

kwargs = {"timeout": timeout, "heartbeat": 3, "session": async_get_clientsession(hass)}

Ha log

2019-09-19 19:03:17 ERROR (MainThread) [homeassistant.components.media_player] kodi_custom: Error on device update!
Traceback (most recent call last):
  File "/usr/src/homeassistant/homeassistant/helpers/entity_platform.py", line 292, in _async_add_entity
    await entity.async_device_update(warning=False)
  File "/usr/src/homeassistant/homeassistant/helpers/entity.py", line 439, in async_device_update
    await self.async_update()
  File "/config/custom_components/kodi_custom/media_player.py", line 459, in async_update
    self._players = await self._get_players()
  File "/config/custom_components/kodi_custom/media_player.py", line 400, in _get_players
    return await self.server.Player.GetActivePlayers()
  File "/usr/local/lib/python3.7/site-packages/jsonrpc_async/jsonrpc.py", line 29, in send_message
    response = yield from self._request(data=message.serialize())
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 869, in post
    **kwargs))
TypeError: _request() got an unexpected keyword argument 'heartbeat'
tadly commented 5 years ago

Damn it, nothing's ever easy. I'll look at it myself sometime on monday. Thanks for testing.

tadly commented 5 years ago

Just a quick update.

I did have a look on monday (as promised) but realized that the component just wouldn't use websockets at all for me but instead purely relied on http polling. This was the case even though kodis socket is accessible.

I haven't had more time to do additional testing ever since

Elkropac commented 5 years ago

Hello, any news on this? I have automation, which turns off my power outlet with kodi few minutes after it changes it's state to off. Because of this problem, it was switched off several times while kodi still running. (it randomly connects to kodi and then looses connection).

It started after upgrade of HA from 0.95.4 to 0.99.3

Thanks

plapointe6 commented 5 years ago

Not sure your issue is the same as this one. The main issue here is the fact that Home Assistant won't detect a connetion lost from Kodi when the websocket abrubtly ends his connection.

The problem you are describing seem to refer as a random connection lost/reconnection between Home assistant and Kodi.

I ran out of time in the last weeks, I will continue to work on a fix for this issue soon. Also, I am awaiting the 0.100.0 release to update HA, I will upgrade as soon as it will be released to do more testings. I am actually on 0.98.5 and you are with 0.99.3

plapointe6 commented 5 years ago

Update : Tested with HA 0.100.2, the problem still here.

I managed to activate the jsonrpc-websocket heartbeat system, but it does't not work as expected. I suspect Kodi from not responding to the heartbeat "ping" request. When activated with a delay of 30 seconds, the websocket connect/disconnect each 30 seconds.

tadly commented 5 years ago

I checked with other members over at kodi and yeah... Kodis websocket implementation is from a time where the whole websocket spec was still just a draft, one that didn't define a heartbeat system.

This makes things more difficult as the only other option we have is to implement the hearbeat ourself using kodis JSONRPC.Ping.

@amelchio Is there a function repeatedly called where we could slip in a custom heartbeat? AFAIK should_poll will be but adding it there is kinda hacky...

A custom heartbeat system could look something like this? It's untested and just a quick draft. I'm sure I forgot relevant things...

# Existing function
async def async_ws_connect(self):
  """Connect to Kodi via websocket protocol."""
  ....
  self.hass.loop.create_task(ws_loop_wrapper())

  # This is new
  self.hass.loop.create_task(heartbeat())

# New function
async def heartbeat(self):
  if not self._enable_websocket:
    return

  # Make sure connection has been established
  await asyncio.sleep(1)

  # Don't use self._ws_server.connected because in case of dropped
  # connections it might not be reliable
  while True:
    res = await self.server.JSONRPC.Ping()
    if res != "OK":
        # Not sure if this is "enough"...
        # Kodi might have restarted so we should try and re-establishing the connection
        self.async_on_quit(None, None)

        # I suspect self._ws_server.close() not to work either given the state of the socket in this case
        # Setting it to None is not ideal and we should investigate how to properly clean
        # up the the socket
        self._ws_server = None
        break

    # Wait before sending the next heartbeat
    await asyncio.sleep(20)

Anyways... that's all I have time for. Sorry

amelchio commented 5 years ago

@tadly I think it would be okay to branch in async_update(): do the ping if you have a websocket, otherwise poll data.

tadly commented 5 years ago

If my brainstorm is correct, async_update won't help much because it's only called if should_poll is True.

In our special case, should_poll will be False because the socket is stuck in a broken connected state. => no poll -> no async_update

Hope I'm wrong but as a reference see the log in this comment by @plapointe6

amelchio commented 5 years ago

Make should_poll always be True.

tadly commented 5 years ago

And there we go full circle basically :)

@plapointe6 need you to verify something for me if you have the time. Recreate the following. I hope it's clear what I want (:

    async def async_update(self):
        """Retrieve latest state."""
        _LOGGER.debug('Is connected: {}'.format(self._ws_server.connected))
        return # This return is important! We don't want to execute anything on the failed socket
        ...

    @property
    def should_poll(self):
        """Return True if entity has to be polled for state."""
        return True

Then:

  1. Start kodi
  2. Start HA
    1. Make sure kodi is connected
    2. you have the log once or twice (happens every 10 seconds)
  3. Nuke kodi as hard as you can (unplug power basically)
  4. Let me know what the log says

I only have battery powerd stuff and even if I kill -9 <kodi-pid>, self._ws_server.connected returns False meaning the socket knows it's not connected anymore. Just gotta make sure this is also the case in your extrem scenario.

amelchio commented 5 years ago

@tadly The difference to #17474 is that my proposal does not poll when the websocket is connected (but now it pings).

plapointe6 commented 5 years ago

@tadly, self._ws_server.connected will keep returning true after an "unclean" Kodi websocket disconnection. It's why we needs to do a ping in async_update.

Why I am saying that ? Because of the should_poll code return not (self._enable_websocket and self._ws_server.connected) and my previous testings.

tadly commented 5 years ago

I see your point and I could have figured that out myself ':D I just wanted to make sure because kill -9 should give me the same result but it doesn't. This means I straight up can not reproduce this.

This is what I've got so far. Can you give this a try?

async def async_update(self):
    """Retrieve latest state."""
    _LOGGER.debug('Is connected: {}'.format(self._ws_server.connected))

    if self._enable_websocket and self._ws_server.connected:
        _LOGGER.debug('Sending heartbeat.')
        res = await self._ws_server.JSONRPC.Ping()
        if res != "pong":
            _LOGGER.warning('Socket connection failed! Dropping it')
            self.hass.async_create_task(self._ws_server.close())

        return

    self._players = await self._get_players()
    ...

@property
def should_poll(self):
    """Return True if entity has to be polled for state."""
    return True
plapointe6 commented 5 years ago

Ok, I will test that tonight when I will return home :)

plapointe6 commented 5 years ago

I made a quick test. I needed to change the code a little to make it work, here is the new async_update in media_player.py :

    async def async_update(self):
        """Retrieve latest state."""

        import jsonrpc_base

        if self._enable_websocket: 
            if self._ws_server.connected:
                _LOGGER.debug('Sending heartbeat.')
                ws_error = False

                try:
                    res = await self._ws_server.JSONRPC.Ping()
                    if res != "pong":
                        ws_error = True
                except jsonrpc_base.jsonrpc.TransportError:
                    ws_error = True

                if ws_error:
                    _LOGGER.warning('Socket connection failed! Dropping it')
                    self.hass.async_create_task(self._ws_server.close())
            else:
                self.hass.async_create_task(self.async_ws_connect())
                _LOGGER.debug("async_update - shedule async_ws_connect")

            return

        self._players = await self._get_players()

        if self._players is None:
            self._properties = {}
            self._item = {}
            self._app_properties = {}
            _LOGGER.debug("async_update - self._players is None")
            return

        self._app_properties = await self.server.Application.GetProperties(
            ["volume", "muted"]
        )

        if self._players:
            player_id = self._players[0]["playerid"]

            assert isinstance(player_id, int)

            self._properties = await self.server.Player.GetProperties(
                player_id, ["time", "totaltime", "speed", "live"]
            )

            position = self._properties["time"]
            if self._media_position != position:
                self._media_position_updated_at = dt_util.utcnow()
                self._media_position = position

            self._item = (
                await self.server.Player.GetItem(
                    player_id,
                    [
                        "title",
                        "file",
                        "uniqueid",
                        "thumbnail",
                        "artist",
                        "albumartist",
                        "showtitle",
                        "album",
                        "season",
                        "episode",
                    ],
                )
            )["item"]
        else:
            self._properties = {}
            self._item = {}
            self._app_properties = {}
            self._media_position = None
            self._media_position_updated_at = None

Here is the logs when I unplugged the power

2019-10-17 18:21:53 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Sending heartbeat.
2019-10-17 18:22:04 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Sending heartbeat.
2019-10-17 18:22:09 WARNING (MainThread) [custom_components.kodi_custom.media_player] Socket connection failed! Dropping it
2019-10-17 18:22:09 DEBUG (MainThread) [custom_components.kodi_custom.media_player] ws_loop_wrapper - end
2019-10-17 18:22:15 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update - shedule async_ws_connect
2019-10-17 18:22:15 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_ws_connect
2019-10-17 18:22:15 INFO (MainThread) [custom_components.kodi_custom.media_player] Unable to connect to Kodi via websocket
2019-10-17 18:22:15 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Unable to connect to Kodi via websocket
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 936, in _wrap_create_connection
    return await self._loop.create_connection(*args, **kwargs)  # type: ignore  # noqa
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 954, in create_connection
    raise exceptions[0]
  File "/usr/local/lib/python3.7/asyncio/base_events.py", line 941, in create_connection
    await self.sock_connect(sock, address)
  File "/usr/local/lib/python3.7/asyncio/selector_events.py", line 464, in sock_connect
    return await fut
  File "/usr/local/lib/python3.7/asyncio/selector_events.py", line 494, in _sock_connect_cb
    raise OSError(err, f'Connect call failed {address}')
ConnectionRefusedError: [Errno 111] Connect call failed ('192.168.1.102', 9090)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/jsonrpc_websocket/jsonrpc.py", line 56, in ws_connect
    self._url, **self._connect_kwargs)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 728, in _ws_connect
    proxy_headers=proxy_headers)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/client.py", line 483, in _request
    timeout=real_timeout
  File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 523, in connect
    proto = await self._create_connection(req, traces, timeout)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 859, in _create_connection
    req, traces, timeout)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 1004, in _create_direct_connection
    raise last_exc
  File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 986, in _create_direct_connection
    req=req, client_error=client_error)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/connector.py", line 943, in _wrap_create_connection
    raise client_error(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host 192.168.1.102:9090 ssl:None [Connect call failed ('192.168.1.102', 9090)]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/config/custom_components/kodi_custom/media_player.py", line 435, in async_ws_connect
    ws_loop_future = await self._ws_server.ws_connect()
  File "/usr/local/lib/python3.7/site-packages/jsonrpc_websocket/jsonrpc.py", line 58, in ws_connect
    raise TransportError('Error connecting to server', None, exc)
jsonrpc_base.jsonrpc.TransportError: ('Error connecting to server', ClientConnectorError(111, "Connect call failed ('192.168.1.102', 9090)"))
2019-10-17 18:22:26 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_update - shedule async_ws_connect
2019-10-17 18:22:26 DEBUG (MainThread) [custom_components.kodi_custom.media_player] async_ws_connect
2019-10-17 18:22:37 DEBUG (MainThread) [custom_components.kodi_custom.media_player] Sending heartbeat.

An exception occured when HA attempted a connection to Kodi and the link was still broken between Kodi and HA ... More testing/debug needs to be done, but we are close to a solution.

Elkropac commented 5 years ago

After upgrading HA to 0.100.2, it started to work without any other change. I have set up nodered flow, to send me notifications on kodi state change and i receive notifications only when changing playback state, i does not loose link now

plapointe6 commented 5 years ago

Upgrading to 0.100.2 did not resolve the issue discussed here, so I guess that your issue was not the same.

stale[bot] commented 4 years ago

There hasn't been any activity on this issue recently. Due to the high number of incoming GitHub notifications, we have to clean some of the old issues, as many of them have already been resolved with the latest updates. Please make sure to update to the latest Home Assistant version and check if that solves the issue. Let us know if that works for you by adding a comment 👍 This issue now has been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

Elkropac commented 4 years ago

Hi, after upgrading to 0.104.3, problem seems to be back for me

tadly commented 4 years ago

It is not "back", it has never been fixed in the first place