Closed jsermer closed 3 years ago
Manually bumping the elasticsearch requirement in manifest.json and restarting HA (ha core restart) seems to have fixed my issue:
# docker exec -it homeassistant pip list|grep elastic
elasticsearch 7.8.0
elasticsearch-async 6.2.0
I performed a couple more restarts and was able to recreate the initial problem, so bumping the requirements wasn't a real solution. :(
The elasticsearch node is on the same switch, just on another VLAN and the timings aren't slow by any means:
>>> from datetime import datetime
>>> import time
>>> from elasticsearch import Elasticsearch
>>> es = Elasticsearch("http://10.X.X.X:9200")
>>> start = time.time(); es.info(); end = time.time(); print(end - start)
{'name': 'elasticsearch', 'cluster_name': 'synology', 'cluster_uuid': 'h4bS4pFUSfePLhUCySAaOw', 'version': {'number': '7.8.0', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '757314695644ea9a1dc2fecd26d1a43856725e65', 'build_date': '2020-06-14T19:35:50.234439Z', 'build_snapshot': False, 'lucene_version': '8.5.1', 'minimum_wire_compatibility_version': '6.8.0', 'minimum_index_compatibility_version': '6.0.0-beta1'}, 'tagline': 'You Know, for Search'}
0.01810288429260254
Elastic recently announced that async support is now native to the elasticsearch library and that elasticsearch-async has been deprecated:
https://elasticsearch-py.readthedocs.io/en/master/async.html
https://github.com/elastic/elasticsearch-py-async#python-elasticsearch-async-client
The only caveat is a prerequisite of ES7.
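For reference, a minimal sketch of what a call through the native async client might look like, assuming elasticsearch-py 7.8.0 or later installed with its async extra (`pip install elasticsearch[async]`). The `fetch_info` helper name and the placeholder address are illustrative only and are not taken from the integration.

```python
import asyncio


async def fetch_info(url):
    """Return the cluster info using the client's native async support."""
    # Imported lazily so this sketch loads even without the optional
    # dependency; AsyncElasticsearch ships with elasticsearch-py >= 7.8.0.
    from elasticsearch import AsyncElasticsearch

    client = AsyncElasticsearch(hosts=[url])
    try:
        return await client.info()
    finally:
        # Release the underlying HTTP session.
        await client.close()


# Usage (placeholder address, needs a reachable node):
# print(asyncio.run(fetch_info("http://10.X.X.X:9200")))
```

Closing the client in a `finally` block avoids leaking the HTTP session if the request times out.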
Using the elasticsearch-async library, which the custom component depends on, instead of the elasticsearch library yields similar results on my Raspberry Pi:
>>> from datetime import datetime
>>> import time
>>> import asyncio
>>> from elasticsearch_async import AsyncElasticsearch
>>> client = AsyncElasticsearch(hosts=['http://10.X.X.X:9200'])
>>> async def print_info():
...     info = await client.info()
...     print(info)
...
>>> loop = asyncio.get_event_loop()
>>> start = time.time(); loop.run_until_complete(print_info()); end = time.time(); print(end - start)
{'name': 'elasticsearch', 'cluster_name': 'synology', 'cluster_uuid': 'h4bS4pFUSfePLhUCySAaOw', 'version': {'number': '7.8.0', 'build_flavor': 'default', 'build_type': 'docker', 'build_hash': '757314695644ea9a1dc2fecd26d1a43856725e65', 'build_date': '2020-06-14T19:35:50.234439Z', 'build_snapshot': False, 'lucene_version': '8.5.1', 'minimum_wire_compatibility_version': '6.8.0', 'minimum_index_compatibility_version': '6.0.0-beta1'}, 'tagline': 'You Know, for Search'}
0.022830724716186523
>>> loop.run_until_complete(client.transport.close())
>>> loop.close()
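The same kind of measurement can be wrapped in a small reusable helper. This is only a sketch: `timed` is a hypothetical name, `asyncio.sleep` stands in for a real request such as `client.info()`, and `time.perf_counter()` is swapped in for `time.time()` because it is monotonic.

```python
import asyncio
import time


async def timed(awaitable):
    """Await something and return (result, elapsed_seconds)."""
    start = time.perf_counter()  # monotonic, unaffected by clock changes
    result = await awaitable
    return result, time.perf_counter() - start


async def _demo():
    # asyncio.sleep stands in for a real request such as client.info().
    return await timed(asyncio.sleep(0.05, result="ok"))


result, elapsed = asyncio.run(_demo())
print(result, f"{elapsed:.3f}s")
```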
Things seem to be slowly improving through the 0.113.X release cycle (currently on the 0.113.3 release). Might be because they continue to fix blocking tasks during startup?
I've noticed that upon an upgrade restart, the elastic integration works seemingly all of the time, but non-upgrade restarts seem to have trouble. That particular symptom seems to be improving. Recent log entry after non-upgrade restart (still see some timeout errors logged, but the elastic integration appears to be working):
2020-08-06 09:00:08 WARNING (MainThread) [homeassistant.loader] You are using a custom integration for elastic which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you experience issues with Home Assistant.
2020-08-06 09:00:28 WARNING (MainThread) [homeassistant.setup] Setup of elastic is taking over 10 seconds.
2020-08-06 09:00:56 WARNING (MainThread) [elasticsearch] PUT http://10.X.X.X:9200/active-hass-index-v4_1/_settings?preserve_existing=true [status:N/A request:12.050s]
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/elasticsearch_async/connection.py", line 98, in perform_request
response = yield from self.session.request(method, url, data=body, headers=headers)
File "/usr/local/lib/python3.8/site-packages/aiohttp/client.py", line 504, in _request
await resp.start(conn)
File "/usr/local/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 847, in start
message, payload = await self._protocol.read() # type: ignore # noqa
File "/usr/local/lib/python3.8/site-packages/aiohttp/streams.py", line 591, in read
await self._waiter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/elasticsearch_async/connection.py", line 99, in perform_request
raw_data = yield from response.text()
File "/usr/local/lib/python3.8/site-packages/async_timeout/__init__.py", line 45, in __exit__
self._do_exit(exc_type)
File "/usr/local/lib/python3.8/site-packages/async_timeout/__init__.py", line 92, in _do_exit
raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError
2020-08-06 09:00:56 ERROR (MainThread) [custom_components.elastic] Error updating index ILM settings: ConnectionTimeout caused by - TimeoutError()
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/elasticsearch_async/connection.py", line 98, in perform_request
response = yield from self.session.request(method, url, data=body, headers=headers)
File "/usr/local/lib/python3.8/site-packages/aiohttp/client.py", line 504, in _request
await resp.start(conn)
File "/usr/local/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 847, in start
message, payload = await self._protocol.read() # type: ignore # noqa
File "/usr/local/lib/python3.8/site-packages/aiohttp/streams.py", line 591, in read
await self._waiter
asyncio.exceptions.CancelledError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/elasticsearch_async/connection.py", line 99, in perform_request
raw_data = yield from response.text()
File "/usr/local/lib/python3.8/site-packages/async_timeout/__init__.py", line 45, in __exit__
self._do_exit(exc_type)
File "/usr/local/lib/python3.8/site-packages/async_timeout/__init__.py", line 92, in _do_exit
raise asyncio.TimeoutError
asyncio.exceptions.TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/config/custom_components/elastic/es_index_manager.py", line 121, in _create_index_template
await client.indices.put_settings(
File "/usr/local/lib/python3.8/site-packages/elasticsearch_async/transport.py", line 149, in main_loop
status, headers, data = yield from connection.perform_request(
File "/usr/local/lib/python3.8/site-packages/elasticsearch_async/connection.py", line 110, in perform_request
raise ConnectionTimeout('TIMEOUT', str(e), e)
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - TimeoutError()
@jsermer Thanks for taking the time to document your findings here. Sorry for not responding to you sooner. I haven't had much time recently to triage issues, but I'm hoping to get to this in the next couple of weeks
No problem at all....having the elastic integration is a 'nice to have' and really doesn't impact my home automation in any way. Let me know how I can help and what other information you might require. Thanks for responding!
@jsermer I know you said the package upgrades didn't actually solve your problem, but I have a pre-release version of 0.3.0 published here if you're able to check it out. I'm interested to see if you get a different stack trace.
This version also adds a timeout setting, which might help your particular setup.
@legrego Thanks, I updated to the latest beta version this morning via HACS but did not add a timeout parameter to the config. So far it's working as expected (but has been working pretty consistently throughout the 0.114 release). If I begin seeing timeouts, I'll add that config parameter and continue to test. Nice work!
@jsermer that's great, thanks for the quick feedback! Full disclosure, the default timeout in the ES client was 10 seconds, but the work I did in #114 defaults the timeout to 30 seconds if you don't define one yourself, so that extra time might be helping too.
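The fallback described above can be sketched roughly as follows. This is not the integration's actual code: `resolve_timeout`, `DEFAULT_TIMEOUT_SECONDS`, and the `"timeout"` key name are assumptions made for illustration.

```python
DEFAULT_TIMEOUT_SECONDS = 30  # default mentioned in the comment above


def resolve_timeout(config: dict) -> int:
    """Return the user-supplied timeout, or the default when none is set."""
    timeout = config.get("timeout")  # hypothetical config key
    return DEFAULT_TIMEOUT_SECONDS if timeout is None else int(timeout)


print(resolve_timeout({}))
print(resolve_timeout({"timeout": 60}))
```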
@legrego very cool....it really only seemed to time out upon startup, so even bumping that from 10 -> 30 makes a lot of sense. Home Assistant is trying to start a bunch of other things in parallel and that may have been causing the original timeout. Feel free to close this bug report and I'll re-open it if the new timeout mechanism doesn't satisfy my need.
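To illustrate why a longer deadline can be enough, here is a small simulation using only the standard library. It is a sketch under stated assumptions: `slow_request` and `call_with_timeout` are hypothetical names, and the durations are scaled down so that 0.12 s plays the role of the 12 s request in the log, with 0.10 s and 0.30 s standing in for the 10 s and 30 s timeouts.

```python
import asyncio


async def slow_request(delay: float) -> str:
    """Stand-in for a request that is slow while the event loop is busy."""
    await asyncio.sleep(delay)
    return "ok"


async def call_with_timeout(delay: float, timeout: float) -> str:
    """Return the response, or 'timeout' if the deadline passes first."""
    try:
        return await asyncio.wait_for(slow_request(delay), timeout=timeout)
    except asyncio.TimeoutError:
        return "timeout"


async def main():
    # Same simulated request, two different deadlines.
    short = await call_with_timeout(delay=0.12, timeout=0.10)
    longer = await call_with_timeout(delay=0.12, timeout=0.30)
    return short, longer


short, longer = asyncio.run(main())
print(short, longer)
```

The request itself is identical in both calls; only the deadline changes, which matches the observation that the failures were limited to the busy startup window.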
Environment
Home-Assistant version: 0.112.3
Elasticsearch version: 7.8.0
Relevant configuration.yml settings:
Describe the bug
I receive connection timeout messages most of the time, but occasionally, the integration works and logs data into elasticsearch properly.
Expected behavior
No connection timeouts
I started seeing this issue when https://github.com/legrego/homeassistant-elasticsearch/issues/97 was implemented/resolved.
Additional context
I am able to curl the url endpoint from within the homeassistant container:
Log messages: