home-assistant / core

Open source home automation that puts local control and privacy first.
https://www.home-assistant.io
Apache License 2.0

Complete freeze of homeassistant after activating Nabu Casa #23357

Closed scstraus closed 5 years ago

scstraus commented 5 years ago

Home Assistant release with the issue:

0.90.2

Last working Home Assistant release (if known): This is the only version I've tried Nabu Casa on

Operating environment (Hass.io/Docker/Windows/etc.):

Hass.io running on Ubuntu Server on amd64 (2009 mac mini)

Component/platform:

Nabu Casa

Description of problem:

After trying out Nabu Casa, I have been getting random freezes of Home Assistant, where it completely stops doing anything, including logging.

Problem-relevant configuration.yaml entries (fill out even if it seems unimportant):

cloud:

Traceback (if applicable):

2019-04-24 22:25:11 ERROR (MainThread) [hass_nabucasa.remote] Can't update remote details from Home Assistant cloud
2019-04-24 22:25:39 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/aiohttp/http_websocket.py", line 632, in ping
    await self._send_frame(message, WSMsgType.PING)
  File "/usr/local/lib/python3.7/site-packages/aiohttp/http_websocket.py", line 614, in _send_frame
    self.transport.write(header + message)
  File "uvloop/handles/stream.pyx", line 671, in uvloop.loop.UVStream.write
  File "uvloop/handles/handle.pyx", line 159, in uvloop.loop.UVHandle._ensure_alive
RuntimeError: unable to perform operation on <TCPTransport closed=True reading=False 0x56014e5f4ab8>; the handler is closed
2019-04-24 22:25:40 DEBUG (SyncWorker_5) [botocore.utils] Caught exception while trying to retrieve credentials: HTTPConnectionPool(host='[IP_REDACTED]', port=80): Max retries exceeded with url: /latest/meta-data/iam/security-credentials/ (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPConnection object at 0x7f43569bca90>, 'Connection to [IP_REDACTED] timed out. (connect timeout=1)'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 134, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/util/connection.py", line 88, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/util/connection.py", line 78, in create_connection
    sock.connect(sa)
socket.timeout: timed out

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 349, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 130, in _send_request
    self, method, url, body, headers, *args, **kwargs)
  File "/usr/local/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 157, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 242, in send
    return HTTPConnection.send(self, str)
  File "/usr/local/lib/python3.7/http/client.py", line 956, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 155, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 139, in _new_conn
    (self.host, self.timeout))
botocore.vendored.requests.packages.urllib3.exceptions.ConnectTimeoutError: (<botocore.awsrequest.AWSHTTPConnection object at 0x7f43569bca90>, 'Connection to [IP_REDACTED] timed out. (connect timeout=1)')

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/adapters.py", line 370, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 597, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/util/retry.py", line 271, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
botocore.vendored.requests.packages.urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='[IP_REDACTED]', port=80): Max retries exceeded with url: /latest/meta-data/iam/security-credentials/ (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPConnection object at 0x7f43569bca90>, 'Connection to [IP_REDACTED] timed out. (connect timeout=1)'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/botocore/utils.py", line 174, in _get_request
    response = requests.get(url, timeout=timeout)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/adapters.py", line 419, in send
    raise ConnectTimeout(e, request=request)
botocore.vendored.requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='[IP_REDACTED]', port=80): Max retries exceeded with url: /latest/meta-data/iam/security-credentials/ (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPConnection object at 0x7f43569bca90>, 'Connection to [IP_REDACTED] timed out. (connect timeout=1)'))

Additional information:

About 2-3 log messages later, the whole thing is completely frozen.

ghost commented 5 years ago

Hey there @home-assistant/core, mind taking a look at this issue, as it's been labeled with an integration (cloud) you are listed as a codeowner for? Thanks!

This is an automatic comment generated by codeowners-mention to help ensure issues and pull requests are seen by the right people.

awarecan commented 5 years ago

botocore.vendored.requests.exceptions.ConnectTimeout: HTTPConnectionPool(host='[IP_REDACTED]', port=80): Max retries exceeded with url: /latest/meta-data/iam/security-credentials/ (Caused by ConnectTimeoutError(<botocore.awsrequest.AWSHTTPConnection object at 0x7f43569bca90>, 'Connection to [IP_REDACTED] timed out. (connect timeout=1)'))

Please do not redact this IP address. Your instance has a problem connecting to the Amazon Web Services endpoint used by Nabu Casa.

scstraus commented 5 years ago

Sorry, I didn't want to bother cross-referencing it against all the possibly identifying addresses of mine. Here it is: 169.254.169.254. It's the same IP in all messages.

awarecan commented 5 years ago

That is not your address.

https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-metadata.html
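For context, 169.254.169.254 sits in the link-local range 169.254.0.0/16, so it cannot be a public address belonging to the reporter; the AWS page linked above documents it as the EC2 instance metadata endpoint. A minimal check using Python's standard `ipaddress` module (the variable names are only for illustration):

```python
import ipaddress

# Address taken from the log lines in this issue.
metadata_ip = ipaddress.ip_address("169.254.169.254")

# Link-local IPv4 addresses live in 169.254.0.0/16 and are never routed
# across the internet, so this cannot be anyone's public IP.
link_local_net = ipaddress.ip_network("169.254.0.0/16")

is_link_local = metadata_ip.is_link_local
in_link_local_net = metadata_ip in link_local_net
is_global = metadata_ip.is_global

print(is_link_local, in_link_local_net, is_global)
```

Outside of EC2 nothing normally answers on that address, which would be consistent with the connect timeout in the traceback.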

pvizeli commented 5 years ago

Can I have the full log with debug mode enabled? Can you post a list of all the components that you run?

You are not logged in to remote, because it can't reach the connection endpoint. So the freeze is not caused by being logged in.

Looks like a network issue. The WebSocket error shows the same. Is anything else running on that server? In which country is the instance running?

Can you post the output of: https://www.cloudping.info/

The first two lines are the interesting ones:

2019-04-24 22:25:11 ERROR (MainThread) [hass_nabucasa.remote] Can't update remote details from Home Assistant cloud
2019-04-24 22:25:39 ERROR (MainThread) [homeassistant.core] Error doing job: Task exception was never retrieved

The first one means you can't connect to AWS. The second one means the core WebSocket for the UI has an issue with your normal client on a direct connection.
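As background on the second log line: asyncio reports "Task exception was never retrieved" when a task fails and nothing ever awaits it or reads its exception. A minimal sketch of that behaviour, where `failing_ping` is a hypothetical stand-in for the WebSocket ping that hit a closed transport (it is not aiohttp or Home Assistant code):

```python
import asyncio


async def failing_ping():
    # Hypothetical stand-in for a ping sent on an already-closed transport.
    raise RuntimeError("the handler is closed")


async def main():
    # Schedule the ping as a background task, as a keep-alive loop might.
    task = asyncio.ensure_future(failing_ping())

    # Waiting on the task and reading its exception "retrieves" it, so
    # asyncio has nothing left to report when the task is garbage collected.
    await asyncio.wait([task])
    return task.exception()


error = asyncio.run(main())
print(type(error).__name__, error)
```

If the task were created and then dropped without anyone reading `task.exception()`, asyncio would log the error in the same way as the message in the log above.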

scstraus commented 5 years ago

It's worth noting that I haven't had any network/internet issues in months, and that during this crash there was no problem with the internet whatsoever. I was running with zero connection issues the last 6 months over an SSL connection without Nabu Casa and it was fine, problems started right after activating Nabu Casa. I do run other things on the server, but again there was zero issue before nabu casa.

Here's the log from now (not same as one with errors I sent you though):

My config is available here, recent enough to know what components I'm running. https://github.com/scstraus/home-assistant-config

Here's cloudping output (internet connection right now is definitely worse than when I had the issue I sent). I'm located in Prague, Czech Republic.

Region Latency
US-East (Virginia) 162 ms
US East (Ohio) 180 ms
US-West (California) 194 ms
US-West (Oregon) 456 ms
Canada (Central) 763 ms
Europe (Ireland) 73 ms
Europe (London) 87 ms
Europe (Frankfurt) 94 ms
Europe (Paris) 94 ms
Europe (Stockholm) 100 ms
Asia Pacific (Mumbai) 153 ms
Asia Pacific (Osaka-Local) 256 ms
Asia Pacific (Seoul) 285 ms
Asia Pacific (Singapore) 207 ms
Asia Pacific (Sydney) 1083 ms
Asia Pacific (Tokyo) 304 ms
South America (São Paulo) 318 ms
China (Beijing) 287 ms
China (Ningxia) 302 ms
AWS GovCloud (US-East) 170 ms
AWS GovCloud (US) 257 ms

(I'd say this is a worst case scenario because it's heavily raining and I use a microwave internet connection). I'm pretty sure that when I turn off Nabu Casa, all these issues will stop.. I will probably have to do it soon as the freezing is starting to be annoying. Let me know soon if you want anything else and I will switch back to SSL.

pvizeli commented 5 years ago

Do you use the stream component? There is an issue with shell commands that can block the instance.

Well, I can't see anything going wrong in your logs. My first idea was that the socket blocks because of a network error, but in your first log there was no Nabu Casa remote connection available, unlike in your second log. Can you confirm whether it also hangs if you are connected to the cloud but turn off the remote connection?

Can you try it with 0.92?

scstraus commented 5 years ago

No, I don't use the stream component. Actually today I started to notice periods in my speedtest sensor where my upload speeds were going down to just a few kbps. I contacted my ISP and they confirmed that they have been fighting off an attack the last few days and it's been leading to intermittent outages. I guess I just hadn't noticed it directly while using it.. It's strange that I was also getting hangs when accessing on the local network, but I suspect it was probably dropbox who had been eating a lot of CPU and memory, I suspect also due to the internet issues.. I will keep monitoring the situation and see if it continues. I'm spinning up a new synology and might move some of the stuff that's running on this machine over to there. I guess for now I will close this one, and if it starts happening again, I will try on the newest version and get back to you.

scstraus commented 5 years ago

More freezes today. I think it's worth noting a couple of observations and thoughts I've had.

pvizeli commented 5 years ago

I think there can be 2 issues:

scstraus commented 5 years ago

Okay, that makes sense. I didn't realize that threads were in such limited supply. I stopped dropbox again and not long after, the ui returned (first time I've found such a direct cause-effect way to fix it).. So I suspect they are competing for resources, memory or network.. I am working on migrating the file serving functions off to another server which will free up a lot of resources and hopefully fix the problem.
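To illustrate the remark about threads being in limited supply, here is a sketch of how a bounded worker pool stalls when every worker is stuck in a blocking call. It assumes a pool of two workers purely for illustration; the size and the job functions are hypothetical and do not reflect Home Assistant's actual executor settings.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Assumed pool size for illustration only; not Home Assistant's real setting.
POOL_SIZE = 2

release = threading.Event()


def blocking_job():
    # Stands in for a call stuck on slow network or disk I/O.
    release.wait(timeout=5)
    return "blocked job finished"


def quick_job():
    return "quick job finished"


with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
    blockers = [pool.submit(blocking_job) for _ in range(POOL_SIZE)]
    quick = pool.submit(quick_job)

    # Every worker is occupied, so the quick job sits in the queue.
    started_while_blocked = quick.running() or quick.done()

    # Once the blocked calls return, the queued job gets a worker.
    release.set()
    quick_result = quick.result(timeout=5)

print(started_while_blocked, quick_result)
```

The quick job only runs after a blocked worker is released, which is the kind of stall that could make a UI look frozen while other parts keep logging.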

scstraus commented 5 years ago

It had been working well after I had moved some stuff off the machine, but today I am able to access from the local IP address only. The nabu casa URL gives me:

This site can’t provide a secure connection

[REDACTED].ui.nabu.casa sent an invalid response.
ERR_SSL_PROTOCOL_ERROR

when connecting from a browser

And a connection error with the following error when connecting from the app:

Shared.TokenManager.TokenError error 1

Local host and port are fine. All was working yesterday and there’s been no config changes other than some Lovelace changes.

balloob commented 5 years ago

In the UI, go to Configuration -> Cloud and make sure that remote connection is enabled and available.

scstraus commented 5 years ago

Guys, I'm reopening this, as it happened again today, and I'm 100% sure there are no network issues now. Speedtest is showing 50 Mbit down and 10 Mbit up, the server is reachable, and the logs and all other components are running. It's only the UI that isn't responsive, and I'm getting the errors below:

2019-05-15 23:05:47 INFO (MainThread) [homeassistant.components.http.view] Serving /api/websocket to 10.10.10.4 (auth: False)
2019-05-15 23:05:47 DEBUG (SyncWorker_5) [botocore.credentials] Looking for credentials via: ec2-credentials-file
2019-05-15 23:05:47 DEBUG (Dummy-5) [libopenzwave] notif_callback : new notification
2019-05-15 23:05:47 DEBUG (MainThread) [homeassistant.components.sensor.airvisual] New data retrieved: {'city': 'Prague', 'state': 'Praha', 'country': 'Czech Republic', 'location': {'type': 'Point', 'coordinates': [14.411826111111111, 50.042003611111106]}, 'current': {'weather': {'ts': '2019-05-15T20:00:00.000Z', 'hu': 93, 'ic': '10n', 'pr': 1020, 'tp': 6, 'wd': 350, 'ws': 4.6}, 'pollution': {'ts': '2019-05-15T19:00:00.000Z', 'aqius': 17, 'mainus': 'o3', 'aqicn': 14, 'maincn': 'o3'}}}
2019-05-15 23:05:47 DEBUG (SyncWorker_5) [botocore.credentials] Looking for credentials via: boto-config
2019-05-15 23:05:47 DEBUG (Dummy-5) [libopenzwave] notif_callback : Notification type : 2, nodeId : 6
2019-05-15 23:05:48 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=zwave.aeotec_zw096_smart_switch_6, old_state=<state zwave.aeotec_zw096_smart_switch_6=ready; node_id=6, node_name=Aeotec ZW096 Smart Switch 6, manufacturer_name=Aeotec, product_name=ZW096 Smart Switch 6, query_stage=Complete, is_awake=True, is_ready=True, is_failed=False, is_info_received=True, max_baud_rate=40000, is_zwave_plus=True, capabilities={'beaming', 'listening', 'routing', 'zwave_plus'}, neighbors={1}, sentCnt=38, sentFailed=0, retries=2, receivedCnt=7059, receivedDups=86, receivedUnsolicited=7014, sentTS=2019-05-15 16:35:21:869 , receivedTS=2019-05-15 23:05:16:660 , lastRequestRTT=97, averageRequestRTT=68, lastResponseRTT=111, averageResponseRTT=75, friendly_name=Aeotec ZW096 Smart Switch 6 @ 2019-05-15T12:22:38.421331+02:00>, new_state=<state zwave.aeotec_zw096_smart_switch_6=ready; node_id=6, node_name=Aeotec ZW096 Smart Switch 6, manufacturer_name=Aeotec, product_name=ZW096 Smart Switch 6, query_stage=Complete, is_awake=True, is_ready=True, is_failed=False, is_info_received=True, max_baud_rate=40000, is_zwave_plus=True, capabilities={'beaming', 'listening', 'routing', 'zwave_plus'}, neighbors={1}, sentCnt=38, sentFailed=0, retries=2, receivedCnt=7061, receivedDups=86, receivedUnsolicited=7016, sentTS=2019-05-15 16:35:21:869 , receivedTS=2019-05-15 23:05:35:599 , lastRequestRTT=97, averageRequestRTT=68, lastResponseRTT=111, averageResponseRTT=75, friendly_name=Aeotec ZW096 Smart Switch 6 @ 2019-05-15T12:22:38.421331+02:00>>
2019-05-15 23:05:49 DEBUG (SyncWorker_5) [botocore.credentials] Looking for credentials via: container-role
2019-05-15 23:05:49 DEBUG (Dummy-5) [libopenzwave] addValueId : ValueID : 72057594143605248
2019-05-15 23:05:49 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=switch.aeotec_zw096_smart_switch_6_switch, old_state=<state switch.aeotec_zw096_smart_switch_6_switch=on; node_id=6, value_index=0, value_instance=1, value_id=72057594143391744, power_consumption=0.0, friendly_name=Aeotec ZW096 Smart Switch 6 Switch @ 2019-05-15T12:21:28.700381+02:00>, new_state=<state switch.aeotec_zw096_smart_switch_6_switch=on; node_id=6, value_index=0, value_instance=1, value_id=72057594143391744, power_consumption=1.142, friendly_name=Aeotec ZW096 Smart Switch 6 Switch @ 2019-05-15T12:21:28.700381+02:00>>
2019-05-15 23:05:49 DEBUG (SyncWorker_5) [botocore.credentials] Looking for credentials via: iam-role
2019-05-15 23:05:49 DEBUG (Dummy-5) [libopenzwave] addValueId : GetCommandClassId : 50, GetType : 0
2019-05-15 23:05:49 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=sensor.aeotec_zw096_smart_switch_6_energy, old_state=<state sensor.aeotec_zw096_smart_switch_6_energy=116.8; node_id=6, value_index=0, value_instance=1, value_id=72057594143604738, power_consumption=0.0, unit_of_measurement=kWh, friendly_name=Miele Washer Energy, icon=mdi:washing-machine @ 2019-05-15T23:05:07.362604+02:00>, new_state=<state sensor.aeotec_zw096_smart_switch_6_energy=116.8; node_id=6, value_index=0, value_instance=1, value_id=72057594143604738, power_consumption=1.142, unit_of_measurement=kWh, friendly_name=Miele Washer Energy, icon=mdi:washing-machine @ 2019-05-15T23:05:49.943537+02:00>>
2019-05-15 23:05:50 DEBUG (Thread-3) [sseclient] Dispatching keep-alive event, 4 bytes...
2019-05-15 23:05:50 INFO (SyncWorker_5) [botocore.vendored.requests.packages.urllib3.connectionpool] Starting new HTTP connection (1): 169.254.169.254
2019-05-15 23:05:50 DEBUG (Dummy-5) [libopenzwave] addValueId : Notification : {'notificationType': 'ValueChanged', 'homeId': 3560018997, 'nodeId': 6, 'valueId': {'homeId': 3560018997, 'nodeId': 6, 'commandClass': 'COMMAND_CLASS_METER', 'instance': 1, 'index': 32, 'id': 72057594143605248, 'genre': 'User', 'type': 'Bool', 'value': False, 'label': 'Exporting', 'units': '', 'readOnly': True}}
2019-05-15 23:05:51 DEBUG (Dummy-5) [libopenzwave] notif_callback : call callback context
2019-05-15 23:05:51 DEBUG (Thread-3) [nest.nest] <<< keep-alive event
2019-05-15 23:05:51 DEBUG (Dummy-5) [openzwave] zwcallback args=[{'notificationType': 'ValueChanged', 'homeId': 3560018997, 'nodeId': 6, 'valueId': {'homeId': 3560018997, 'nodeId': 6, 'commandClass': 'COMMAND_CLASS_METER', 'instance': 1, 'index': 32, 'id': 72057594143605248, 'genre': 'User', 'type': 'Bool', 'value': False, 'label': 'Exporting', 'units': '', 'readOnly': True}}]
2019-05-15 23:05:51 DEBUG (MainThread) [homeassistant.core] Bus:Handling <Event state_changed[L]: entity_id=sensor.aeotec_zw096_smart_switch_6_previous_reading, old_state=<state sensor.aeotec_zw096_smart_switch_6_previous_reading=115.86; node_id=6, value_index=1, value_instance=1, value_id=72057594143604754, power_consumption=0.0, unit_of_measurement=kWh, friendly_name=Aeotec ZW096 Smart Switch 6 Previous Reading @ 2019-05-15T23:05:08.414864+02:00>, new_state=<state sensor.aeotec_zw096_smart_switch_6_previous_reading=115.86; node_id=6, value_index=1, value_instance=1, value_id=72057594143604754, power_consumption=1.142, unit_of_measurement=kWh, friendly_name=Aeotec ZW096 Smart Switch 6 Previous Reading @ 2019-05-15T23:05:51.049385+02:00>>
2019-05-15 23:05:51 DEBUG (SyncWorker_5) [botocore.utils] Caught exception while trying to retrieve credentials: ('Connection aborted.', OSError(101, 'Network unreachable'))
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 349, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 130, in _send_request
    self, method, url, body, headers, *args, **kwargs)
  File "/usr/local/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 157, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 242, in send
    return HTTPConnection.send(self, str)
  File "/usr/local/lib/python3.7/http/client.py", line 956, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 155, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 134, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/util/connection.py", line 88, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/util/connection.py", line 78, in create_connection
    sock.connect(sa)
OSError: [Errno 101] Network unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/adapters.py", line 370, in send
    timeout=timeout
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 597, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/util/retry.py", line 245, in increment
    raise six.reraise(type(error), error, _stacktrace)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/packages/six.py", line 309, in reraise
    raise value.with_traceback(tb)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 544, in urlopen
    body=body, headers=headers)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 349, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/lib/python3.7/http/client.py", line 1229, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 130, in _send_request
    self, method, url, body, headers, *args, **kwargs)
  File "/usr/local/lib/python3.7/http/client.py", line 1275, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/http/client.py", line 1224, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 157, in _send_output
    self.send(msg)
  File "/usr/local/lib/python3.7/site-packages/botocore/awsrequest.py", line 242, in send
    return HTTPConnection.send(self, str)
  File "/usr/local/lib/python3.7/http/client.py", line 956, in send
    self.connect()
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 155, in connect
    conn = self._new_conn()
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/connection.py", line 134, in _new_conn
    (self.host, self.port), self.timeout, **extra_kw)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/util/connection.py", line 88, in create_connection
    raise err
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/packages/urllib3/util/connection.py", line 78, in create_connection
    sock.connect(sa)
botocore.vendored.requests.packages.urllib3.exceptions.ProtocolError: ('Connection aborted.', OSError(101, 'Network unreachable'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/botocore/utils.py", line 174, in _get_request
    response = requests.get(url, timeout=timeout)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/api.py", line 69, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/api.py", line 50, in request
    response = session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/sessions.py", line 465, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/sessions.py", line 573, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/botocore/vendored/requests/adapters.py", line 415, in send
    raise ConnectionError(err, request=request)
botocore.vendored.requests.exceptions.ConnectionError: ('Connection aborted.', OSError(101, 'Network unreachable'))

scstraus commented 5 years ago

You know what, it might be a memory issue as I've only got 76k free and python seems to be really going crazy on resources.. I started a check config and everything completely froze up right after.. Some memory leak?

top - 21:15:26 up 9 days,  1:03,  1 user,  load average: 3.49, 2.82, 2.77
Tasks: 171 total,   1 running, 122 sleeping,   0 stopped,   0 zombie
%Cpu(s):  8.4 us,  4.4 sy,  0.0 ni, 19.1 id, 67.6 wa,  0.0 hi,  0.5 si,  0.0 st
KiB Mem :  1778000 total,    91036 free,  1549512 used,   137452 buff/cache
KiB Swap:  2097148 total,  1043892 free,  1053256 used.    83504 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND              
 2771 root      20   0  277236   3084   2348 S   7.0  0.2   3:10.76 smbd                 
 5299 root      20   0 1944960 1.257g   5312 S   5.0 74.2  22:03.91 python3              
 1204 root      20   0 1335316  31080   8396 S   1.3  1.7  31:33.92 dockerd              
 2615 root      20   0  525444   6056    972 S   1.0  0.3  41:12.94 mysqld               
 6831 root      20   0   24596   7436   1096 S   0.7  0.4   0:15.20 python3              
31821 scstraus  20   0   42788   3808   3152 R   0.7  0.2   0:00.29 top                  
  304 root      20   0       0      0      0 S   0.3  0.0   4:41.64 jbd2/sda2-8          
 3023 root      20   0   10732    688    228 S   0.3  0.0   5:38.47 containerd-shim      
    1 root      20   0  225768   4660   3640 S   0.0  0.3   0:14.45 systemd
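As a rough sanity check on the memory theory, the figures in the `top` output above can be turned into percentages. This is only arithmetic on the numbers shown (sizes in KiB, with the 1.257g resident size read as GiB); it does not say whether there is a leak.

```python
# Figures copied from the `top` output above (sizes in KiB).
mem_total_kib = 1778000
swap_total_kib = 2097148
swap_used_kib = 1053256

# top shows the main python3 process at 1.257g resident (GiB).
python_res_kib = 1.257 * 1024 * 1024

python_share = 100 * python_res_kib / mem_total_kib
swap_share = 100 * swap_used_kib / swap_total_kib

print(f"python3 resident share of RAM: {python_share:.1f}%")
print(f"swap in use: {swap_share:.1f}%")
```

The main python3 process holds roughly three quarters of RAM and about half of the swap is in use, which fits the 67.6% I/O wait on the CPU line if the machine is swapping heavily.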