ERROR:asyncio:Fatal error: protocol.data_received() call failed

trentasis commented 2 years ago

Describe the bug: We're getting fatal errors that end in timeouts, which leave too many processes running on our Nagios server and make it freeze.

To Reproduce: The problem occurs on some machines in one specific domain, and only in that domain; we have two other domains that give no errors.

Steps to reproduce the behavior: We run the wmicserver with debug enabled, not as a service, and we get the following errors, repeated in a loop:

ERROR:asyncio:Fatal error: protocol.data_received() call failed.
protocol: <aiowmi.protocol.Protocol object at 0x7f2360bdad00>
transport: <_SelectorSocketTransport fd=15 read=polling write=<idle, bufsize=0>>
Traceback (most recent call last):
  File "/usr/lib/python3.9/asyncio/selector_events.py", line 870, in _read_ready__data_received
    self._protocol.data_received(data)
  File "/opt/aiowmi/aiowmi/protocol.py", line 88, in data_received
    req, data, = self._requests[self._buf.call_id], self._buf.data
KeyError: 7
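For readers following the traceback: the final line indexes a dictionary of pending requests by call_id, and the KeyError means no entry existed for call_id 7 when the data arrived. The sketch below reproduces that pattern in isolation. It is not aiowmi's code; the class `PendingRequests` and its methods are hypothetical names used only for illustration, and the thread does not confirm why the entry was missing.

```python
# Hypothetical stand-in for a protocol that tracks pending requests by call_id.
# This is NOT aiowmi's implementation; all names here are illustrative.

class PendingRequests:
    def __init__(self):
        self._requests = {}  # maps call_id -> request object

    def register(self, call_id, request):
        self._requests[call_id] = request

    def strict_lookup(self, call_id):
        # Same shape as the failing line in the traceback: a plain dict
        # index raises KeyError when nothing is registered for call_id.
        return self._requests[call_id]

    def tolerant_lookup(self, call_id):
        # Returns None for an unknown call_id, so a caller could choose
        # to log and drop the data instead of crashing.
        return self._requests.get(call_id)


pending = PendingRequests()
pending.register(6, "request-6")

try:
    pending.strict_lookup(7)
except KeyError as exc:
    error = f"KeyError: {exc}"

late = pending.tolerant_lookup(7)
```

Here `error` holds the same message as the last line of the traceback, while `tolerant_lookup` returns `None` for the unknown call_id. Whether silently dropping such data is appropriate for aiowmi is a design question for the maintainers.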

trentasis commented 2 years ago

We still get the same error, but we managed to fix the Nagios freezing problem.

We simply increased the number of threads and workers in the wmicserver service. We noticed that, depending on the load, the more workers and threads you have, the better it works.

This is our current working configuration for the service:

ExecStart=nice gunicorn --bind X.X.X.X:2313 --pythonpath '/opt/aiowmi,/opt/aiowmi/contrib/wmic_server' --workers 20 --threads 20 wmic_server:app
ExecStop=/usr/bin/pkill -f "wmic_server:app"
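As a rough sizing aid for the settings above: assuming gunicorn's threaded worker model, where each worker process serves up to `--threads` requests at once, the upper bound on simultaneous requests is workers times threads. The helper `max_concurrent_requests` below is a hypothetical name, and the figure is only a ceiling, not a measured capacity.

```python
def max_concurrent_requests(workers: int, threads: int) -> int:
    """Upper bound on requests handled at once, assuming each worker
    process runs `threads` request-handling threads."""
    if workers < 1 or threads < 1:
        raise ValueError("workers and threads must both be at least 1")
    return workers * threads


# Values taken from the ExecStart line above.
ceiling = max_concurrent_requests(workers=20, threads=20)
```

With 20 workers and 20 threads the ceiling is 400 simultaneous checks; the number actually needed depends on how many hosts Nagios polls and how long each WMI query takes.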

riklempens commented 2 years ago

Thx for letting us know! We will update the documentation to address the potential Nagios freezing issues. If I may ask, how many Windows hosts are you monitoring with your current configuration?

cesbit / aiowmi

ERROR:asyncio:Fatal error: protocol.data_received() call failed #23