zKillboard / RedisQ

A simple queue service for "pushing" killmails from zKillboard.
https://redisq.zkillboard.com/listen.php

Many timeouts after API rewrite #20

Closed ErikKalkoken closed 11 months ago

ErikKalkoken commented 11 months ago

AA-Killtracker has been updated and now also uses the QueueID in all requests.

However, we are seeing a lot of read timeouts with the new implementation of the RedisQ API. Some requests go through, but many run into the 30-second read timeout. Some of my users report that they get no successful requests at all, just timeouts.

Example:

Traceback (most recent call last):
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/celery/app/trace.py", line 477, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/celery/app/trace.py", line 760, in __protected_call__
    return self.run(*args, **kwargs)
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/killtracker/tasks.py", line 53, in run_killtracker
    killmail = Killmail.create_from_zkb_redisq()
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/killtracker/core/killmails.py", line 411, in create_from_zkb_redisq
    response = requests.get(
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/requests/api.py", line 73, in get
    return request("get", url, params=params, **kwargs)
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/requests/api.py", line 59, in request
    return session.request(method=method, url=url, **kwargs)
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
    r = adapter.send(request, **kwargs)
  File "/home/allianceserver/venv/auth/lib/python3.10/site-packages/requests/adapters.py", line 532, in send
    raise ReadTimeout(e, request=request)
requests.exceptions.ReadTimeout: HTTPSConnectionPool(host='redisq.zkillboard.com', port=443): Read timed out. (read timeout=30)
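
For context, here is a minimal sketch of how a client could poll RedisQ with a read timeout and a few retries, so that a single timeout does not fail the whole task. This is not the actual AA-Killtracker code. The `queueID` query parameter and the `package` key in the response are assumptions about the RedisQ API, and `fetch_killmail`, `http_get_json`, and the backoff values are hypothetical names and defaults chosen for illustration. It uses only the standard library instead of `requests` to stay self-contained.

```python
import json
import socket
import time
import urllib.error
import urllib.parse
import urllib.request

REDISQ_URL = "https://redisq.zkillboard.com/listen.php"


def http_get_json(url, timeout):
    """Fetch a URL and decode its JSON body (standard library only)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def fetch_killmail(queue_id, timeout=30.0, max_attempts=3, backoff=2.0,
                   get_json=http_get_json, sleep=time.sleep):
    """Return the next killmail package, or None if the queue had nothing.

    Retries on timeouts and connection errors, waiting a little longer
    after each failed attempt. The last failure is re-raised.
    """
    # Assumption: the queue is identified by a "queueID" query parameter.
    url = REDISQ_URL + "?" + urllib.parse.urlencode({"queueID": queue_id})
    for attempt in range(1, max_attempts + 1):
        try:
            data = get_json(url, timeout)
        except urllib.error.HTTPError:
            # The server answered with an error status; do not retry blindly.
            raise
        except (TimeoutError, socket.timeout, urllib.error.URLError):
            if attempt == max_attempts:
                raise
            # Linear backoff: 1x, 2x, 3x ... the base delay.
            sleep(backoff * attempt)
            continue
        # Assumption: the killmail is wrapped in a "package" key,
        # which is null when no killmail arrived in time.
        return data.get("package")
```

A call would then look like `killmail = fetch_killmail("MyQueueID")`, where `"MyQueueID"` is a placeholder. Retries like this only soften the symptom on the client side and would not help if every request times out.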

ErikKalkoken commented 11 months ago

Here is a current chart of these timeouts over the last 24 hours.

Screenshot from 2023-11-24 18-25-07

cvweiss commented 11 months ago

Earlier I identified an OS-level issue with packets being dropped and have addressed it. Hopefully this stops the timeouts.

cvweiss commented 11 months ago

@ErikKalkoken how are your charts looking?

ErikKalkoken commented 11 months ago

It looks good now. The timeouts stopped yesterday at around 20:00 UTC. Thanks a lot for your quick help!

Screenshot from 2023-11-25 16-47-26

cvweiss commented 11 months ago

Excellent, I'll mark the issue as closed. Those new lines are likely from me bouncing the server while looking at another issue.