Problem:
Certain conditions, such as when large numbers of SeAT notification jobs fire in a short period of time utilizing Discord webhooks, can trigger Discord's per-route rate limiting, causing POSTs to fail with a 429 Too Many Requests which can lead to a MaxAttemptsExceededException SeAT exception.
This causes scenarios where large numbers of notification jobs may end up in a failed state and notifications will never be sent which could be risky depending on the severity of the notification (e.g. structure reinforcement, members leaving corps, etc). Current rate limits in SeAT appear to be set to 45/minute to avoid the documented global rate limit of 50/minute, but Discord also has an additional per-route rate limit value that is not specifically documented but is mentioned. This likely means that if a certain number of notifications that produce a Discord message configured to use the same webhook fire at or near the same time, jobs can fail. When this occurs enough times to cause a job failure, no message is sent to the Discord channel, the job retries a few times until it fails, and effectively the notification is lost.
Expected: Rate limits are handled dynamically via header as per the Discord documentation to avoid rate limit conditions or adjusted further to avoid potential conditions.
In the interim, adjusting the rate limiting settings to ensure a low-ish per second maximum, some online forums say anywhere from 3-5 per second max, in addition to the 45/minute may be enough to mitigate the issue.
Any limits hardcoded in SeAT would not apply to any other software running on the same system, due to global rate limiting occurring on a per-IP basis. While this is not the case in my instance as no other software on this machine is using Discord webhooks, it may provide clarity whether to determine if rate limit avoidance via header is best rather than adjusting the limits as-is.
Note: This webhook is only being used for SeAT - it is not re-used for any other process.
First image shows API return hitting the rate limit and indicates that it is not the global limit - hence why I believe this is related to a per-route limit:
This image shows behavior once the job fails enough times, rendering the job permanently failed (no notification):
This image shows a number of failures around the same timeframe, two are pictured above. (Note: Items with "3" attempts are 429 Too Many Requests failures, items with "4" attempts are MaxAttemptsExceededException.
Version Info: SeAT v5, all modules up-to-date as of 2024-07-31.
Problem: Certain conditions, such as when large numbers of SeAT notification jobs fire in a short period of time utilizing Discord webhooks, can trigger Discord's per-route rate limiting, causing POSTs to fail with a
429 Too Many Requests
which can lead to aMaxAttemptsExceededException
SeAT exception.This causes scenarios where large numbers of notification jobs may end up in a failed state and notifications will never be sent which could be risky depending on the severity of the notification (e.g. structure reinforcement, members leaving corps, etc). Current rate limits in SeAT appear to be set to 45/minute to avoid the documented global rate limit of 50/minute, but Discord also has an additional per-route rate limit value that is not specifically documented but is mentioned. This likely means that if a certain number of notifications that produce a Discord message configured to use the same webhook fire at or near the same time, jobs can fail. When this occurs enough times to cause a job failure, no message is sent to the Discord channel, the job retries a few times until it fails, and effectively the notification is lost.
Discord Rate Limiting Documentation: https://discord.com/developers/docs/topics/rate-limits
Expected: Rate limits are handled dynamically via header as per the Discord documentation to avoid rate limit conditions or adjusted further to avoid potential conditions.
Code sections that appear to be configured to handle Discord rate limiting: https://github.com/eveseat/notifications/blob/5.0.x/src/Notifications/AbstractDiscordNotification.php#L32 https://github.com/eveseat/notifications/blob/5.0.x/src/NotificationsServiceProvider.php#L226
Logs / Screenshots / Proof:
Note: This webhook is only being used for SeAT - it is not re-used for any other process.
First image shows API return hitting the rate limit and indicates that it is not the global limit - hence why I believe this is related to a per-route limit:
This image shows behavior once the job fails enough times, rendering the job permanently failed (no notification):
This image shows a number of failures around the same timeframe, two are pictured above. (Note: Items with "3" attempts are
429 Too Many Requests
failures, items with "4" attempts areMaxAttemptsExceededException
.Version Info: SeAT v5, all modules up-to-date as of 2024-07-31.