Open francistogram opened 8 months ago
My guess is that it's not related to the last error. I suspect it's actually related to the timeout, given that other jobs scheduled to run every 5 minutes, e.g.
vIehCTtgKRhy6A9C2Tr8
EWlEr0PHHPbqKIudxky2
did not have any issues.
It seems this execution at 6:55am CST ran for 943s (15m 43s).
I checked the Vercel logs for the endpoint and don't see anything on my side. The other interesting thing is that my timeout for these serverless endpoints is set to 4 minutes and the max is 5 minutes, so I'm not sure how it could have hung for 15m 43s.
Thank you for the detailed report! Investigating the issue and will get back to you asap.
Hi @francistogram. Looking closer at the logs, it seems like the timeout indeed caused the issues here.
It's important to note that this was not a timeout on your function, but a network timeout. The destination URL could not be reached at all (hard to know the reason, can be DNS, CDN or maybe something simpler like a cold start or deployment). This would also explain why it's not visible in the Vercel logs.
Either way, Crontap waits for a response for up to 1h, which means a long-running request could potentially overlap with future schedules. At the moment
From our logs it seems like the job at 12:55:00 ran until 13:10:54. Without going much into detail, there is some investigation work attached below.
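To make the overlap concrete, here is a minimal sketch (not Crontap's actual code) that lists which 5-minute ticks fall inside a run that started at 12:55:00 and finished at 13:10:54. The function name `overlapped_ticks` and the calendar date are placeholders chosen for illustration; only the two times and the 5-minute interval come from the thread.

```python
from datetime import datetime, timedelta


def overlapped_ticks(start: datetime, end: datetime, interval: timedelta) -> list[datetime]:
    """Return the scheduled start times that arrive while the run
    that began at `start` is still in flight (i.e. before `end`)."""
    ticks = []
    tick = start + interval
    while tick < end:
        ticks.append(tick)
        tick += interval
    return ticks


# The date is a placeholder; only the times are taken from the logs above.
run_start = datetime(2024, 1, 1, 12, 55, 0)
run_end = datetime(2024, 1, 1, 13, 10, 54)

ticks = overlapped_ticks(run_start, run_end, timedelta(minutes=5))
print([t.strftime("%H:%M") for t in ticks])  # ['13:00', '13:05', '13:10']
print(run_end - run_start)                   # 0:15:54
```

Under these assumptions, three later schedule ticks arrive while the 12:55 request is still waiting for a response.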
A potential solution here is to optionally allow customizing the maximum wait for each schedule. Here, a ~5min wait before the request is abandoned should have prevented the issue. This could also be set automatically based on the schedule interval, but I wonder if setting it automatically would cause other problems of its own.
Either way, allowing this to be set manually should be a sensible option, albeit advanced.
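As a rough sketch of what a per-schedule maximum wait could look like (purely illustrative, not how Crontap is implemented): the caller stops waiting after `max_wait_s` and reports the run as abandoned. The names `max_wait_for_interval` and `call_with_max_wait`, and the 10-second safety margin, are assumptions made up for this example.

```python
import concurrent.futures
import time


def max_wait_for_interval(interval_s: float, safety_margin_s: float = 10.0) -> float:
    """Hypothetical automatic rule: give up shortly before the next run is due."""
    return max(1.0, interval_s - safety_margin_s)


def call_with_max_wait(send_request, max_wait_s: float):
    """Run send_request() in a worker thread and stop waiting after max_wait_s.

    Returns ("ok", result) or ("abandoned", None). The worker thread is not
    killed here; a real client would also set connect/read timeouts on the
    underlying HTTP request so the socket itself is released.
    """
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(send_request)
    try:
        return "ok", future.result(timeout=max_wait_s)
    except concurrent.futures.TimeoutError:
        return "abandoned", None
    finally:
        # Do not block the scheduler on the stuck request.
        pool.shutdown(wait=False)


def slow_request():
    """Stand-in for a destination that never answers in time."""
    time.sleep(0.5)
    return "late response"


print(max_wait_for_interval(300))             # 290.0 for a 5-minute schedule
print(call_with_max_wait(slow_request, 0.1))  # ('abandoned', None)
```

Keeping the manual setting and the automatic rule as separate functions mirrors the point above: the manual value can ship first, and the interval-based default can be added later if it does not cause problems of its own.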
The maximum wait time sounds like a reasonable solution to me!
The destination URL could not be reached at all (hard to know the reason, can be DNS, CDN or maybe something simpler like a cold start or deployment)
I'll reach out to Vercel and see if they have any more context on this
Thanks for digging into this issue 🙏
Not sure if this is related to the last issue or not but something weird is going on this morning for schedule
8qhW7k67PONn0IQGzzeE
which is set to run every 5 minutes
6:50am CST works fine
6:55am CST works fine (failure is something on my side)
Then it skips 7am and runs at 7:04:56am so likely the job was delayed
Runs at 7:05am
Then something weird happens again
Runs at 7:10:01 which is correct
But also runs at 7:10:56 which doesn't make any sense
Any idea what's going on here @danmindru?
Not sure if related to the issue from last week https://github.com/Crontap/crontap/issues/9