The crash came at a surge in traffic, which peaked at 41 req/s to this single-core instance (significantly in excess of the 2 * n_cpu advice I've seen given in other issues). The reason for the concurrency is that I am currently scaling based on target CPU utilization rather than number of connections (and CPU utilization maxed out at < 40% during this surge). Let me know if you think this is unwise / could be the cause. If it is, I wonder if you have any advice on how I can run imagor in such a way that allows for higher CPU utilization?
As an experiment, I am running on Ampere Altra-based ARM VMs (which required me to build the imagor Docker container for the linux/arm64 platform). While things generally appear to be working as expected, I'm noting it here in case it raises any red flags.
The imagor service has been crashing periodically with the following error message:
A couple of relevant notes: