Open eu9ene opened 6 months ago
Landing #561 and setting up dashboards for CPU machines can help with that.
I don't think there's anything we can do to make this better in this repo or in taskgraph. This is a worker issue, which has been filed as https://github.com/taskcluster/taskcluster/issues/6894
Sometimes a task runs out of memory (OOM), and it's hard to tell from the logs that this is what happened: the failure looks like a preemption of a spot instance. We should be able to easily identify that a task was terminated because the machine ran out of memory.
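
As a rough illustration of the kind of signal that could be surfaced in the logs, here is a minimal sketch that classifies how a child process ended. It assumes a Linux worker where the kernel OOM killer ends the process with SIGKILL. The names `describe_exit` and `run_and_report` are hypothetical and are not part of this repo, taskgraph, or Taskcluster. A SIGKILL is only a hint, not proof of OOM, so the kernel log would still need to be checked.

```python
import signal
import subprocess
import sys


def describe_exit(returncode):
    """Map a process return code to a coarse, human-readable cause."""
    # subprocess reports death-by-signal as a negative number,
    # while shells report it as 128 + signal number.
    if returncode < 0:
        signum = -returncode
    elif returncode > 128:
        signum = returncode - 128
    else:
        signum = 0

    if signum == signal.SIGKILL:
        # Assumption: on Linux the OOM killer uses SIGKILL, but so can other actors.
        return "killed by SIGKILL (possible OOM kill, check the kernel log)"
    if signum == signal.SIGTERM:
        return "terminated by SIGTERM (possible preemption or cancellation)"
    if returncode == 0:
        return "success"
    return f"failed with exit code {returncode}"


def run_and_report(cmd):
    """Run a command and return a description of how it ended."""
    result = subprocess.run(cmd)
    return describe_exit(result.returncode)


if __name__ == "__main__":
    # Example: wrap any command given on the command line.
    print(run_and_report(sys.argv[1:] or [sys.executable, "-c", "pass"]))
```

A wrapper like this would only help when the task process itself is killed. If the whole machine goes down, the distinction has to come from the worker, which is what the linked Taskcluster issue covers.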