12 hours sounds a bit tight to me -- you might genuinely want to test something that runs that long. But it does have the advantage that you won't wind up scheduling multiple runs that step on each other. Up to you.
Possibly relevant: @maximecb has talked about moving to a more expensive ARM instance than the ancient Gravitons we currently test on. Right now our run length is very much determined by those Graviton instances. x86 finishes much faster. So if that changes, 12 hours could become luxuriously long compared to the run length.
I think a timeout is a good idea.
There are also other tricks we can play. We could have a timeout for each benchmark run in the harness itself: assuming that's not already the case, allow up to a maximum of one minute for warmup and another minute for benchmarking (or twice that). It would increase the standard deviation a bit, but it would ensure that all the benchmark runs terminate relatively quickly, even on older machines.
It's more important to ensure that we get fresh results quickly than to have the most accurate measurement possible.
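To make the idea concrete, here's a minimal sketch of per-phase time budgets in a harness, assuming a Python-style runner that shells out to each benchmark. The command, flags, and one-minute budgets are illustrative, not the actual harness API:

```python
import subprocess

# Illustrative budgets in seconds; "or twice that much" would just double these.
WARMUP_BUDGET = 60
BENCH_BUDGET = 60

def run_phase(cmd, budget):
    """Run one phase of a benchmark, killing it if it overruns its budget."""
    try:
        subprocess.run(cmd, timeout=budget, check=True)
        return True
    except subprocess.TimeoutExpired:
        # Record the overrun and move on instead of hanging the whole run.
        print(f"timed out after {budget}s: {' '.join(cmd)}")
        return False

# Hypothetical benchmark invocation: warmup and measurement capped separately.
if run_phase(["ruby", "benchmarks/example.rb", "--warmup"], WARMUP_BUDGET):
    run_phase(["ruby", "benchmarks/example.rb", "--bench"], BENCH_BUDGET)
```

Capping the two phases separately matches the tradeoff above: a slow machine loses some measurement accuracy (higher variance), but every benchmark still finishes within a bounded time.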
I proposed a simple PR that puts a 12-hour timeout on the job, since the average run seems to be about 9 hours. If we change hardware, we can easily adjust.
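For reference, assuming the job runs on GitHub Actions on self-hosted runners (GitHub-hosted runners already cap jobs at 6 hours), this is a one-line workflow change via the built-in `timeout-minutes` key; the job name here is hypothetical:

```yaml
jobs:
  benchmark-run:           # hypothetical job name
    runs-on: self-hosted
    timeout-minutes: 720   # kill the job after 12 hours
```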
It's not clear where the last run got stuck, so it's hard to say whether this would have fixed it. It seemed to hang between steps, so while putting timeouts into the benchmark scripts themselves may be something we want to do, it probably wouldn't have helped with the problem we saw last time.
A build got stuck for multiple days; we should add a timeout to the configurable run job (the timeout itself doesn't necessarily need to be configurable). Maybe 24h? Or 12h?