rust-lang / jobserver-rs

Apache License 2.0
69 stars 40 forks source link

Handle EINTR. #20

Closed emilio closed 4 years ago

emilio commented 4 years ago

When building with sccache locally, I'm seeing a bunch of errors to acquire the jobserver token which are caused by interrupts at bad times.

I think it was a kernel update what caused this, fwiw, but not 100% sure.

Both can happen according to the man pages, and we're not handling them sensibly.

I think retrying the read / poll when the error is EINTR is a sensible thing to do.

emilio commented 4 years ago

With this patch so far I'm able to build Firefox with sccache-dist on a row. Before this, I've needed always a couple restarts per build to deal with this.

This may have something to do with the pipe changes / make bug described in here: https://lkml.org/lkml/2019/12/7/173

alexcrichton commented 4 years ago

Thanks! Instead of introducing nested loops could these continue to the outer loop?

emilio commented 4 years ago

Thanks! Instead of introducing nested loops could these continue to the outer loop?

I don't think that's equivalent right? If the read gets interrupted we want to re-trigger the read, not the poll.

emilio commented 4 years ago

(FWIW, I've been building with a patched sccache for the whole day and had zero issues)

alexcrichton commented 4 years ago

While not literally equivalent it's morally equivalent, I don't mean to contradict the correctness of this patch! Only that I think it would be cleaner to only have one loop and we continue to that loop. While it's technically more optimal to avoid a poll after the read has been interrupted I don't think it really matters in terms of perf, so I would prefer to err on the side of more readable style.

emilio commented 4 years ago

Yeah, retrying the poll instead of the read would be equivalent if there's data to read I guess, so agreed, done :)

alexcrichton commented 4 years ago

Thanks!