rust-lang / jobserver-rs

Apache License 2.0

Optionally increase the size of the pipe buffer #34

Closed alexcrichton closed 3 years ago

alexcrichton commented 3 years ago

This commit attempts to address rust-lang/cargo#9739 by optionally increasing the capacity of pipes created on Linux. This may be required if the pipe starts out with only one page of buffer, which apparently means that normal jobserver usage can deadlock the pipe with all writers blocked.
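For readers unfamiliar with the mechanism, here is a minimal sketch of how a pipe's capacity can be queried and raised on Linux with `fcntl`. It is not the code from this commit. The helper name `ensure_pipe_capacity` and the two-page target (assuming 4096-byte pages) are illustrative assumptions, and the libc functions are declared by hand so the example only needs `std` (the `unsafe extern` syntax needs Rust 1.82 or newer).

```rust
use std::io;
use std::os::raw::c_int;

// Linux-specific fcntl commands (values from <linux/fcntl.h>).
const F_SETPIPE_SZ: c_int = 1031;
const F_GETPIPE_SZ: c_int = 1032;

// Hand-written libc declarations so the sketch has no crate dependencies.
unsafe extern "C" {
    fn pipe(fds: *mut c_int) -> c_int;
    fn fcntl(fd: c_int, cmd: c_int, ...) -> c_int;
    fn close(fd: c_int) -> c_int;
}

/// Hypothetical helper: make sure the pipe behind `fd` can buffer at least
/// `want` bytes. Returns the resulting capacity, or the OS error if the
/// kernel refuses to grow the pipe.
fn ensure_pipe_capacity(fd: c_int, want: c_int) -> io::Result<c_int> {
    let current = unsafe { fcntl(fd, F_GETPIPE_SZ) };
    if current < 0 {
        return Err(io::Error::last_os_error());
    }
    if current >= want {
        return Ok(current); // already large enough, nothing to do
    }
    // The kernel may round the request up; the return value is the new capacity.
    let updated = unsafe { fcntl(fd, F_SETPIPE_SZ, want) };
    if updated < 0 {
        return Err(io::Error::last_os_error());
    }
    Ok(updated)
}

fn main() -> io::Result<()> {
    let mut fds = [0 as c_int; 2];
    if unsafe { pipe(fds.as_mut_ptr()) } != 0 {
        return Err(io::Error::last_os_error());
    }
    // Assumption for illustration: ask for two 4096-byte pages.
    match ensure_pipe_capacity(fds[1], 2 * 4096) {
        Ok(cap) => println!("pipe capacity: {cap} bytes"),
        Err(e) => eprintln!("could not grow pipe: {e}"),
    }
    unsafe {
        close(fds[0]);
        close(fds[1]);
    }
    Ok(())
}
```

If the kernel refuses the request, the sketch surfaces the error immediately, which is the "better error earlier" behavior rather than a deadlock later on.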

alexcrichton commented 3 years ago

@ehuss mind taking a look at this and https://github.com/rust-lang/cargo/issues/9739 and reviewing?

I think the best option is to go ahead with a fix like this. I don't think it would actually have fixed the issue in https://github.com/rust-lang/cargo/issues/9739, other than by reporting a better error earlier on instead of deadlocking at some point. I suspect that if a pipe's capacity is a single page, we'll always hit the error case when we try to increase its capacity, but I also figure it's not the end of the world to go ahead and land this anyway, since it's at least a better error and in theory protects us from obscure Linux configurations as well.

alexcrichton commented 3 years ago

I searched around a bit for other instances of this, but the closest I can find is https://stackoverflow.com/questions/47445540/gnu-make-hangs-on-pipe-write. That question (unanswered, from 2017) looks like exactly what we're encountering.

All in all, I have no idea why this isn't more prevalent with make. It may be easier to attribute "make is stuck" to "I have 200 makefiles and maybe there's a bug" than it is to explain away "cargo got stuck", though. It also seems to only be an issue when the pipe capacity is one page, which is not the default on Linux and, according to the original reporter, only happens in high-load situations when there are a lot of other pipes active in the system. I think that means the conditions for a bug like this are pretty rare, and it certainly seems like something is racing something. That, coupled with the fact that Cargo/rustc's usage of jobservers may be slightly different from make's, may explain why this hasn't been more prevalent?