rust-lang / crates-io-cargo-teams

The home of the crates.io team

Consider Using a WebSocket for Publish Endpoint #57

Open · jtgeibel opened this issue 4 years ago

jtgeibel commented 4 years ago

So this may be a crazy idea, and it would probably take some effort to work into our existing middleware on the crates.io side, but it may be a path to removing a few papercuts in the publishing workflow. Those issues are:

The idea occurred to me when I was reading the Heroku docs on timeouts. As long as the server starts to send some response data within the first 30 seconds, longer-term connections can be maintained. However, the server typically doesn't know the correct HTTP status code to respond with until the crate has been accepted (all checks passed and the crate added to the background queue), so it isn't always possible to send any response data within the initial timeout.

However, if new clients request a connection upgrade, the server can respond with a 101 Switching Protocols status and switch to a WebSocket connection. The server could send a message when the crate is accepted, and another once the index update is complete. If there is an unexpected delay in running the background job, the server could send a more descriptive error message notifying the user that no further action is needed on their part, and then close the connection.
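
To make the message sequence concrete, here is a rough sketch of what the server could push to the client as JSON text frames over the upgraded connection. The event names and fields are hypothetical (nothing like this exists in crates.io today), and it assumes `serde`/`serde_json` for serialization:

```rust
use serde::Serialize;

// Hypothetical events for the publish flow; names and fields are illustrative only.
#[derive(Serialize)]
#[serde(tag = "type", rename_all = "snake_case")]
enum PublishEvent {
    // All synchronous checks passed and the crate was added to the background queue.
    Accepted { krate: String, version: String },
    // The background job finished; the index now contains the new version.
    IndexUpdated { krate: String, version: String },
    // The background job is delayed, but no further action is needed from the user.
    Delayed { detail: String },
}

fn main() {
    // The frames a client would receive, in order, during a normal publish.
    let events = vec![
        PublishEvent::Accepted { krate: "demo".into(), version: "0.1.0".into() },
        PublishEvent::IndexUpdated { krate: "demo".into(), version: "0.1.0".into() },
    ];
    for event in &events {
        println!("{}", serde_json::to_string(event).unwrap());
    }
}
```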

Alternatives

The main alternative I'm aware of is to use a scheme where:

Drawbacks

The main drawback is that this adds complexity, especially on the server side. The current design assumes a complete response value is returned up through the middleware layers, and there is no mechanism for maintaining a long-lived connection. It may be possible to bolt something on in a reasonable way, or it may be blocked on the switch to hyper in production.

smarnach commented 4 years ago

Just for completeness, another alternative would be to move away from Heroku. I'm not saying we should, but I do notice that we are fighting Heroku's limitations rather often (e.g. having to move the whole app behind CloudFront, noisy neighbours, having to worry about the memory increase caused by Fastboot, runtime configuration, the 30-second limit, among other problems I don't currently remember).

Currently we don't have the capacity to move to anything else, nor would we want to, but I expect that the growing traffic will force us to move to a cheaper option within the next few years.

Nemo157 commented 4 years ago

Another, another alternative would be to send 100 Continue responses periodically while the upload is in progress (though this is also something I haven't seen supported by any Rust web server, since they all want a single response per request).
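
For reference, here is a minimal sketch of what periodic interim responses could look like on the wire. It writes directly to a `TcpStream` precisely because, as noted above, existing Rust frameworks don't expose this; the function name, timings, and response body are made up:

```rust
use std::io::Write;
use std::net::TcpStream;
use std::time::Duration;

// Illustrative only: emit interim 1xx responses while the upload is being
// processed, then the real status once the outcome is known.
fn respond_with_interim(mut stream: TcpStream) -> std::io::Result<()> {
    for _ in 0..3 {
        // An informational response keeps bytes flowing before the final status.
        stream.write_all(b"HTTP/1.1 100 Continue\r\n\r\n")?;
        stream.flush()?;
        std::thread::sleep(Duration::from_secs(10));
    }
    // Final response once the crate has been accepted (body is a placeholder).
    let body = br#"{"ok":true}"#;
    write!(
        stream,
        "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: {}\r\n\r\n",
        body.len()
    )?;
    stream.write_all(body)?;
    stream.flush()
}
```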