Closed jtgeibel closed 5 years ago
Could we go further and switch the whole stack to Actix? Conduit is not actively developed which means that few people are familiar with it. Actix, on the other hand, is the most popular currently.
So I have looked into Actix, a bit. For the moment, I want to migrate away from a C based HTTP server to a Rust based implementation, without committing to a framework.
A few of the things I thought about while implementing the original PR:
hyper
in out tests. Switching to hyper
doesn't introduce any new dependencies and after updating the code to 0.12 I'm somewhat familiar with the API.conduit
long term. It doesn't seem to be maintained and it pulls in a lot of very old versions of crates. Eventually, I'd like to vendor (or replace) our conduit
dependencies and move more middleware higher up the stack into an async layer. (Short term, I'd like to see some conduit-*
releases that purge old deps for now. Last I looked, there are no other public dependencies on the conduit-*
crates, so I think we should vendor or take ownership of this code, temporarily, as we transition to something else.)hyper
has more frameworks that build on top of it.I published conduit-hyper (used in #1378) to temporarily fill the gap and transition away from our, currently unmaintained, C-based web server.
Status update: Last week @carols10cents and I looked at the logs to see if there are any cargo instances still hitting routes with //
. We didn't see any, but then realized that the currently deployed web server is normalizing these for us before they hit the application (and our logging). So this doesn't really show us anything and has me leaning towards option 2, where I add special handing for the //
prefix but don't implement full directory traversal logic.
I would really prefer that we do full URI normalization somewhere rather than just special casing //
. Handling things like ..
are laid out by the spec, and aren't something that should ever make it to our router, regardless of what clients are doing
The URL normalization was incorporated into the conduit-hyper
crate and hyper
can now be enabled in production by setting USE_HYPER=1
in the environment. Closing this issue.
Background
In #1378 we attempted to switch our web server from
civet
tohyper
. Once this was deployed, it was discovered thatcargo
API requests were in the form ofhttps://crates.io//api/v1/...
. Civet has some interesting logic that consolidates adjacent/
characters and removes all..
sequences that follow a/
. This code doesn't really do the equivalent of directory traversal, for example I believe/../
is converted to/
. The code definitely doesn't pop the last segment off of the path (but does avoid any allocations).Hyper does not do any such manipulation of the URL path, so all cargo requests were resulting in a 404 response because the router logic doesn't know how to handle the extra
/
at the beginning of the path. We rolled back the deploy and then reverted the code in #1476.There is 1 other interesting portion of code that I want to point out. In
conduit-static
there is logic that results in a 404 if the path contains..
. This should protect against any directory traversal attacks, but isn't exactly very precise.Updating the index configuration
Recently I noticed that the
config.json
file in the registry index included a trailing/
in the API endpoint definition. Since all versions ofcargo
append a/api/v1
to this value, the trailing slash was removed in rust-lang/crates.io-index#12. I'm not sure exactly under what conditions cargo updates the registry, but as clients sync with the registry they should begin sending API requests without the extra slash. It is possible that there are some commands (like search and yank) that don't update the index first, so some old clients may still issue requests with the extra slash. I think we could search the logs to see if there are any such requests remaining and determine if we need to add logic to the server to handle this.Where should this be handled?
When used as a reverse proxy,
nginx
doesn't modify the URL path to merge slashes or handle..
directory traversal. If it did, then the extra slashes would not have hit our server code in production. When serving files directly,nginx
does handle extra slashes and potential directory traversal attacks. I have not yet tested to see how application servers (such aspassenger
withnginx
) handle this.Since
hyper
may also be used by client code to create a proxy I'm not sure that it would be the proper place to address this. I believe this is probably something that should be handled at the framework level, but for now I would like to migrate tohyper
without choosing a particular framework. So if we need to address this on the backend, it looks like we'll need to add some middleware. Here are my current thoughts on that:conduit-static
and its eventual replacement if we migrate to a framework. I don't think we need to handle this for normal API request (but could be convinced otherwise).//
then I propose special casing just this scenario in a new middleware layer (unless we decide we want full directory traversal protection for all routes).Summary
In summary, I'd like to resurrect the PR to switch to hyper. I think we have 3 choices for a new middleware layer:
conduit-static
already (naively) protects against directory traversal on the filesystem and if we switch to a new framework in the future we will have to ensure it provides traversal protection before hitting the file system.//
prefix for any remaining cargo clients that haven't synced with the registry yet. I think we will need at least this option if there are any such clients remaining in the logs. We will still need to ensure that any future framework handles traversal when serving static files.Right now I'm leaning toward option 1, unless we have some
cargo
clients that haven't synced with the registry, then option 2. We can go with option 3 if we want to be more cautious and future-proof. Of course, I'm also open to any other suggestions here as well.