Closed tforgione closed 6 years ago
I've just pushed a branch off master to this repo, static-file-support. Did you want to create a PR into that branch, @colinbankier, so we can review/help?
The thinking here is that we can then land further PRs for static file related work in that branch (e.g. sorting out async as you outlined) and eventually push the entire thing into master with a final look over.
It is exciting to see this progress!
Regarding security, I think the current code protects against normal traversal attacks, but could be extended to cover symlink traversal. I am not too familiar with the domain, so take the following with a pinch of salt.
If an attacker is in a position to put a symlink in the base path (a partially compromised server, a server that unpacks uploaded tar/zip files, etc), then they can currently escalate to reading arbitrary paths.
I'd suggest configurable symlink resolution, with a default of AllowInternal, which behaves the same as the Integer 32 playground.
enum SymlinkFollowing {
    Disallow,      // no links, error if path changes after canonicalize()
    AllowInternal, // final file must be in base path, use canonicalize() as per shepmaster's example
    AllowExternal, // links in base path followed to arbitrary locations, current normalize_path(...)
}
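To make the three policies concrete, here is a minimal sketch of how they might be enforced using only the standard library. The `resolve` function, its error messages, and the assumption that `requested` has already been normalized to a relative path with no `.` or `..` components are all hypothetical; none of this is Gotham's actual code.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// The policy proposed in this comment.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum SymlinkFollowing {
    Disallow,
    AllowInternal,
    AllowExternal,
}

fn denied(msg: &'static str) -> io::Error {
    io::Error::new(io::ErrorKind::PermissionDenied, msg)
}

/// Hypothetical helper: map `requested` (assumed already normalized and
/// relative) to a filesystem path under `base`, applying `policy`.
pub fn resolve(base: &Path, requested: &Path, policy: SymlinkFollowing) -> io::Result<PathBuf> {
    let joined = base.join(requested);
    if policy == SymlinkFollowing::AllowExternal {
        // Links are followed wherever they lead; no filesystem check here.
        return Ok(joined);
    }
    // canonicalize() resolves every symlink and fails if the path is missing.
    let base = fs::canonicalize(base)?;
    let resolved = fs::canonicalize(&joined)?;
    match policy {
        // If canonicalization changed anything below the canonical base,
        // some component of the request was a symlink.
        SymlinkFollowing::Disallow if resolved != base.join(requested) => {
            Err(denied("symlink in requested path"))
        }
        // Both remaining policies require the final file to sit under base.
        _ if !resolved.starts_with(&base) => Err(denied("path escapes base directory")),
        _ => Ok(resolved),
    }
}
```

With a base directory `public/`, a call such as `resolve(Path::new("public"), Path::new("index.html"), SymlinkFollowing::AllowInternal)` would return the canonical path of the file, while a link inside `public/` that points outside it would be rejected.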
Even if you are resolving symlinks and filtering locations, an attacker can still swap in a new symlink after the call to canonicalize() but before the file gets opened.
I doubt we can beat this without resorting to OS specific code, like O_NOFOLLOW.
Checking is_symlink() after the canonicalized path is opened would force the attacker to win two races back-to-back to avoid detection, but doesn't completely prevent the attack.
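The open-then-recheck idea could be sketched roughly as follows, using only the standard library. The `open_checked` name and error messages are hypothetical, `requested` is assumed to be an already-normalized relative path, and, as noted above, this narrows the race window rather than closing it.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Hypothetical helper: open `requested` under `base` with an
/// AllowInternal-style check, then re-check the path after the open.
pub fn open_checked(base: &Path, requested: &Path) -> io::Result<File> {
    let base = fs::canonicalize(base)?;
    let resolved = fs::canonicalize(base.join(requested))?;
    if !resolved.starts_with(&base) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path escapes base directory",
        ));
    }

    // Race window: the path could be replaced between canonicalize() and open().
    let file = File::open(&resolved)?;

    // A canonical path never ends in a symlink, so finding one here means the
    // final component was swapped after resolution. symlink_metadata() does
    // not follow links. This only inspects the last component; a swapped
    // intermediate directory would not be detected by this check.
    if fs::symlink_metadata(&resolved)?.file_type().is_symlink() {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path changed while opening",
        ));
    }
    Ok(file)
}
```

An attacker would have to swap the link in before the open and swap it back out before the recheck to go unnoticed, which is the "two races back-to-back" described above.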
No async yet unfortunately - @illicitonion did you happen to look further into this?
Unfortunately not, I've been travelling for a bit (and will be for another week or so); will update here if I find some time, but happy for other folks to investigate too :)
I'll pick this up again and continue with async, with help from tokio's new async file support: https://tokio.rs/blog/2018-05-tokio-fs/
Just noting I have some progress here, but I'm waiting for https://github.com/gotham-rs/gotham/pull/246 to be merged before cleaning it up and opening a PR, as there are some conflicts with TestServer, hyper 0.11, tokio-core and tokio-fs.
As a possible incentive for an upgrade to hyper 0.12, the body_image master branch (candidate for body_image 0.4.0 release) now has zero-copy/async support for memory mapped http bodies using (glibc) madvise(SEQUENTIAL) (for aggressive OS read-ahead), and using the tokio-threadpool blocking annotation. Some tokio level benchmark results are in the CHANGELOG.md.
Would you all consider an integration into Gotham?
@dekellum thanks for pointing out body_image to us. I'll need to look at it more closely to understand what it does in more detail, its intersection with more basic file serving needs, etc. In general I think it's a good strategy for Gotham to build on best-of-breed crates in the ecosystem instead of reinventing things where it makes sense, balancing the cost of managing 3rd party dependencies etc. of course. Do you see body_image as providing a large chunk of file serving needs (i.e. replacing use of tokio-fs, compression handling etc.)? Or providing some opt-in additional features alongside? https://github.com/scottlamb/http-serve is another that has been suggested (different feature set to body_image, maybe not overlapping); the current WIP doesn't utilise this. Interested in getting @jxs's and @nyarly's thoughts.
Quick note for @dekellum that Hyper 0.12 is in-progress, it's not that we're not moving to it! :)
Sorry for delay in response to my initial post, which in hindsight was poorly timed.
The body-image async:: module adapters are completely tokio (reform, 0.1.x) compatible. These use the same strategy that tokio-fs uses: tokio_threadpool::blocking. By using this most general mechanism, body-image can extend asynchronous compliance to cover the BodyImage MemMap state as well, something that tokio-fs doesn't yet and I suspect won't cover.
Also, as compared with tokio-fs, body-image offers integration with hyper 0.12 and the http crate Request/Response builders for its custom body types. In particular, it can offer hyper::body::Payload adapters for output in a (client) Request<B> or (server) Response<B> with zero-copy MemMap support, the desire for which was referenced in this comment by @scottlamb.
I think body-image is orthogonal, but hopefully is, or could be made, compatible with http-serve as a lower level body representation. Caching policy headers, range requests, etc. are not on my current agenda with body-image, and it would be nice for http-serve to thoroughly cover these features in one place.
Here is the AsyncBodySink adapter of the current body-image 0.3.0 release. There is a symmetric AsyncBodyImage as well as (zero-copy enabled) UniBodyImage (heh, naming is hard) on the master branch, nearing a 0.4.0 release.
In summary, it's early days, but I think body-image is well positioned to offer a best-of-breed integration of static files (memory-mapped or otherwise) into tokio and hyper. Thus I posted on this topic. More generally, gotham might also want a better story on how to handle giant HTTP bodies (like POST or PUT request bodies), which the *BodySink and *BodyImage types can help with. Thanks for your consideration.
Thanks @dekellum - it's a great suggestion. I'm all for leveraging great work like this already existing to give Gotham a good file serving story :) I'll do some experimenting with how gotham, http-serve and body-image can play nicely together!
Please do and feel free to ask or post issues to dekellum/body-image for any problems or shortcomings you encounter. Thanks!
On your blog, you say that some important features are on the roadmap, including async static file serving. Is there any work on that at the moment?