gotham-rs / gotham

A flexible web framework that promotes stability, safety, security and speed.
https://gotham.rs

Async static file serving #55

Closed tforgione closed 6 years ago

tforgione commented 6 years ago

On your blog, you say that some important features are on the roadmap, including async static file serving. Is there any work on that at the moment?

bradleybeddoes commented 6 years ago

I've just pushed a branch off master to this repo, static-file-support. Did you want to create a PR into that branch, @colinbankier, so we can review/help?

The thinking here is that we can then land further PRs for static-file-related work in that branch (e.g. sorting out async as you outlined) and eventually push the entire thing into master after a final look over.

millardjn commented 6 years ago

It is exciting to see this progress!

Regarding security, I think the current code protects against normal traversal attacks, but could be extended to cover symlink traversal. I am not too familiar with the domain, so take the following with a pinch of salt.

Following symlinks outside of the base path

If an attacker is in a position to put a symlink in the base path (a partially compromised server, a server that unpacks uploaded tar/zip files, etc), then they can currently escalate to reading arbitrary paths.

I'd suggest configurable symlink resolution, with a default of AllowInternal which behaves the same as the Integer 32 playground.

enum SymlinkFollowing {
    Disallow, // no links, error if path changes after canonicalize()
    AllowInternal, // final file must be in base path, use canonicalize() as per shepmaster's example
    AllowExternal, // links in base path followed to arbitrary locations, current normalize_path(...)
}
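A minimal sketch of how the three policies above might be enforced using only the standard library. The `resolve` function, its signature, and the error messages are hypothetical and are not taken from Gotham's code; it assumes the requested path is relative and rejects anything other than plain path components before touching the filesystem.

```rust
use std::fs;
use std::io;
use std::path::{Component, Path, PathBuf};

/// Hypothetical policy enum mirroring the proposal above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SymlinkFollowing {
    Disallow,
    AllowInternal,
    AllowExternal,
}

/// Resolve `requested` under `base` according to `policy`.
/// Assumption: `requested` is a relative path taken from the request URL.
pub fn resolve(base: &Path, requested: &Path, policy: SymlinkFollowing) -> io::Result<PathBuf> {
    let denied = |msg: &str| io::Error::new(io::ErrorKind::PermissionDenied, msg.to_string());

    // Reject `..`, `.`, root, and prefix components up front (ordinary traversal).
    if requested.components().any(|c| !matches!(c, Component::Normal(_))) {
        return Err(denied("path contains non-normal components"));
    }

    // Canonicalize the base first so that later comparisons are like-for-like.
    let base = fs::canonicalize(base)?;
    let joined = base.join(requested);
    // canonicalize() follows every symlink along the path and fails if the file is missing.
    let resolved = fs::canonicalize(&joined)?;

    match policy {
        // Any difference means at least one component was a symlink.
        SymlinkFollowing::Disallow if resolved != joined => {
            Err(denied("path contains a symlink"))
        }
        // Links are fine as long as the final target stays under the base.
        SymlinkFollowing::AllowInternal if !resolved.starts_with(&base) => {
            Err(denied("path escapes the base directory"))
        }
        // AllowExternal, or a check that passed.
        _ => Ok(resolved),
    }
}
```

Under these assumptions, a symlink inside the base that points at a file outside it would be rejected by `Disallow` and `AllowInternal` but served by `AllowExternal`. The check is still subject to the race described in the next section.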

Symlink races

Even if you are resolving symlinks and filtering locations, an attacker can still swap in a new symlink after the call to canonicalize() but before the file gets opened. I doubt we can beat this without resorting to OS-specific code, like O_NOFOLLOW. Checking is_symlink() after the canonicalized path is opened would force the attacker to win two races back-to-back to avoid detection, but doesn't completely prevent the attack.
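The check-after-open idea could look roughly like the sketch below, which uses only the standard library. The `open_checked` name is hypothetical, and the function assumes its argument was previously returned by canonicalize(). As noted above, this narrows the race window rather than closing it, and it only inspects the final path component.

```rust
use std::fs::{self, File};
use std::io;
use std::path::Path;

/// Open a path that was already canonicalized, then verify it has not been
/// replaced by a symlink in the meantime. Hypothetical helper, not Gotham's API.
pub fn open_checked(canonical: &Path) -> io::Result<File> {
    // First race window: the path may have changed since canonicalize().
    let file = File::open(canonical)?;

    // symlink_metadata() does not follow a symlink in the final component,
    // so it reports what the path itself is right now.
    if fs::symlink_metadata(canonical)?.file_type().is_symlink() {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "path became a symlink after canonicalize()",
        ));
    }

    // An attacker now has to swap the link in before open() and swap it
    // back out before the check above to go unnoticed.
    Ok(file)
}
```

A stricter variant would open with O_NOFOLLOW through platform-specific flags, at the cost of portability.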

illicitonion commented 6 years ago

> No async yet unfortunately - @illicitonion did you happen to look further into this?

Unfortunately not, I've been travelling for a bit (and will be for another week or so); will update here if I find some time, but happy for other folks to investigate too :)

colinbankier commented 6 years ago

I'll pick this up again and continue with async, with help from tokio's new async file support: https://tokio.rs/blog/2018-05-tokio-fs/

colinbankier commented 6 years ago

Just noting that I have made some progress here, but I'm waiting for https://github.com/gotham-rs/gotham/pull/246 to be merged before cleaning it up and opening a PR, as there are some conflicts with TestServer, hyper 0.11, tokio-core and tokio-fs.

dekellum commented 6 years ago

As a possible incentive for an upgrade to hyper 0.12, the body_image master branch (candidate for the body_image 0.4.0 release) now has zero-copy/async support for memory-mapped HTTP bodies, using (glibc) madvise(SEQUENTIAL) for aggressive OS read-ahead and the tokio-threadpool blocking annotation. Some tokio-level benchmark results are in the CHANGELOG.md.

Would you all consider an integration into Gotham?

colinbankier commented 6 years ago

@dekellum thanks for pointing out body_image to us. I'll need to look at it more closely to understand what it does in more detail, how it intersects with more basic file serving needs, etc. In general I think it's a good strategy for Gotham to build on best-of-breed crates in the ecosystem instead of reinventing things, where it makes sense, while balancing the cost of managing third-party dependencies, of course. Do you see body_image as providing a large chunk of file serving needs (i.e. replacing use of tokio-fs, compression handling, etc.)? Or as providing some opt-in additional features alongside? https://github.com/scottlamb/http-serve is another crate that has been suggested (a different feature set to body_image, maybe not overlapping); the current WIP doesn't utilise it. Interested in getting @jxs's and @nyarly's thoughts.

whitfin commented 6 years ago

Quick note for @dekellum that Hyper 0.12 is in-progress, it's not that we're not moving to it! :)

dekellum commented 6 years ago

Sorry for the delay in following up on my initial post, which in hindsight was poorly timed.

The body-image async:: module adapters are completely tokio (reform, 0.1.x) compatible. These use the same strategy that tokio-fs uses: tokio_threadpool::blocking. By using this most general mechanism, body-image can extend asynchronous compliance to cover the BodyImage MemMap state as well, something that tokio-fs doesn't yet cover and, I suspect, won't.

Also as compared with tokio-fs, body-image offers the integration with hyper 0.12 and http crate Request/Response builders, for its custom body types. In particular, it can offer hyper::body::Payload adapters for output in a (client) Request<B> or (server) Response<B> with zero-copy MemMap support, the desire for which was referenced in this comment by @scottlamb.

I think body-image is orthogonal to http-serve, but hopefully is, or could be made, compatible with it as a lower-level body representation. Caching policy headers, range requests, etc. are not on my current agenda with body-image, and it would be nice for http-serve to thoroughly cover these features in one place.

Here is the AsyncBodySink adapter of the current body-image 0.3.0 release. There is a symmetric AsyncBodyImage as well as (zero-copy enabled) UniBodyImage (heh, naming is hard) on the master branch, nearing a 0.4.0 release.

In summary, it's early days, but I think body-image is well positioned to offer a best-of-breed integration of static files (memory-mapped or otherwise) into tokio and hyper, which is why I posted on this topic. More generally, gotham might also want a better story on how to handle giant HTTP bodies (like POST or PUT request bodies), which the *BodySink and *BodyImage types can help with. Thanks for your consideration.

colinbankier commented 6 years ago

Thanks @dekellum - it's a great suggestion. I'm all for leveraging great work like this already existing to give Gotham a good file serving story :) I'll do some experimenting with how gotham, http-serve and body-image can play nicely together!

dekellum commented 6 years ago

Please do and feel free to ask or post issues to dekellum/body-image for any problems or shortcomings you encounter. Thanks!

colinbankier commented 6 years ago

#271 is merged, so I think we can close this. Future enhancements can be handled separately.