Open untitaker opened 8 years ago
Sounds reasonable to me.
I'd love to also have a layer 0. which handles GET and HEAD serving (including conditional GETs and range requests) on top of an abstract "entity" (HTTP's term) without involving the filesystem.
Go provides something like this; see http.ServeContent.
I needed 0. for my security camera NVR project, which dynamically creates .mp4
files based on its SQLite3 database and segments of on-disk files. I have some apparently-working, automatically tested code I'd be happy to share if desired. There are probably some cases in the RFC that I didn't implement correctly after my quick skim, and I haven't even tried to do some others yet. But currently my interface is roughly this:
/// A static HTTP entity for GET and HEAD serving.
pub trait Entity<Error> where Error: From<io::Error> {
/// Returns the length of the entity in bytes.
fn len(&self) -> u64;
/// Writes bytes within this entity indicated by `range` to `out`.
fn write_to(&self, range: Range<u64>, out: &mut io::Write) -> Result<(), Error>;
fn content_type(&self) -> mime::Mime;
fn etag(&self) -> Option<&header::EntityTag>;
fn last_modified(&self) -> &header::HttpDate;
}
/// Serves GET and HEAD requests for a given byte-ranged entity.
/// Handles conditional & subrange requests.
///
/// TODO: does it make sense for the caller to add Expires, Cache-Control, and Vary headers first?
/// or do they need to be added conditionally based on the response type we serve?
pub fn serve<Error>(entity: &Entity<Error>, req: &Request, mut res: Response<Fresh>)
-> Result<(), Error> where Error: From<io::Error> { ... }
My code directly uses hyper today, not iron. But if iron provided 0., that'd be a compelling reason for projects such as mine to use it. (There are probably other things I'll need that iron already provides, but my code's just a prototype now, I'm just learning the language, and I started learning by doing things with the lower layer.)
Also, my code is using the deprecated hyper 0.9.x synchronous interface rather than the not-ready-yet hyper 0.10.x async master or tokio interfaces. I'm looking forward to changing it once something else is stabilized; giving it chunks of preassembled Vec<u8>
would be better for my code.
@scottlamb You can pass arbitrary T: Read
to Response::with
, since they all implement Modifier
. You need to set Content-Type
manually, but the response body already gets omitted automatically in the case of a HEAD request.
Oh. That's totally wrong for my use case, actually, it introduces another copy here:
impl <R: io::Read + Send> WriteBody for BodyReader<R> {
fn write_body(&mut self, res: &mut ResponseBody) -> io::Result<()> {
io::copy(&mut self.0, res).map(|_| ())
}
}
I'm hoping a future version of hyper will allow me to just hand over chunks of &[u8]
slices with ownership promised in some way; I was just making a proposal for that on hyper issue 934.
And I'm hoping iron will provide a layer above hyper's (current or future) interface that handles conditional GET, range serving, and whatnot without forcing me to write stuff out to the filesystem first (which I absolutely cannot afford to do; I'm generating huge files from small pieces on demand).
This is not copying to a new section in RAM or any storage medium on the server, it is writing to the output stream.
It takes my &[u8]
, copies it into an internal buffer within io::copy
, and then copies it into hyper's output buffer. There are two copies, or one more than is necessary with the interface that hyper currently provides. Here's the definition of std::io::copy
:
#[stable(feature = "rust1", since = "1.0.0")]
pub fn copy<R: ?Sized, W: ?Sized>(reader: &mut R, writer: &mut W) -> io::Result<u64>
where R: Read, W: Write
{
let mut buf = [0; super::DEFAULT_BUF_SIZE];
let mut written = 0;
loop {
let len = match reader.read(&mut buf) {
Ok(0) => return Ok(written),
Ok(len) => len,
Err(ref e) if e.kind() == ErrorKind::Interrupted => continue,
Err(e) => return Err(e),
};
writer.write_all(&buf[..len])?;
written += len as u64;
}
}
If performance is that much more important than convenience, why not override response.body
with your own struct that impls WriteBody
?
Why should it be necessary to choose between the two? The serve
interface I showed above provides both. I was thinking this was generic code within iron's scope of responsibility, but if you disagree I'll just do it in a separate crate.
I'm just saying that Iron already provides the necessary low-level access to Hypers APIs for your project. One variant (using a T: Read as Modifier) is a shortcut for the other (manually setting the response body), and IMO covers a majority of the usecases with very little boilerplate and very little complexity in Iron as well.
I'm not opposed to your abstraction at all, at the same time I would strongly prefer shortcuts like mentioned in the OP for the majority of usecases. Either way I think those things shouldn't be part of Iron core for the simple reason that they don't need to be. Particularly etag and last-modified semantics pull in dependencies that minimalist API servers wouldn't want to pay the cost for
Ahh. Yes, I think you're right that I could implement serve
on top of iron instead of directly on top of hyper. I did the latter in part because I wanted to start at the bottom as I learned the pieces, and in part because iron didn't add anything I needed at the moment. If iron had provided this for me, I would have happily taken the extra dependency right away...
When you say "shouldn't be a part of iron core", do you mean it shouldn't be a part of crate "iron"? I don't feel strongly that it should be. I'd be happy with in "iron", in "staticfile", or in a new crate between. I do think that staticfile should be built on top of it—it doesn't make sense to provide static file serving without conditional GET and range handling, and it doesn't make sense to have two implementations of that. But if you disagree, I'll just go do my own thing.
I don't agree with what you're saying about etag and last-modified pulling in dependencies. The Entity
trait definition and serve
function have no dependencies beyond hyper now. Some Entity
implementations may require cryptographic functions for etag and such but others could just return None
. There's useful, non-trivial code that is dependency-free. And the staticfile implementation could provide both etag and last-modified based on the result of the stat
syscall; this is what Apache, nginx, etc. do.
If you want to build a high-level abstraction around Last-Modified you're probably going to depend on the time crate.
That's true. Currently my serve
implementation doesn't depend on the time crate (only hyper), but I suppose I should be setting Date
myself and ensuring Last-Modified
is consistent with it rather than relying on hyper setting Date
later on; as is I'm not compliant with RFC 2616 section 14.29:
An origin server MUST NOT send a Last-Modified date which is later
than the server's time of message origination. In such cases, where
the resource's last modification would indicate some time in the
future, the server MUST replace that date with the message
origination date.
But why is this a problem? hyper depends on time. So what problem does it create for iron to also depend on time?
Ah, I was assuming there's an extra cost because staticfile has everything related to that functionality within an optional feature. It appears https://github.com/alexcrichton/filetime would come as a new dependency. I don't think that's necessary for the abstraction, but probably for the impls that should IMO be included.
Anyway at this point I'd like to see a PoC pull request or new crate that does what you want to do. We can then move that into staticfile and expose the new API as well.
Sounds good. I will start polishing my code and adapting it to iron. (At my usual work-a-little-bit-every-night-after-my-kid-goes-to-bed pace...)
Quick note: I still haven't adapted my code to the Iron interface (I'm slow and trying to do a lot of things with my project at once), but I have at least extracted it into a separate crate and published it to github as scottlamb/http-entity, so you can look at the current code if you're curious.
Currently staticfile consists of two main features:
Currently staticfile exposes only 2., I'd like it to expose 1. as completely independent feature, for cases where I want to write the logic that maps from request URL to filepath myself.
Flask has a
send_file
function, which takes an absolute filepath (and implicitly a request) and gives back a response. Here's my API proposal for Iron:Perhaps returning a` is more consistent with the rest of core-Iron's API, but I seriously doubt the usefulness of that.
IronResult<T>
where T: Modifier