**Open** · jethrogb opened this issue 8 years ago
I agree that the lifetime situation feels difficult. If I could fix anything, I'd fix the feeling of trying to use httparse.
As for your specific example, I don't believe allocating a `Vec` of headers is necessary, unless you need to decide the maximum allowed at runtime.
You need a buffer that's as big as the input slice `headers`. You can't do this statically until we get integer generics.
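For what it's worth, integer (const) generics have since landed in stable Rust, which makes "an array as big as a compile-time parameter" expressible. A minimal sketch, using a hypothetical `Header` stand-in rather than httparse's own type so it compiles on its own:

```rust
// Toy sketch: with const generics, the headers array can be sized by a
// compile-time parameter chosen by the caller, with no heap allocation.
// `Header` here is a hypothetical stand-in, NOT httparse's type.
#[derive(Clone, Copy, Debug)]
struct Header<'a> {
    name: &'a str,
    value: &'a [u8],
}

const EMPTY_HEADER: Header<'static> = Header { name: "", value: b"" };

// The caller picks N at compile time, e.g. empty_headers::<16>().
fn empty_headers<const N: usize>() -> [Header<'static>; N] {
    [EMPTY_HEADER; N]
}
```

This only helps when the maximum is known at compile time; a caller that must choose the maximum at runtime still needs a slice of runtime length.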
Sure. Who provides the `headers` slice? I'm actually quite curious about this design, because I haven't encountered it in either sync or async hyper, and I haven't seen others use it that way in other mini HTTP implementations.
For reference, this is how hyper 0.9 would keep reading into a buffer synchronously and try to parse again: https://github.com/hyperium/hyper/blob/0.9.x/src/http/h1.rs#L869-L899
This looks like it basically runs the same loop as mine above, except with a fixed maximum number of headers, which means you're avoiding exactly the issues I'm running into. I'd like the caller to specify the maximum number of headers, and to avoid large additional allocations (on either the heap or the stack) during parsing. I'd also like to keep the httparse `Request` representation; in hyper you're just interpreting the `Request` directly before returning.
I've been thinking about this: what if `Headers` were not an array of slices, but an array of offsets? It'd remove the lifetimes, but require you to slice the buffer yourself with the offsets to get the name and value.
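A rough sketch of that idea (`HeaderIndices` is a hypothetical name, not an httparse API): the struct stores byte ranges into the buffer instead of borrowed slices, so it carries no lifetime parameter, and the caller re-slices the buffer to recover the name and value.

```rust
// Hypothetical offset-based header: byte ranges into the buffer instead of
// borrowed slices, so the type has no lifetime and can be stored freely,
// independent of any particular buffer.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct HeaderIndices {
    name: (usize, usize),  // start..end of the header name in the buffer
    value: (usize, usize), // start..end of the header value in the buffer
}

impl HeaderIndices {
    // The caller does the slicing, tying the result's lifetime to whichever
    // buffer it passes in, not to the HeaderIndices itself.
    fn name<'b>(&self, buf: &'b [u8]) -> &'b [u8] {
        &buf[self.name.0..self.name.1]
    }

    fn value<'b>(&self, buf: &'b [u8]) -> &'b [u8] {
        &buf[self.value.0..self.value.1]
    }
}
```

The trade-off is exactly the one described: call sites write `h.name(buf)` instead of `h.name`, but the parsed representation becomes plain owned data with no borrow of the input.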
Another option is to have the `parse` function update the lifetime of the `Request`. This might require an unsafe transmute, which would in turn require that `parse` ensures all of the fields of `Request`, including the headers array, are reset, to prevent any dangling references.
What about merging `new` and `parse` into one associated function?

`fn parse(headers: &'h mut [Header<'b>], buf: &'b [u8]) -> Result<(usize, Request<'h, 'b>)>`
You'd need to change the `Status` enum as well...
That could be done, but it still ties the lifetime of the `Header` slice to `buf`, and it makes the `Status` trickier, as you said.
I think a `reset()` method could be added to `Request` that empties all the fields, resets the `Header` slice, and updates its lifetime to allow a new buffer to be used.
I wrote a little Python wrapper for this library. From the perspective of writing a Python wrapper with PyO3, the lifetime on `Headers` is problematic.

If you're using the flow of "try to parse; if incomplete, tell the Python caller to come back with more data", you have to re-allocate `headers` every time, because you can't persist anything containing a `Header` (including the buffer) on the Python side due to the lifetime (PyO3 can't hand off lifetimes to Python).

The lifetime requirement on the data itself is not a problem; PyO3 lets you access the underlying memory without copying.
I would like to build a function that reads as much from `stream` into `buf` as necessary to get a `Complete` return from `Request::parse`. This turns out to be non-trivial and to require lots of extra allocations and work. Here's what I came up with:
The main issues are having to allocate a new array of temporary headers for every iteration, and having to parse the successful result twice.
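For readers, the shape of the loop being described is roughly the following sketch (not the original code from the issue). `try_parse` here is a trivial stand-in that just looks for the blank line; in the real version it would build a fresh temporary headers array plus an `httparse::Request` and call `parse` on it each iteration, which is where the per-iteration allocation and the final re-parse come from.

```rust
use std::io::Read;

// Stand-in for httparse's complete/partial distinction, so this sketch
// compiles without the crate itself.
enum ParseStatus {
    Complete(usize), // total bytes consumed by the request head
    Partial,
}

// Stub parser: "complete" once the buffer contains the blank line ending the
// head. The real loop would construct headers + Request and call parse here.
fn try_parse(buf: &[u8]) -> ParseStatus {
    match buf.windows(4).position(|w| w == b"\r\n\r\n") {
        Some(pos) => ParseStatus::Complete(pos + 4),
        None => ParseStatus::Partial,
    }
}

// Keep reading from `stream` into `buf` until the parser reports Complete,
// then return the length of the request head.
fn read_until_complete<R: Read>(stream: &mut R, buf: &mut Vec<u8>) -> std::io::Result<usize> {
    loop {
        if let ParseStatus::Complete(len) = try_parse(buf) {
            return Ok(len);
        }
        let mut chunk = [0u8; 4096];
        let n = stream.read(&mut chunk)?;
        if n == 0 {
            return Err(std::io::Error::new(
                std::io::ErrorKind::UnexpectedEof,
                "EOF before complete request",
            ));
        }
        buf.extend_from_slice(&chunk[..n]);
    }
}
```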
I think this is partially Rust's fault, but also partially httparse's, for having a not-so-great API. For example, the lifetime of everything is fixed upon `Request` creation, so a parse failure doesn't release the borrow of the input buffer.