aars / rack-bodyparser

Rack Middleware for parsing request body.
MIT License
6 stars 0 forks source link

Access to headers whilst parsing body #2

Closed alexjfisher closed 6 years ago

alexjfisher commented 7 years ago

Hi!

We're currently trying to use rack-parser to parse various incoming json webhooks. https://github.com/voxpupuli/puppet-webhook/pull/7 One of the issues we have is that it'd be much easier if we could access the other headers (not just content-type) from within our parser class. https://github.com/achiu/rack-parser appears to be unmaintained and we noticed you're planing a rugygems release. Would you be interested in a pull request? Or maybe you have some suggestions on how we should go about it? For instance, being able to specify more than one parser per content-type based on a second header would be awesome, (but that would be more complicated).

Many thanks, Alex

aars commented 7 years ago

@alexjfisher I just released this to rubygems. https://rubygems.org/gems/rack-bodyparser

It's unclear to me if you want to: A) Set up parsers matching other headers than content-type alone (or, Rack's media_type to be more specific). For example, different parsers for different user-agents, or... B) Have access to other headers (or simply Rack's Request object) within the parser class. (Currently the parsers are invoked with only the request body)

Option B is easy. Two ways we could do this: replace the body arg of parser.call with the entire env, in which case the parser should extract the body first. Or, we could pass the entire env, or Request object, or a subset of it, as a second arg. (here)

Option A shouldn't be that hard either, but I want to keep the setup to be as clean/simple as possible. Maybe support hashes as keys for :parsers as well? Something like:

use Rack::BodyParser, :parsers => { 
  # string key, matches media_type
  'application/json' => My::Default::JSON::Parser,
  # hash key, matches whatever
  {:user_agent => 'custom-client'} => My::Custom::Client::Parser,
  # hash key, another example
  {:host => 'alternative-hostname.tld', 
   :media_type => 'application/json'} => My::Alternative::Host::Parser 
}

Any other ideas on how to handle this elegantly? One issue here is that you can easily set up multiple parsers for one request. Should that be possible? Is that useful? Or should we just pick the first match and use only that parser? If we are using/allowing more than one parser per request, do they all manipulate the same parsed_body? Could you describe a use-case for a "more than one parser per content-type based on a second header"? Could the same not be achieved by giving the parser full access to Rack's env or Request, so, option B.

I have a gut feeling that allowing multiple parsers per request would end up biting users who aren't too careful with matching headers, which could be a huge performance hit.

After writing this, and some consideration, I think we should go for option A AND B. But only allow one parser per request (pick the first match), and pass either env or Request (probably the latter) to the parser class instead of the body. Thoughts?

Input, ideas and PRs are very welcome.

aars commented 7 years ago

I've been cleaning up code, adding tests and preparing for a 1.0 release. I'd like to add this feature to the 1.0 release.

I have decided that, to allow access to headers in parsers, we will pass the env as a second argument to the parser. If needed, the parser can create it's own Rack::Request instance with it.

I haven't decided on how to handle the configuration yet. I still think we should never apply multiple parsers. I believe it is too prone to errors. If additional parsing is required it can be handled by the parser itself.

Please have a look at the release/1.0 branch, and use that branch for any PRs.

alexjfisher commented 7 years ago

It may be a couple of days before I can take a proper look, but thanks for your help and work.

aars commented 7 years ago

Sure, no problem.

Like I said, I'd like to include the feature in the 1.0 release. If you could let me know how you'd like to use it I'd happily implement it.

alexjfisher commented 7 years ago

Thanks, that's really generous. We're currently using rack-parser, but as you can see, not having access to the headers is leading to overly complicated code. Not sure if we'd use anything else from env, or whether only having the headers passed to us would be good enough. If people require something other than the body and headers, I guess they might be better off with their own middleware? To be honest, it's all new to me, so I'm probably not the best judge!

aars commented 7 years ago

Well, env is mostly headers anyway: http://www.rubydoc.info/github/rack/rack/file/SPEC

I'll add passing env to the parser. And won't bother with "option A" for now. I think that'll do.

aars commented 6 years ago

Done and released as v1.0.0.

Rack's env is now passed to your parser if it has a setter method for it, so respond_to? env=. A small example can be found in the updated README. So, it's optional (and backwards compatible)

I believe this is enough to handle "option A" as well. Where the main identifier for the parser is still the content-type, and parsers can decide on other parsing strategies when needed.