crim-ca / weaver

Weaver: Workflow Execution Management Service (EMS); Application, Deployment and Execution Service (ADES); OGC API - Processes; WPS; CWL Application Package
https://pavics-weaver.readthedocs.io
Apache License 2.0
23 stars 6 forks source link

Weaver passthrough all HTTP headers from client's request up to the providers #600

Closed perronld closed 4 months ago

perronld commented 4 months ago

Describe the bug

When a client POST a task to Weaver, all HTTP headers from the POST request is passed as-is to the WPS provider, which can lead into unexpected behavior and potential security issue.

How to Reproduce

If the WPS provider is hosted behind Cloudflare, Cloudflare will reply with a 400 error with no reason or message because the Host header is invalid.

Expected behavior

Only relevant headers should be passed-through to WPS provider, such as Authentication or Tokens that needs to be passed to the WPS provider for the task to run successfully. A .ini entry such as weaver.wps_client_headers_whitelist should be added.

fmigneault commented 4 months ago

Sounds good. Do you know of any other headers than Host that could be a problem? All known problematic headers could be provided by default in a filtering list. One issue I can foresee with the whitelist approach is that some extra headers that could be relevant might get lost, such as X-WPS-Output-Context.

perronld commented 4 months ago

Uhmm... regex whitelist? '^(X-WPS-.*|Authorization)$' ?

fmigneault commented 4 months ago

I'm more concerned about over-filtering headers and breaking some obscure HTTP functionality. Removing headers that are known to cause problems is "safer" IMO.

perronld commented 4 months ago

Weaver can add its own HTTP headers when required. Since Weaver is a workflow manager rather than an HTTP proxy, the HTTP headers shouldn't be blindly passed from a client to a provider.

fmigneault commented 4 months ago

In the case of WPS providers, it somewhat acts as a proxy with a conversion layer on top for WPS <-> OGC API - Processes representation. In this case, the cookies and auth details must be passed down to the provider service. Since Weaver cannot know which headers are intended for the downstream WPS, it currently passes all of them. We can however filter out the ones that are known to cause problems like Host.

perronld commented 4 months ago

Ok then, would weaver.wps_client_headers_denylist be a good wording for the option?

fmigneault commented 4 months ago

Yes. And it would have Host only by default to start with.