SuaveIO / suave

Suave is a simple web development F# library providing a lightweight web server and a set of combinators to manipulate route flow and task composition.
https://suave.io
Other
1.32k stars 198 forks source link

Using RegEx for parsing the boundary from the ContentType header. #760

Closed eriadam closed 2 years ago

eriadam commented 2 years ago

As detailed in the issue #759, the current parsing of the boundary does not respect the charset property.

The PR proposes a fix to use a regular expression to parse the boundary from the ContentType header value based on RFC2046. It allows alphanumeric characters, punctuations and spaces (except at the end). Quotation marks seem to be optional as well.

let parseBoundary contentType =
  let pattern = "boundary=\"?([a-zA-Z0-9'\(\)+_,-.\/:=? ]*)(?<! )\"?"
  Regex.Match(contentType, pattern).Groups.[1].Value
| Choice1Of2 ce when String.startsWith "multipart/form-data" ce ->
  let boundary = "--" + parseBoundary ce

Changed for "multipart/form-data" and "multipart/mixed" in the code.


Input:

Content-Type = "multipart/form-data; charset=utf-8; boundary=__X_PAW_BOUNDARY__"

Original result:

 --utf-8; boundary=__X_PAW_BOUNDARY__

New results:

'multipart/form-data; charset=utf-8; boundary=__X_PAW_BOUNDARY__' => __X_PAW_BOUNDARY__
'multipart/form-data; boundary=__X_PAW_BOUNDARY__' => __X_PAW_BOUNDARY__
'multipart/form-data; boundary=__X_PAW_BOUNDARY__ charset=utf-8;' => __X_PAW_BOUNDARY__
'multipart/form-data;' => (null)
ademar commented 2 years ago

Hi @eriadam

Your PR as it is now it breaks a bunch of tests. The problem is the pattern you are using only selects word characters while boundaries can include all sorts of non white space characters. I think the correct regex pattern would be "boundary=(\S*)" instead.

Also for completeness it would be nice if you would include a test case.

Thanks!

eriadam commented 2 years ago

Hi @ademar, thanks, yes, I've seen, I will look into it in the evening.

eriadam commented 2 years ago

Hi @ademar , I updated the solution, please have a look if it is OK like this. Cheers

ademar commented 2 years ago

It looks great, thanks very much. I'll push a new nuget version with these changes tomorrow.