hapijs / subtext

HTTP payload parser
Other
24 stars 25 forks source link

Increase separation of concerns and configurability of parsing #90

Open matthieusieben opened 3 years ago

matthieusieben commented 3 years ago

Support plan

Context

What problem are you trying to solve?

I want to be able to add custom parsing capabilities to my Hapi server.

While investigating about how this could be achieved, I came across the hard coded list of parsers in this module.

It would be very convenient to be able to be able to pass the list of parsers as an option. IMHO the default list of parsers should not even be coded in this module at all (though I understand this would be a big breaking change).

Do you have a new or modified API suggestion to solve the problem?

Be able to provide a custom list of parsers in the options. The default parsers would look like:

[
  // Binary

  {
    mime: 'application/octet-stream',
    parse: (payload) => (payload.length ? payload : null),
  },

  // Text

  {
    mime: /^text\/.+$/,
    parse: (payload) => payload.toString('utf8'),
  },

  // JSON

  {
    mime: /^application\/(?:.+\+)?json$/,
    parse: (payload, mime, options) => {
      if (!payload.length) {
        return null
      }

      try {
        return Bourne.parse(payload.toString('utf8'), { protoAction: options.protoAction })
      } catch (err) {
        const error = Boom.badRequest('Invalid request payload JSON format', err)
        error.raw = payload
        throw error
      }
    },
  },

  // Form-encoded

  {
    mime: 'application/x-www-form-urlencoded',
    parse: (payload, mime, options) => {
      const parse = options.querystring || Querystring.parse
      return payload.length ? parse(payload.toString('utf8')) : {}
    },
  },
]
matthieusieben commented 3 years ago

An additional argument in favor of this would be the ability to check for the availability of a parser (and throw an unsupportedMediaType) before even starting to consume the stream's body (potentially saving a lot of resources).

SK-FComputer commented 3 years ago

92

SK-FComputer commented 3 years ago

@matthieusieben have you made any progress on this ?

SK-FComputer commented 3 years ago

I've used patch-package to patch @hapi/subtext@7.0.3 for the project I'm working on, to create what is needed to support custom parsers.

Here is the diff that solved my problem:

subtext

diff --git a/node_modules/@hapi/subtext/lib/index.js b/node_modules/@hapi/subtext/lib/index.js
index 28b91de..888a432 100644
--- a/node_modules/@hapi/subtext/lib/index.js
+++ b/node_modules/@hapi/subtext/lib/index.js
@@ -107,7 +107,7 @@ internals.parse = async function (req, tap, options, contentType) {
     // Output: 'data'

     const payload = await Wreck.read(source, { timeout: options.timeout, maxBytes: options.maxBytes });
-    return internals.object(options, payload, contentType.mime);
+    return await internals.object(options, payload, contentType.mime);
 };

@@ -172,42 +172,73 @@ internals.raw = async function (req, tap, options) {
 };

-internals.object = function (options, payload, mime) {
-
-    // Binary
-
-    if (mime === 'application/octet-stream') {
-        return payload.length ? payload : null;
-    }
-
-    // Text
+internals.object = async function (options, payload, mime) {
+
+    // Default parsers
+    let parsers = [
+        {
+            // Binary
+            mime: 'application/octet-stream',
+            parse: (unparsedPayload) =>
+                unparsedPayload.length ? unparsedPayload : null,
+        },
+        {
+            // Text
+            mime: /^text\/.+$/,
+            parse: (unparsedPayload) => unparsedPayload.toString('utf8'),
+        },
+        {
+            // JSON
+            mime: /^application\/(?:.+\+)?json$/,
+            parse: (unparsedPayload) => {
+                if (!unparsedPayload.length) {
+                    return null;
+                }

-    if (mime.match(/^text\/.+$/)) {
-        return payload.toString('utf8');
+                try {
+                    return Bourne.parse(unparsedPayload.toString('utf8'), {
+                        protoAction: options.protoAction,
+                    });
+                } catch (err) {
+                    const error = Boom.badRequest(
+                        'Invalid request payload JSON format',
+                        err
+                    );
+                    error.raw = payload;
+                    throw error;
+                }
+            },
+        },
+        {
+            // Form-encoded
+            mime: 'application/x-www-form-urlencoded',
+            parse: (unparsedPayload) => {
+                const parse = options.querystring || Querystring.parse;
+                return unparsedPayload.length
+                    ? parse(unparsedPayload.toString('utf8'))
+                    : {};
+            },
+        },
+    ];
+
+    if (options.customParsers && options.customParsers.length) {
+        parsers = options.customParsers.concat(parsers);
     }

-    // JSON
-
-    if (/^application\/(?:.+\+)?json$/.test(mime)) {
-        if (!payload.length) {
-            return null;
-        }
-
-        try {
-            return Bourne.parse(payload.toString('utf8'), { protoAction: options.protoAction });
-        }
-        catch (err) {
-            const error = Boom.badRequest('Invalid request payload JSON format', err);
-            error.raw = payload;
-            throw error;
+    const parser = parsers.find((parser) => {
+        if (parser.mime) {
+            if (parser.mime instanceof RegExp) {
+                return parser.mime.test(mime);
+            } else if (parser.mime instanceof String) {
+                return parser.mime === mime;
+            }
         }
-    }

-    // Form-encoded
+        return false;
+    });

-    if (mime === 'application/x-www-form-urlencoded') {
-        const parse = options.querystring || Querystring.parse;
-        return payload.length ? parse(payload.toString('utf8')) : {};
+    if (parser) {
+        return await parser.parse(payload);
     }

     const error = Boom.unsupportedMediaType();

hapi

diff --git a/node_modules/@hapi/hapi/lib/config.js b/node_modules/@hapi/hapi/lib/config.js
index 5279a18..9cd765e 100644
--- a/node_modules/@hapi/hapi/lib/config.js
+++ b/node_modules/@hapi/hapi/lib/config.js
@@ -142,6 +142,7 @@ internals.routeBase = Validate.object({
     payload: Validate.object({
         output: Validate.valid('data', 'stream', 'file').default('data'),
         parse: Validate.boolean().allow('gunzip').default(true),
+        customParsers: Validate.array().items(Validate.object({mime: Validate.alternatives(Validate.object(), Validate.string()), parse: Validate.function()})),
         multipart: Validate.object({
             output: Validate.valid('data', 'stream', 'file', 'annotated').required()
         })

@types/hapi

diff --git a/node_modules/@types/hapi__hapi/index.d.ts b/node_modules/@types/hapi__hapi/index.d.ts
index 564ead8..acc7273 100644
--- a/node_modules/@types/hapi__hapi/index.d.ts
+++ b/node_modules/@types/hapi__hapi/index.d.ts
@@ -1306,6 +1306,18 @@ export type PayloadOutput = 'data' | 'stream' | 'file';
  */
 export type PayloadCompressionDecoderSettings = object;

+
+export interface RouteOptionsPayloadCustomParser {
+     /**
+     * A string or a RegExp instance to test with content-type mime of request.
+     */
+    mime: string | RegExp;
+    /**
+     * A function that takes the request payload as a Buffer, and returns the desired output.
+     */
+    parse: (unparsedPayload: Buffer) => Promise<any>;
+}
+
 /**
  * Determines how the request payload is processed.
  * [See docs](https://github.com/hapijs/hapi/blob/master/API.md#-routeoptionspayload)
@@ -1320,7 +1332,7 @@ export interface RouteOptionsPayload {
      * * multipart/form-data
      * * text/*
      * A string or an array of strings with the allowed mime types for the endpoint. Use this settings to limit the set of allowed mime types. Note that allowing additional mime types not listed
-     * above will not enable them to be parsed, and if parse is true, the request will result in an error response.
+     * above will not enable them to be parsed, and if parse is true, the request will result in an error response if a custom parser is not defined.
      * [See docs](https://github.com/hapijs/hapi/blob/master/API.md#-routeoptionspayloadallow)
      */
     allow?: string | string[] | undefined;
@@ -1403,6 +1415,12 @@ export interface RouteOptionsPayload {
      */
     parse?: boolean | 'gunzip' | undefined;

+    /**
+     * @default none.
+     * Add custom parsers for unsupported content-types.
+     */
+    customParsers?: RouteOptionsPayloadCustomParser[] | undefined;
+
     /**
      * @default to 10000 (10 seconds).
      * Payload reception timeout in milliseconds. Sets the maximum time allowed for the client to transmit the request payload (body) before giving up and responding with a Request Timeout (408)
devinivy commented 3 years ago

@SK-FComputer thanks for offering that patch— it's appreciated. We are going to plan out the hapi feature a bit more in https://github.com/hapijs/hapi/issues/4223, and once that is done then PRs will be welcome.