hapijs / hapi

The Simple, Secure Framework Developers Trust
https://hapi.dev

Add ability to create a custom body parser #4223

Open matthieusieben opened 3 years ago

matthieusieben commented 3 years ago

Support plan

Context

What problem are you trying to solve?

I want to create a plugin that enables transparent YAML parsing of request payloads. By transparent I mean that I would treat YAML as equivalent to JSON: YAML payloads would be parsed transparently and could be used by routes expecting JSON.

My current workaround consists of monkey patching Subtext.parse to do something like:

const Subtext = require('@hapi/subtext');

// isJsonMime(), isYamlMime() and parseYaml() are the plugin's own helpers
const origParse = Subtext.parse;
Subtext.parse = async function (req, tap, options) {

  const { payload, mime } = await origParse.call(this, req, tap, {
    ...options,
    allow: options.allow?.some(isJsonMime)
      ? [...options.allow, 'text/vnd.yaml', 'text/yaml', 'text/x-yaml', 'application/x-yaml']
      : options.allow,
  });

  if (options.parse === true && isYamlMime(mime)) {
    // Expose YAML as plain JSON
    return { payload: parseYaml(payload), mime: 'application/json' };
  }

  return { payload, mime };
};
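
For reference, a monkey-patch-free workaround also seems possible with the existing API: opt the route out of built-in parsing and convert the raw payload in an onPostAuth extension, which runs after the payload has been read but before payload validation. This is only a minimal sketch; parseYaml and the route are placeholders, and note that parse: false disables built-in JSON parsing as well, so JSON has to be handled in the extension too:

const yamlMimes = ['text/vnd.yaml', 'text/yaml', 'text/x-yaml', 'application/x-yaml'];

server.ext('onPostAuth', (request, h) => {

    if (!Buffer.isBuffer(request.payload)) {
        return h.continue;                              // route did not opt into raw payloads
    }

    const contentType = request.headers['content-type'] || '';
    const mime = contentType.split(';')[0].trim().toLowerCase();

    if (yamlMimes.includes(mime)) {
        request.payload = parseYaml(request.payload.toString('utf8'));
    }
    else if (/^application\/(?:.+\+)?json$/.test(mime)) {
        request.payload = JSON.parse(request.payload.toString('utf8'));    // parse: false skips built-in JSON parsing too
    }

    return h.continue;
});

server.route({
    method: 'POST',
    path: '/docs',                                      // placeholder route
    options: {
        payload: { parse: false, output: 'data' },      // payload arrives as a raw Buffer
        handler: (request) => request.payload
    }
});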

Do you have a new or modified API suggestion to solve the problem?

nope

Nargonath commented 3 years ago

It seems like an interesting feature, but I feel it would probably require some heavy work in the codebase. I'm not sure how wide the changes would need to be.

kanongil commented 3 years ago

Yes, this is probably a substantial change. If we look into this, we could also consider supporting something like protobuf, where some kind of schema is needed to parse it.

Nargonath commented 3 years ago

Your suggestion about protobuf rings a bell and reminds me of a vision that Eran expressed a while back, and that the TSC shares, regarding hapi becoming protocol agnostic rather than HTTP centric. That way it could support many different protocols (RPC, HTTP, etc.). I expect this rework to take place in a somewhat distant future, so perhaps there is something we can do to accommodate this in the meantime.

matthieusieben commented 3 years ago

Given that all that was required for me to achieve this was to monkey patch Subtext.parse, it might not be complicated to allow customizing the parser through a route option. I am not sure about the exact API, but I see a few places that could be adjusted to make this happen.

matthieusieben commented 3 years ago

Maybe something like a list of parsers to try before @hapi/subtext throws, when parse is true. The entries below illustrate the alternative shapes the mime matcher could take:

route.options.payload.parsers = [
  { mime: (mime) => mime === 'text/yaml', parse: (buffer, mime) => parseYaml(buffer.toString('utf8')) },  // predicate function
  { mime: 'text/yaml', parse: (buffer, mime) => parseYaml(buffer.toString('utf8')) },                     // exact string
  { mime: ['text/yaml'], parse: (buffer, mime) => parseYaml(buffer.toString('utf8')) },                   // list of strings
  { mime: /^text\/yaml$/, parse: (buffer, mime) => parseYaml(buffer.toString('utf8')) },                  // regular expression
];
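
For illustration only, subtext could normalize those different mime shapes into a single predicate and then try the parsers in order. Nothing like this exists in @hapi/subtext today; it is just a sketch of the matching step:

const toMatcher = (mime) => {

    if (typeof mime === 'function') {
        return mime;                               // predicate function
    }

    if (mime instanceof RegExp) {
        return (value) => mime.test(value);        // regular expression
    }

    if (Array.isArray(mime)) {
        return (value) => mime.includes(value);    // list of strings
    }

    return (value) => value === mime;              // exact string
};

const pickParser = (parsers, mime) => parsers.find((parser) => toMatcher(parser.mime)(mime));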
matthieusieben commented 3 years ago

This makes me think that it might actually make sense to completely extract the internals.object method from subtext and instead use a list of parsers passed in the options. This would improve separation of concerns for @hapi/subtext and reduce that module's external dependencies (allowing dependent modules to only "load" the parsers they actually require).

Hapi would have a default route.options.payload.parsers option for routes that would look like:

// Bourne, Boom and Querystring below refer to subtext's existing dependencies
// (@hapi/bourne, @hapi/boom and Node's querystring module)
[
  // Binary

  {
    mime: 'application/octet-stream',
    parse: (payload) => (payload.length ? payload : null),
  },

  // Text

  {
    mime: /^text\/.+$/,
    parse: (payload) => payload.toString('utf8'),
  },

  // JSON

  {
    mime: /^application\/(?:.+\+)?json$/,
    parse: (payload, mime, options) => {
      if (!payload.length) {
        return null
      }

      try {
        return Bourne.parse(payload.toString('utf8'), { protoAction: options.protoAction })
      } catch (err) {
        const error = Boom.badRequest('Invalid request payload JSON format', err)
        error.raw = payload
        throw error
      }
    },
  },

  // Form-encoded

  {
    mime: 'application/x-www-form-urlencoded',
    parse: (payload, mime, options) => {
      const parse = options.querystring || Querystring.parse
      return payload.length ? parse(payload.toString('utf8')) : {}
    },
  },
]
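
For illustration, with such an option a route could prepend a YAML entry while keeping the built-in behaviour. This is a hypothetical API; defaultParsers and parseYaml are placeholders:

server.route({
    method: 'POST',
    path: '/config',
    options: {
        payload: {
            parsers: [
                // Custom YAML parser tried before the defaults
                {
                    mime: /^(?:text|application)\/(?:x-|vnd\.)?ya?ml$/,
                    parse: (payload) => parseYaml(payload.toString('utf8'))
                },
                ...defaultParsers
            ]
        },
        handler: (request) => request.payload
    }
});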
matthieusieben commented 3 years ago

@Nargonath @kanongil Any thoughts on this ?

kanongil commented 3 years ago

The payload parsing could definitely do with a rework, and making it pluggable seems very reasonable.

I'm not sure that your specific suggestion works that well though. We can't really have a complicated default value. How would you even document that?

Maybe something more like the encoder / decoder registration, where you just list the parsers you want to enable?

server.parser('yaml', {
    match: ['text/vnd.yaml', 'text/yaml', 'text/x-yaml', 'application/x-yaml'],
    parse(payload, options) {
        …
    }
});

server.route({
    …,
    config: {
        payload: {
             parse: ['json', {
                 parser: 'yaml',
                 options: {
                     …
                 }
             }]
        }
    }
});

I guess the default would be: ['binary', 'text', 'json', 'qs']

Possibly the match could be omitted by doing a lookup in the server.mime db.
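
For illustration, the plumbing behind such a registration could mirror how server.decoder() / server.encoder() register named compressors: keep a registry of named parsers on the server and resolve the per-route parse list against it when the route is built. A rough, hypothetical sketch (nothing here is an existing hapi API):

const parserRegistry = new Map();

// server.parser('yaml', { match, parse }) would end up here
const registerParser = (name, definition) => {

    parserRegistry.set(name, definition);          // definition: { match, parse(payload, options) }
};

// Resolve route.options.payload.parse, e.g. ['json', { parser: 'yaml', options: { ... } }]
const resolveRouteParsers = (parseSetting) => {

    return parseSetting.map((entry) => {

        const name = typeof entry === 'string' ? entry : entry.parser;
        const options = typeof entry === 'string' ? {} : entry.options;
        const definition = parserRegistry.get(name);

        if (!definition) {
            throw new Error(`Unknown payload parser: ${name}`);
        }

        return { ...definition, options };
    });
};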

SK-FComputer commented 3 years ago

I've used patch-package to patch @hapi/subtext@7.0.3 for the project I'm working on, to add what is needed to support custom parsers.

Here is the diff that solved my problem:

subtext

diff --git a/node_modules/@hapi/subtext/lib/index.js b/node_modules/@hapi/subtext/lib/index.js
index 28b91de..888a432 100644
--- a/node_modules/@hapi/subtext/lib/index.js
+++ b/node_modules/@hapi/subtext/lib/index.js
@@ -107,7 +107,7 @@ internals.parse = async function (req, tap, options, contentType) {
     // Output: 'data'

     const payload = await Wreck.read(source, { timeout: options.timeout, maxBytes: options.maxBytes });
-    return internals.object(options, payload, contentType.mime);
+    return await internals.object(options, payload, contentType.mime);
 };

@@ -172,42 +172,73 @@ internals.raw = async function (req, tap, options) {
 };

-internals.object = function (options, payload, mime) {
-
-    // Binary
-
-    if (mime === 'application/octet-stream') {
-        return payload.length ? payload : null;
-    }
-
-    // Text
+internals.object = async function (options, payload, mime) {
+
+    // Default parsers
+    let parsers = [
+        {
+            // Binary
+            mime: 'application/octet-stream',
+            parse: (unparsedPayload) =>
+                unparsedPayload.length ? unparsedPayload : null,
+        },
+        {
+            // Text
+            mime: /^text\/.+$/,
+            parse: (unparsedPayload) => unparsedPayload.toString('utf8'),
+        },
+        {
+            // JSON
+            mime: /^application\/(?:.+\+)?json$/,
+            parse: (unparsedPayload) => {
+                if (!unparsedPayload.length) {
+                    return null;
+                }

-    if (mime.match(/^text\/.+$/)) {
-        return payload.toString('utf8');
+                try {
+                    return Bourne.parse(unparsedPayload.toString('utf8'), {
+                        protoAction: options.protoAction,
+                    });
+                } catch (err) {
+                    const error = Boom.badRequest(
+                        'Invalid request payload JSON format',
+                        err
+                    );
+                    error.raw = payload;
+                    throw error;
+                }
+            },
+        },
+        {
+            // Form-encoded
+            mime: 'application/x-www-form-urlencoded',
+            parse: (unparsedPayload) => {
+                const parse = options.querystring || Querystring.parse;
+                return unparsedPayload.length
+                    ? parse(unparsedPayload.toString('utf8'))
+                    : {};
+            },
+        },
+    ];
+
+    if (options.customParsers && options.customParsers.length) {
+        parsers = options.customParsers.concat(parsers);
     }

-    // JSON
-
-    if (/^application\/(?:.+\+)?json$/.test(mime)) {
-        if (!payload.length) {
-            return null;
-        }
-
-        try {
-            return Bourne.parse(payload.toString('utf8'), { protoAction: options.protoAction });
-        }
-        catch (err) {
-            const error = Boom.badRequest('Invalid request payload JSON format', err);
-            error.raw = payload;
-            throw error;
+    const parser = parsers.find((parser) => {
+        if (parser.mime) {
+            if (parser.mime instanceof RegExp) {
+                return parser.mime.test(mime);
+            } else if (typeof parser.mime === 'string') {
+                return parser.mime === mime;
+            }
         }
-    }

-    // Form-encoded
+        return false;
+    });

-    if (mime === 'application/x-www-form-urlencoded') {
-        const parse = options.querystring || Querystring.parse;
-        return payload.length ? parse(payload.toString('utf8')) : {};
+    if (parser) {
+        return await parser.parse(payload);
     }

     const error = Boom.unsupportedMediaType();

hapi

diff --git a/node_modules/@hapi/hapi/lib/config.js b/node_modules/@hapi/hapi/lib/config.js
index 5279a18..9cd765e 100644
--- a/node_modules/@hapi/hapi/lib/config.js
+++ b/node_modules/@hapi/hapi/lib/config.js
@@ -142,6 +142,7 @@ internals.routeBase = Validate.object({
     payload: Validate.object({
         output: Validate.valid('data', 'stream', 'file').default('data'),
         parse: Validate.boolean().allow('gunzip').default(true),
+        customParsers: Validate.array().items(Validate.object({mime: Validate.alternatives(Validate.object(), Validate.string()), parse: Validate.function()})),
         multipart: Validate.object({
             output: Validate.valid('data', 'stream', 'file', 'annotated').required()
         })

@types/hapi

diff --git a/node_modules/@types/hapi__hapi/index.d.ts b/node_modules/@types/hapi__hapi/index.d.ts
index 564ead8..acc7273 100644
--- a/node_modules/@types/hapi__hapi/index.d.ts
+++ b/node_modules/@types/hapi__hapi/index.d.ts
@@ -1306,6 +1306,18 @@ export type PayloadOutput = 'data' | 'stream' | 'file';
  */
 export type PayloadCompressionDecoderSettings = object;

+
+export interface RouteOptionsPayloadCustomParser {
+     /**
+     * A string or a RegExp instance to test with content-type mime of request.
+     */
+    mime: string | RegExp;
+    /**
+     * A function that takes the request payload as a Buffer, and returns the desired output.
+     */
+    parse: (unparsedPayload: Buffer) => Promise<any>;
+}
+
 /**
  * Determines how the request payload is processed.
  * [See docs](https://github.com/hapijs/hapi/blob/master/API.md#-routeoptionspayload)
@@ -1320,7 +1332,7 @@ export interface RouteOptionsPayload {
      * * multipart/form-data
      * * text/*
      * A string or an array of strings with the allowed mime types for the endpoint. Use this settings to limit the set of allowed mime types. Note that allowing additional mime types not listed
-     * above will not enable them to be parsed, and if parse is true, the request will result in an error response.
+     * above will not enable them to be parsed, and if parse is true, the request will result in an error response if a custom parser is not defined.
      * [See docs](https://github.com/hapijs/hapi/blob/master/API.md#-routeoptionspayloadallow)
      */
     allow?: string | string[] | undefined;
@@ -1403,6 +1415,12 @@ export interface RouteOptionsPayload {
      */
     parse?: boolean | 'gunzip' | undefined;

+    /**
+     * @default none.
+     * Add custom parsers for unsupported content-types.
+     */
+    customParsers?: RouteOptionsPayloadCustomParser[] | undefined;
+
     /**
      * @default to 10000 (10 seconds).
      * Payload reception timeout in milliseconds. Sets the maximum time allowed for the client to transmit the request payload (body) before giving up and responding with a Request Timeout (408)
SK-FComputer commented 3 years ago

@kanongil any thoughts ?

kanongil commented 3 years ago

@SK-FComputer Yes, I think that you are spamming, posting this 3 places. Just post once, and link it from other issues. Also, it's quite rude to @ people for an immediate response when there is no urgency.

SK-FComputer commented 3 years ago

@kanongil Sorry for spamming; where should I leave this comment so that I can link to it? The two other places are in a different git repository than this one, and I was leaving those comments for other people to use if they ever find them.

Sorry, but I don't see tagging as rude, unless it's tagging someone at random. Your last response was an hour ago, so I thought a tag was warranted, and you could answer when the time is right.

devinivy commented 3 years ago

Hey @matthieusieben do you have any thoughts or feedback on your and Gil's proposals so far? I think it would be nice to plan out this work, and (with hapi v21 in mind) decide on whether we think there would need to be any breaking changes to enable this.

Ref. https://github.com/hapijs/subtext/issues/90

matthieusieben commented 3 years ago

In my opinion, @kanongil 's proposal looks great. It has the advantage of being much less verbose. Plugins could cause conflicts (e.g. two different plugins declare a yaml parser but with different match/parse()), but I don't think that's really a problem.

Additionally, this proposal seems compatible with my suggestion in https://github.com/hapijs/subtext/issues/90 (HapiJS would build the list of parsers passed as option to subtext, on a per-route basis).

matthieusieben commented 3 years ago

I would add a way to allow a parser to be used globally (i.e. add a parser to the list of "default" parsers used by a server), similar to how we can define auth methods (server default and/or route based). A rough sketch of that analogy follows.
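
Purely to illustrate the analogy with auth strategies (none of these parser APIs exist today): register once, then declare a server-wide default the same way server.auth.default() complements server.auth.strategy():

// Hypothetical API sketch; parseYaml is a placeholder
server.parser('yaml', {
    match: ['text/vnd.yaml', 'text/yaml', 'text/x-yaml', 'application/x-yaml'],
    parse: (payload) => parseYaml(payload.toString('utf8'))
});

// Applied to every route unless route.options.payload.parse overrides it
server.parser.default(['binary', 'text', 'json', 'qs', 'yaml']);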

SK-FComputer commented 3 years ago

@matthieusieben @devinivy so I've hit another case where this is "a problem". When uploading a file with these options:

options: {
    payload: {
        allow: 'multipart/form-data',
        parse: true,
        output: 'data',
        multipart: { output: 'annotated' },
    },
},

The file is "parsed" through the default parsers, and if a parser is not defined, a 415 Unsupported Media Type is returned. For example, XML files can hit application/xml, as seen in the file's annotated parts.

SK-FComputer commented 3 years ago

Also, this is caught in a try/catch, so with my patch https://github.com/hapijs/hapi/issues/4223#issuecomment-956185628 (which I have expanded on further, @ me for updates if needed), https://github.com/hapijs/subtext/blob/035f4ae9e33afeb25277be0f53759472c128aca5/lib/index.js#L383-L390

will catch a parser error (e.g. XML missing a closing tag) and return a buffer to the request's validate step if defined, and then to the request handler, instead of an error to failAction in options.payload.failAction.

kanongil commented 3 years ago

It seems that you found a minor bug. Unknown content-types are documented to return a 400 Bad Request response here: https://hapi.dev/api?v=20.2.0#-routeoptionspayloadparse - but in reality a 415 response is returned.

I would say that the existing implementation is correct, and that the docs need fixing.

SK-FComputer commented 3 years ago

@kanongil The 415 error code is not returned when parsing fails with the settings from https://github.com/hapijs/hapi/issues/4223#issuecomment-957903018, and failAction is never hit, which I think it should be, unless something else is expected there.

SK-FComputer commented 2 years ago

I guess this will never be implemented. Just FYI, there is a new version of hapi now, and this should be possible.

Here is the diff that solved my problem:

diff --git a/node_modules/@hapi/subtext/lib/index.js b/node_modules/@hapi/subtext/lib/index.js
index bd6fa72..6bd863f 100755
--- a/node_modules/@hapi/subtext/lib/index.js
+++ b/node_modules/@hapi/subtext/lib/index.js
@@ -175,44 +175,81 @@ internals.raw = async function (req, tap, options) {
 };

-internals.object = function (options, payload, mime) {

-    // Binary

-    if (mime === 'application/octet-stream') {
-        return payload.length ? payload : null;
-    }
-
-    // Text
-
-    if (mime.match(/^text\/.+$/)) {
-        return payload.toString('utf8');
+internals.object = async function (options, payload, mime) {        
+    // Default parsers
+    let parsers = [
+        {
+            // Binary
+            mime: 'application/octet-stream',
+            parse: (unparsedPayload) =>
+                unparsedPayload.length ? unparsedPayload : null,
+        },
+        {
+            // Text
+            mime: /^text\/.+$/,
+            parse: (unparsedPayload) => unparsedPayload.toString('utf8'),
+        },
+        {
+            // JSON
+            mime: /^application\/(?:.+\+)?json$/,
+            parse: (unparsedPayload) => {
+                if (!unparsedPayload.length) {
+                    return null;
+                }
+                try {
+                    return Bourne.parse(unparsedPayload.toString('utf8'), {
+                        protoAction: options.protoAction,
+                    });
+                } catch (err) {
+                    const error = Boom.badRequest(
+                        'Invalid request payload JSON format',
+                        err
+                    );
+                    error.raw = payload;
+                    throw error;
+                }
+            },
+        },
+        {
+            // Form-encoded
+            mime: 'application/x-www-form-urlencoded',
+            parse: (unparsedPayload) => {
+                const parse = options.querystring || Querystring.parse;
+                return unparsedPayload.length
+                    ? parse(unparsedPayload.toString('utf8'))
+                    : {};
+            },
+        },
+    ];
+
+    if (options.customParsers && options.customParsers.length) {
+        parsers = options.customParsers.concat(parsers);
     }

-    // JSON
-
-    if (/^application\/(?:.+\+)?json$/.test(mime)) {
-        if (!payload.length) {
-            return null;
+    const parser = parsers.find((parser) => {
+        if (parser.mime) {
+            if (parser.mime instanceof RegExp) {
+                return parser.mime.test(mime);
+            } else if (typeof parser.mime === 'string') {
+                return parser.mime === mime;
+            }
         }
+        return false;
+    });

+    if (parser) {
         try {
-            return Bourne.parse(payload.toString('utf8'), { protoAction: options.protoAction });
-        }
-        catch (err) {
-            const error = Boom.badRequest('Invalid request payload JSON format', err);
-            error.raw = payload;
-            throw error;
+            return await parser.parse(payload);
+        } catch (error) {
+            console.error(error);
+            const boomError = Boom.badRequest(`Error parsing data with content-type ${mime}, check content for errors!`)
+            boomError.raw = payload;
+            throw boomError;
         }
     }

-    // Form-encoded
-
-    if (mime === 'application/x-www-form-urlencoded') {
-        const parse = options.querystring ?? Querystring.parse;
-        return payload.length ? parse(payload.toString('utf8')) : {};
-    }
-
     const error = Boom.unsupportedMediaType();
     error.raw = payload;
     throw error;
@@ -384,7 +421,7 @@ internals.part = async function (part, output, set, options) {
     }

     try {
-        const object = internals.object(options, payload, mime);
+        const object = await internals.object(options, payload, mime);
         annotate(object);
     }
     catch (err) {

This issue body was partially generated by patch-package.

SK-FComputer commented 2 years ago

And this for @hapi/hapi:

diff --git a/node_modules/@hapi/hapi/lib/config.js b/node_modules/@hapi/hapi/lib/config.js
index e3768ae..cd6c44f 100755
--- a/node_modules/@hapi/hapi/lib/config.js
+++ b/node_modules/@hapi/hapi/lib/config.js
@@ -144,6 +144,7 @@ internals.routeBase = Validate.object({
     payload: Validate.object({
         output: Validate.valid('data', 'stream', 'file').default('data'),
         parse: Validate.boolean().allow('gunzip').default(true),
+        customParsers: Validate.array().items(Validate.object({mime: Validate.alternatives(Validate.object(), Validate.string()), parse: Validate.function()})),
         multipart: Validate.object({
             output: Validate.valid('data', 'stream', 'file', 'annotated').required()
         })
matthieusieben commented 2 years ago

Looks nice !

SK-FComputer commented 2 years ago

Has been nice for 1 year now, and still nothing.


matthieusieben commented 2 years ago

Did you make a PR?

SK-FComputer commented 2 years ago

I was rejected at the time, and have since just used patch-package.

SK-FComputer commented 1 year ago

Almost another year has gone by and I still don't see a way to do this... 2 years running now.

I personally can't upgrade multiple projects because of this issue, and the time has almost come to refactor some of them. So if this isn't implemented soonish, I would rather rewrite in some other API framework than figure out what is breaking my patches.