whitlockjc / json-refs

Various utilities for JSON Pointers (http://tools.ietf.org/html/rfc6901) and JSON References (http://tools.ietf.org/html/draft-pbryan-zyp-json-ref-03).
MIT License
223 stars 63 forks source link

JsonRefs.resolveRefs fails to dereference some URI encoded fragments #186

Closed wickedest closed 3 years ago

wickedest commented 3 years ago

Expected behavior JsonRefs.resolveRefs should correctly dereference strict RFC-3986 encoded URI fragments.

Actual behavior URI fragments containing reserved gen-delims and sub-delims RFC-3986 gen-delims characters do not seem to dereference correctly. The behavior is inconsistent, in some cases, a URI with a percent-encoded gen-delim will resolve (e.g. [), but another gen-delim (e.g. '@'), it will not. Similarly, for sub-delims, ( will resolve, but ! will not.

Whether or not the fragment should or should not be encoded in the first place is potentially arguable. I recall somewhere that Swagger requires strict RFC-3986 encoding. The issue is that some percent encoded characters are not being decoded when resolving the reference.

Steps to reproduce

Note that example1 demonstrates the issue, and that example2 just tries to get a sense of the restricted characters.

const JsonRefs = require('json-refs');

function encodeURIComponentRFC3986(str) {
    return encodeURIComponent(str).replace(/[!'()*]/g, function(c) {
        return '%' + c.charCodeAt(0).toString(16);
    });
}

async function example1() {
    const schema = {
        entity: {
            schema1: {
                $ref: '#/definitions/%40other'
            },
            schema2: {
                $ref: '#/definitions/@oth%65r'
            }
        },
        definitions: {
            '@other': {
                type: 'string'
            },
            '?other': {
                type: 'string'
            }
        }
    };
    await JsonRefs.resolveRefs(schema).then(res => {
        console.log(res.resolved.entity);
    }).catch(console.error);
}

async function example2() {
    const gendelims = [':', '/', '?', '#', '[', ']', '@'];
    const subdelims = ['!', '$', '&', '\'', '(', ')', '*', '+', ',', ';', '='];

    [...gendelims, ...subdelims].forEach(async delim => {
        const edelim = encodeURIComponentRFC3986(delim);
        const schema = {
            entity: {
                schema: {
                    $ref: `#/definitions/${edelim}other`
                }
            },
            definitions: {
                [`${delim}other`]: {
                    type: 'string'
                }
            }
        };
        await JsonRefs.resolveRefs(schema).then(res => {
            console.log(
                `delim: '${delim}' (${edelim})`,
                gendelims.includes(delim) ? 'gen-delim' : 'sub-delim',
                'is resolved OK:',
                !res.resolved.entity.schema.hasOwnProperty('$ref')
            );
        }).catch(console.error)
    });
}

(async () => {
    await example1();
    await example2();
})();

Outputs:

{
  schema1: { '$ref': '#/definitions/%40other' },
  schema2: { type: 'string' }
}
delim: ':' (%3A) gen-delim is resolved OK: false
delim: '/' (%2F) gen-delim is resolved OK: false
delim: '?' (%3F) gen-delim is resolved OK: false
delim: '#' (%23) gen-delim is resolved OK: false
delim: '@' (%40) gen-delim is resolved OK: false
delim: '$' (%24) sub-delim is resolved OK: false
delim: '&' (%26) sub-delim is resolved OK: false
delim: '+' (%2B) sub-delim is resolved OK: false
delim: ',' (%2C) sub-delim is resolved OK: false
delim: ';' (%3B) sub-delim is resolved OK: false
delim: '=' (%3D) sub-delim is resolved OK: false
delim: '[' (%5B) gen-delim is resolved OK: true
delim: ']' (%5D) gen-delim is resolved OK: true
delim: '!' (%21) sub-delim is resolved OK: true
delim: ''' (%27) sub-delim is resolved OK: true
delim: '(' (%28) sub-delim is resolved OK: true
delim: ')' (%29) sub-delim is resolved OK: true
delim: '*' (%2a) sub-delim is resolved OK: true

Comments Digging around a bit, it appears uri-js does not always fully decode fragments (fair enough, it's a parser), so a fragment needs to be decoded if it's going to be used for dereferencing. However, parts of the library seem to use decodeURI on fragments instead of decodeURIComponent as I would expect. This looks suspicious too.

For example:

> decodeURI('#/definitions/%40other')
< "#/definitions/%40other"

Whereas:

> decodeURIComponent('#/definitions/%40other')
< "#/definitions/@other"