deliverr / data-knex-snowflake-dialect

knex.js dialect for the Snowflake data warehouse
MIT License
13 stars 26 forks source link

Support for variant / semi-structured / JSON columns #13

Open mingp opened 3 years ago

mingp commented 3 years ago

Snowflake supports a column type called variant, which supports JSON and other JSON-like semi-structured data. One syntax to be able to index a variant column is by using the colon character, what the Snowflake documentation calls dot notation, per docs here.

As far as I can tell, there is no current support for this notation. If, in a where, I try to specify a field as this dot notation, e.g. foo:bar, then the query compiler mistakenly tries to quote this as "foo:bar", when the correct quotation is supposed to be "foo":"bar". This results in an error.

I've found that Knex supports a custom wrapIdentifier, per docs here. This is definitely one possible work-around. As far as I can tell, knex-snowflake-dialect already supplies its own wrapIdentifier implementation, but it doesn't support dot notation. Hence, the above issue.

I'm working on seeing if I can get a work-around to work locally. Perhaps, however, this should be integrated into the library.

Let me know your thoughts. Thanks.

emurphy commented 3 years ago

Thanks again for your input @mingp. Our knex Snowflake queries have been simple thus far, and so it would not surprise me if variant dot notation is not supported. Will appreciate if you can add a unit test and fix for it. 👍

The wrapIdentifier override may be the way to go.

mingp commented 3 years ago

My current work-around, using wrapIdentifier, is as follows.

        const wrapIdentifier = (
            value: string,
            _origImpl: (value: string) => string,
            _queryContext: any,
        ): string => {
            if (value === '*') {
                return value;
            }
            const maybeMatchColonSyntax = /([A-Za-z0-9_]+):([A-Za-z0-9_]+)/.exec(value);
            if (maybeMatchColonSyntax) {
                return `"${maybeMatchColonSyntax[1]}":"${maybeMatchColonSyntax[2]}"`;
            }
            return `"${value}"`;
        };

This seems to work correctly for all the queries I am using, but I doubt it is general-case correct or covers all corner cases.