Currently, deparse represents most string literals escaped only by single-quote doubling.
If special characters like backslashes occur, they're output literally, possibly resulting in invalid data. (How and whether those backslashes are interpreted is configuration-dependent).
Comment statements (COMMENT ON TABLE "foo" IS …) are, for some reason, the only string literals that are rendered with an attempt to use C-style "escape" strings (E'foo'), which consistently interpret backslash-escaping.
However, the rendering logic for those is completely wrong. It doesn't do the requisite quote doubling (or, alternatively, quote-escaping), and instead of doubling backslashes, it replaces them with the same number of them.
Effectively comments are rendered worse than unescaped, since they both still open for single-quote mangling, and additionally opened up for backslash interpolation.
That means that a validly written comment like:
COMMENT ON COLUMN "foo"."whatever" IS $$
Something blah, this data may have chars like '\n' and '\r' in it.
$$;
Gets rewritten as:
COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like '\n' and '\r' in it.
';
which is completely invalid syntax since the single quotes close on the single quote before the [\, n] chars.
If you change that text to '\n' to "\n" (e.g. to pretend that the single quotes had been correctly escaped), then it outputs as:
COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like "\n" and "\r" in it.
';
This parses, but not as the original comment. When re-parsed, becomes:
COMMENT ON COLUMN "foo"."whatever" IS E'
Something blah, this column may have chars like "
" and "
" in it.
';
Instead, Deparser.escape should be written something like:
class Deparser {
// ...
escape(str: string): string {
// If the string contains backslashes, render an escape string using the E'…' syntax.
// Single quotes and backslashes are escaped with a backslash.
if (str.includes('\\')) {
return `E'${str.replaceAll(/[\\']/g, '\\$&')}'`;
}
// No backslashes: double any internal single quotes.
return `'${str.replaceAll(/'/g, '$&$&')}'`;
}
escapeNull(str: string | null | undefined): string {
return str == null ? 'NULL' : this.escape(str);
}
}
Currently,
deparse
represents most string literals escaped only by single-quote doubling. If special characters like backslashes occur, they're output literally, possibly resulting in invalid data. (How and whether those backslashes are interpreted is configuration-dependent).Comment statements (
COMMENT ON TABLE "foo" IS …
) are, for some reason, the only string literals that are rendered with an attempt to use C-style "escape" strings (E'foo'), which consistently interpret backslash-escaping.However, the rendering logic for those is completely wrong. It doesn't do the requisite quote doubling (or, alternatively, quote-escaping), and instead of doubling backslashes, it replaces them with the same number of them.
Effectively comments are rendered worse than unescaped, since they both still open for single-quote mangling, and additionally opened up for backslash interpolation.
That means that a validly written comment like:
Gets rewritten as:
which is completely invalid syntax since the single quotes close on the single quote before the [
\
,n
] chars.If you change that text to
'\n'
to"\n"
(e.g. to pretend that the single quotes had been correctly escaped), then it outputs as:This parses, but not as the original comment. When re-parsed, becomes:
Instead,
Deparser.escape
should be written something like:and the relevant lines in
CommentStmt
(https://github.com/launchql/pgsql-parser/blob/f1df82ed4358e47c682e007bc5aa306b58f25514/packages/deparser/src/deparser.ts#L1163-L1177) should be: