andialbrecht / sqlparse

A non-validating SQL parser module for Python
BSD 3-Clause "New" or "Revised" License
3.63k stars 685 forks source link

Dollar quoted strings (PostgreSQL) cannot follow an operator (e.g. `=$$Hello$$`) #763

Closed proddata closed 4 months ago

proddata commented 5 months ago

Describe the bug Dollar quoted strings (e.g. PostgreSQL) are not properly detected, when there is an operator directly preceding the dollar quoted string (e.g var=$$text$$). While according to PostgreSQL docs ...

A dollar-quoted string that follows a keyword or identifier must be separated from it by whitespace; otherwise the dollar quoting delimiter would be taken as part of the preceding identifier.

... this does not hold true for operators. i.e. SET application_name=$$Hello$$; is valid.

This seems to relate to https://github.com/andialbrecht/sqlparse/blob/f101546dafa921edfea5b3107731504665b758ea/sqlparse/keywords.py#L33 and the negative lookbehind ensuring that there's no non-whitespace character before the starting $. This potentially should be changed to

(r'((?<![\w\"\$])\$(?:[_A-ZÀ-Ü]\w*)?\$)[\s\S]*?\1', tokens.Literal)

which would filter out any keywords or identifiers.

To Reproduce

sqlparse.split('''update test set a=$$test;test$$;''');
['update test set a=$$test;', 'test$$;']

Expected behavior

['update test set a=$$test;test$$;']

Versions (please complete the following information):

Additional context Add any other context about the problem here.