darold / pgFormatter

A PostgreSQL SQL syntax beautifier that can work as a console program or as a CGI. On-line demo site at http://sqlformat.darold.net/
PostgreSQL License
1.66k stars 100 forks source link

% character of format specifiers in dollar quoted strings is treated as mod operator #340

Open EnderDobra opened 3 months ago

EnderDobra commented 3 months ago

Formatter separates the % signs in unquoted format specifiers by adding a space after it.
This is my sql code:

create or replace procedure public.copy_into_table_partition(
 --some parameters
)
language plpgsql
as $body$
declare
--some declaration
begin
    --code 
    --code 
    execute format(
        $i$ --dollar quote start
            insert into public."%2$s_%5$s"(%4$s)
            select %4$s from %3$s 
            where date_time >= %1$L and date_time < (timestamp %1$L + interval '1 month')
            order by date_time
        $i$/*dollar quote end*/, partition_day, _table_name, old_table_name, column_list_as_text, _table_suffix);
    GET DIAGNOSTICS num_copied = ROW_COUNT;
    raise notice 'Copied % rows to %', num_copied, format('public."%2$s_%1$s"', _table_suffix, _table_name);
end;
$body$;

This is the formatted code:

--#region public.copy_into_table_partition
create or replace procedure public.copy_into_table_partition(
 --some parameters
)
language plpgsql
as $body$
declare
--some declaration
begin
    --code 
    --code 
    execute format($i$ /*dollar quote start*/ insert into public. "%2$s_%5$s"(% 4$s)
        select
            % 4$s from % 3$s
            where
                date_time >= % 1$L
                and date_time <(timestamp % 1$L + interval '1 month')
            order by date_time $i$/*dollar quote end*/, partition_day, _table_name, old_table_name, column_list_as_text, _table_suffix);
    GET DIAGNOSTICS num_copied = ROW_COUNT;
    raise notice 'Copied % rows to %', num_copied, format('public."%2$s_%1$s"', _table_suffix, _table_name);
end;
$body$;

--#endregion public.copy_into_table_partition

After some debugging I've managed to make it stop by changing this regex pattern in Beautify.pm line 609:

sub tokenize_sql
{
my $re = qr{
(?:<>|<=>|>=|<=|=>|==|!=|:=|=|!|<<|>>|<|>|\|\||\||&&|&|\-|\+|\*(?!/)|/(?!\*)|\%|~|\^|\?) # operators and tests
}ismx;
}

with this

sub tokenize_sql
{
my $re = qr{
# (?:<>|<=>|>=|<=|=>|==|!=|:=|=|!|<<|>>|<|>|\|\||\||&&|&|\-|\+|\*(?!/)|/(?!\*)|\%|~|\^|\?) # operators and tests
(?:<>|<=>|>=|<=|=>|==|!=|:=|=|!|<<|>>|<|>|\|\||\||&&|&|\-|\+|\*(?!/)|/(?!\*)|\%(?!\d+\$[sL])|~|\^|\?) # operators and tests
}ismx;
}

I have literally zero experience with perl and very little experience with regex but initial fiddling shows that the modification is working. I'll see if it still works in the next few days as I work.

Just wanted to let you know. You can decide whether this is an issue or not for yourself. This is my first time posting a github issue so if there's anything that I've done wrong feel free to roast me.

edit: Just noticed that it also separates quoted strings with % character in them after dot character. I've removed the percent characters in these patterns and it stopped separating them:

"[^\"\s\(\)=<>!~\*&:\|\-\+\%\^\?\@\#\[\]\{\}\.,;']+"\.[^\"\s\(\)=<>!~\*&:\|\-\+\%\^\?\@\#\[\]\{\}\.,;']+ # fqdn identifier form "schema".table or "table".column
|
[^\"\s=<>!~\*&\(\):\|\-\+\%\^\?\@\#\[\]\{\}\.,;']+\."[^\"\s=<>!~\*&\(\):\|\-\+\%\^\?\@\#\[\]\{\}\.,;']+" # fqdn identifier form schema."table" or table."column"
|
"[^\"\s=<>!~\*&\(\):\|\-\+\%\^\?\@\#\[\]\{\}\.,;']+"\."[^\"\s=<>!~\*&\(\):\|\-\+\%\^\?\@\#\[\]\{\}\.,;']+" # fqdn identifier form "schema"."table" or "table"."column"
EnderDobra commented 3 months ago

Hi, just found this issue #257 which seems relative to my issue. For now, adding --separator parameter doesn't seem to make a difference.