babelfish-for-postgresql / babelfish_extensions

Babelfish for PostgreSQL provides the capability for PostgreSQL to work with applications written for Microsoft SQL Server. Babelfish understands the SQL Server wire-protocol and T-SQL, the Microsoft SQL Server query and procedural language, so you don’t have to switch database drivers or rewrite all of your application queries.
https://babelfishpg.org/
Apache License 2.0
275 stars 93 forks source link

[Bug]: Wrong overloaded variant of regexp_replace is called during restore #2944

Open staticlibs opened 4 weeks ago

staticlibs commented 4 weeks ago

What happened?

With the following T-SQL DB:

create database test1
use test1
create table tab1 (
    col1 varchar(8),
    col2 as replace(col1, ' ', '')
)
insert into tab1 (col1) values ('foo bar')

If we dump it with pg_dump:

pg_dump -U jdbc_user -d jdbc_testdb --bbf-database-name test1 > test1.sql

Then it will fail on restore the following way:

psql -U jdbc_user -d jdbc_testdb -f ./test1.sql
[...]
psql:test1.sql:206: ERROR:  invalid input syntax for type integer: "ig"
LINE 1: ..._replace(input_string, '***=' || pattern, replacement, 'ig')
                                                                  ^
QUERY:  regexp_replace(input_string, '***=' || pattern, replacement, 'ig')
CONTEXT:  PL/pgSQL function sys.replace(sys."varchar",sys."varchar",sys."varchar") line 8 at RETURN
COPY tab1, line 1: "foo bar"

This happens when the data is copied into the following table:

CREATE TABLE test1_dbo.tab1 (
    col1 sys."varchar"(8),
    col2 sys."varchar" GENERATED ALWAYS AS (sys.replace(col1, ' '::sys."varchar", ''::sys."varchar")) STORED
)
COPY test1_dbo.tab1 (col1) FROM stdin;
foo bar
\.

In sys.replace the call to regexp_replace for some reason is choosing the following overloaded variant:

regexp_replace(source, pattern, replacement, start, N)

instead of the expected one:

regexp_replace(source, pattern, replacement, flags)

So it tries to parse string ig as an integer and fails on type conversion.

As a workaround it is possible to force the correct overload by changing the regexp_replace call to the following:

return regexp_replace(input_string, '***=' || pattern, replacement, 'ig'::text);

But perhaps there exist more general solution to this.

Version

BABEL_4_X_DEV (Default)

Extension

babelfishpg_tsql (Default)

Which flavor of Linux are you using when you see the bug?

Fedora

Relevant log output

pg_strtoint32_safe(const char * s, Node * escontext) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/utils/adt/numutils.c:617)
int4in(FunctionCallInfo fcinfo) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/utils/adt/int.c:291)
InputFunctionCall(FmgrInfo * flinfo, char * str, Oid typioparam, int32 typmod) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/utils/fmgr/fmgr.c:1708)
OidInputFunctionCall(Oid functionId, char * str, Oid typioparam, int32 typmod) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/utils/fmgr/fmgr.c:1920)
stringTypeDatum(Type tp, char * string, int32 atttypmod) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_type.c:677)
coerce_type(ParseState * pstate, Node * node, Oid inputTypeId, Oid targetTypeId, int32 targetTypeMod, CoercionContext ccontext, CoercionForm cformat, int location) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_coerce.c:351)
make_fn_arguments(ParseState * pstate, List * fargs, Oid * actual_arg_types, Oid * declared_arg_types) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_func.c:1925)
ParseFuncOrColumn(ParseState * pstate, List * funcname, List * fargs, Node * last_srf, FuncCall * fn, _Bool proc_call, int location) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_func.c:686)
transformFuncCall(ParseState * pstate, FuncCall * fn) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_expr.c:1543)
transformExprRecurse(ParseState * pstate, Node * expr) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_expr.c:220)
transformExpr(ParseState * pstate, Node * expr, ParseExprKind exprKind) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_expr.c:121)
transformTargetEntry(ParseState * pstate, Node * node, Node * expr, ParseExprKind exprKind, char * colname, _Bool resjunk) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_target.c:102)
transformTargetList(ParseState * pstate, List * targetlist, ParseExprKind exprKind) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/parse_target.c:193)
transformSelectStmt(ParseState * pstate, SelectStmt * stmt) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/analyze.c:1473)
transformStmt(ParseState * pstate, Node * parseTree) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/analyze.c:412)
transformOptionalSelectInto(ParseState * pstate, Node * parseTree) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/analyze.c:347)
transformTopLevelStmt(ParseState * pstate, RawStmt * parseTree) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/analyze.c:297)
parse_analyze_withcb(RawStmt * parseTree, const char * sourceText, ParserSetupHook parserSetup, void * parserSetupArg, QueryEnvironment * queryEnv) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/parser/analyze.c:244)
pg_analyze_and_rewrite_withcb(RawStmt * parsetree, const char * query_string, ParserSetupHook parserSetup, void * parserSetupArg, QueryEnvironment * queryEnv) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/tcop/postgres.c:790)
_SPI_prepare_plan(const char * src, SPIPlanPtr plan) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/executor/spi.c:2295)
SPI_prepare_extended(const char * src, const SPIPrepareOptions * options) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/executor/spi.c:939)
plpgsql.so!exec_prepare_plan(PLpgSQL_execstate * estate, PLpgSQL_expr * expr, int cursorOptions) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:4214)
plpgsql.so!exec_eval_expr(PLpgSQL_execstate * estate, PLpgSQL_expr * expr, _Bool * isNull, Oid * rettype, int32 * rettypmod) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:5717)
plpgsql.so!exec_stmt_return(PLpgSQL_execstate * estate, PLpgSQL_stmt_return * stmt) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:3306)
plpgsql.so!exec_stmts(PLpgSQL_execstate * estate, List * stmts) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:2072)
plpgsql.so!exec_stmt_if(PLpgSQL_execstate * estate, PLpgSQL_stmt_if * stmt) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:2531)
plpgsql.so!exec_stmts(PLpgSQL_execstate * estate, List * stmts) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:2036)
plpgsql.so!exec_stmt_block(PLpgSQL_execstate * estate, PLpgSQL_stmt_block * block) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:1943)
plpgsql.so!exec_toplevel_block(PLpgSQL_execstate * estate, PLpgSQL_stmt_block * block) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:1634)
plpgsql.so!plpgsql_exec_function(PLpgSQL_function * func, FunctionCallInfo fcinfo, EState * simple_eval_estate, ResourceOwner simple_eval_resowner, ResourceOwner procedure_resowner, _Bool atomic) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_exec.c:623)
plpgsql.so!plpgsql_call_handler(FunctionCallInfo fcinfo) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/pl/plpgsql/src/pl_handler.c:277)
fmgr_security_definer(FunctionCallInfo fcinfo) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/utils/fmgr/fmgr.c:870)
ExecInterpExpr(ExprState * state, ExprContext * econtext, _Bool * isnull) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/executor/execExprInterp.c:758)
ExecInterpExprStillValid(ExprState * state, ExprContext * econtext, _Bool * isNull) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/executor/execExprInterp.c:1870)
ExecEvalExpr(ExprState * state, ExprContext * econtext, _Bool * isNull) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/include/executor/executor.h:344)
ExecComputeStoredGenerated(ResultRelInfo * resultRelInfo, EState * estate, TupleTableSlot * slot, CmdType cmdtype) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/executor/nodeModifyTable.c:515)
CopyFrom(CopyFromState cstate) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/commands/copyfrom.c:1166)
DoCopy(ParseState * pstate, const CopyStmt * stmt, int stmt_location, int stmt_len, uint64 * processed) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/commands/copy.c:318)
standard_ProcessUtility(PlannedStmt * pstmt, const char * queryString, _Bool readOnlyTree, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment * queryEnv, DestReceiver * dest, QueryCompletion * qc) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/tcop/utility.c:749)
babelfishpg_tds.so!call_next_ProcessUtility(PlannedStmt * pstmt, const char * queryString, _Bool readOnlyTree, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment * queryEnv, DestReceiver * dest, QueryCompletion * completionTag) (/home/alex/projects/postgres/dev/babelfish_extensions/contrib/babelfishpg_tds/src/backend/tds/tdsutils.c:756)
babelfishpg_tds.so!tdsutils_ProcessUtility(PlannedStmt * pstmt, const char * queryString, _Bool readOnlyTree, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment * queryEnv, DestReceiver * dest, QueryCompletion * completionTag) (/home/alex/projects/postgres/dev/babelfish_extensions/contrib/babelfishpg_tds/src/backend/tds/tdsutils.c:695)
pg_stat_statements.so!pgss_ProcessUtility(PlannedStmt * pstmt, const char * queryString, _Bool readOnlyTree, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment * queryEnv, DestReceiver * dest, QueryCompletion * qc) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/contrib/pg_stat_statements/pg_stat_statements.c:1141)
babelfishpg_tsql.so!call_prev_ProcessUtility(PlannedStmt * pstmt, const char * queryString, _Bool readOnlyTree, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment * queryEnv, DestReceiver * dest, QueryCompletion * qc) (/home/alex/projects/postgres/dev/babelfish_extensions/contrib/babelfishpg_tsql/src/pl_handler.c:4244)
babelfishpg_tsql.so!bbf_ProcessUtility(PlannedStmt * pstmt, const char * queryString, _Bool readOnlyTree, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment * queryEnv, DestReceiver * dest, QueryCompletion * qc) (/home/alex/projects/postgres/dev/babelfish_extensions/contrib/babelfishpg_tsql/src/pl_handler.c:4216)
ProcessUtility(PlannedStmt * pstmt, const char * queryString, _Bool readOnlyTree, ProcessUtilityContext context, ParamListInfo params, QueryEnvironment * queryEnv, DestReceiver * dest, QueryCompletion * qc) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/tcop/utility.c:529)
PortalRunUtility(Portal portal, PlannedStmt * pstmt, _Bool isTopLevel, _Bool setHoldSnapshot, DestReceiver * dest, QueryCompletion * qc) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/tcop/pquery.c:1158)
PortalRunMulti(Portal portal, _Bool isTopLevel, _Bool setHoldSnapshot, DestReceiver * dest, DestReceiver * altdest, QueryCompletion * qc) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/tcop/pquery.c:1315)
PortalRun(Portal portal, long count, _Bool isTopLevel, _Bool run_once, DestReceiver * dest, DestReceiver * altdest, QueryCompletion * qc) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/tcop/pquery.c:791)
exec_simple_query(const char * query_string) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/tcop/postgres.c:1283)
PostgresMain(const char * dbname, const char * username) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/tcop/postgres.c:4758)
libpq_mainfunc(Port * port) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/postmaster/postmaster.c:1542)
BackendRun(Port * port) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/postmaster/postmaster.c:4641)
BackendStartup(Port * port) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/postmaster/postmaster.c:4368)
ServerLoop() (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/postmaster/postmaster.c:1953)
PostmasterMain(int argc, char ** argv) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/postmaster/postmaster.c:1499)
main(int argc, char ** argv) (/home/alex/projects/postgres/dev/postgresql_modified_for_babelfish/src/backend/main/main.c:198)

Code of Conduct

tanscorpio7 commented 1 week ago

Babelfish gives preference to functions with argument of type INT over VARCHAR when trying to select best function when input values for arguments is of unknown type. tsql_has_func_args_higher_precedence
tsql_precedence_infos[]

Casting the last argument in regexp_replace() to TEXT explicitly in sys.replace() to fix the dump restore issue for sys.replace() seems like the right fix.

suprio-amzn commented 4 days ago

@staticlibs: It seems you have use pg_dump directly. Shouldn't you be using bbf_dump?

staticlibs commented 3 days ago

If I am reading the spec file correctly, bbf_dump is a pg_dump built from the postgresql_modified_for_babelfish tree (without additional patches). In the example above I've used this freshly built pg_dump modified for Babelfish.