tds-fdw / tds_fdw

A PostgreSQL foreign data wrapper to connect to TDS databases (Sybase and Microsoft SQL Server)
Other
377 stars 101 forks source link

Issue with getting the data from varchar column with special characters #337

Open bukem opened 1 year ago

bukem commented 1 year ago

Issue report

The following information is very important in order to help us to help you. Omission of the following details cause delays or could receive no attention at all.

Operating system

On recent GNU/Linux distributions, you can provide the content of the file /etc/os-release

NAME="Rocky Linux"
VERSION="9.2 (Blue Onyx)"
ID="rocky"
ID_LIKE="rhel centos fedora"
VERSION_ID="9.2"
PLATFORM_ID="platform:el9"
PRETTY_NAME="Rocky Linux 9.2 (Blue Onyx)"
ANSI_COLOR="0;32"
LOGO="fedora-logo-icon"
CPE_NAME="cpe:/o:rocky:rocky:9::baseos"
HOME_URL="https://rockylinux.org/"
BUG_REPORT_URL="https://bugs.rockylinux.org/"
SUPPORT_END="2032-05-31"
ROCKY_SUPPORT_PRODUCT="Rocky-Linux-9"
ROCKY_SUPPORT_PRODUCT_VERSION="9.2"
REDHAT_SUPPORT_PRODUCT="Rocky Linux"
REDHAT_SUPPORT_PRODUCT_VERSION="9.2"

Version of tds_fdw

From a psql session, paste the outputs of running \dx

If you built the package from Git sources, also paste the output of running git log --source -n 1 on your git clone from a console

  Name   | Version |   Schema   |                                    Description
---------+---------+------------+-----------------------------------------------------------------------------------
 tds_fdw | 2.0.3   | public     | Foreign data wrapper for querying a TDS database (Sybase or Microsoft SQL Server)

Version of PostgreSQL

From a psql session, paste the output of running SELECT version();

PostgreSQL 15.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.3.1 20221121 (Red Hat 11.3.1-4), 64-bit

Version of FreeTDS

How to get it will depend on your Operating System and how you installes FreeTDS

From a console:

# rpm -qa|grep freetds
freetds-libs-1.3.3-1.el9.x86_64
freetds-1.3.3-1.el9.x86_64

Logs

Please capture the logs when the error you are reporting is happening, as well as commands with their outputs if you are reporting a problem build or installing

_For problems using tdsfdw on PostgreSQL how to do it will depend on your system, but if your PostgreSQL is installed on GNU/Linux, you will want to use tail -f with the log of the PostgreSQL cluster

For MSSQL you will need to use the SQL Server Audit Log

Replace this with the commands and outputs

Sentences, data structures, data

This will depend on the exact problem you are having and data privacy restrictions

However the more data you provide, the more likely we will be able to help

As a bare minimum, you should provide

Table structure on SQL Server side:

SELECT COLUMN_NAME, DATA_TYPE, CHARACTER_MAXIMUM_LENGTH , IS_NULLABLE
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'DOCS'

COLUMN_NAME     DATA_TYPE       CHARACTER_MAXIMUM_LENGTH        IS_NULLABLE
fISN    int     NULL    NO
fBODY   varchar 4000    NO

Foreign table definition:

CREATE FOREIGN TABLE docs_tmp (
    fisn int OPTIONS (column_name 'fISN'),
    fbody varchar(4000) OPTIONS (column_name 'fBODY'))
SERVER mssql_test
OPTIONS(schema_name 'dbo', table_name 'DOCS');

And it works fine if we select one row:

> select fisn, fbody from docs_tmp where fisn in (1057646590);
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed database context to 'Test'., Server: test_server, Process: , Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 5703, Msg state: 1, Msg: Changed language setting to us_english., Server: test_server, Process: , Line: 1, Level: 0
NOTICE:  tds_fdw: Query executed correctly
NOTICE:  tds_fdw: Getting results
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed database context to 'Test'., Server: test_server, Process: , Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 5703, Msg state: 1, Msg: Changed language setting to us_english., Server: test_server, Process: , Line: 1, Level: 0
    fisn    | fbody
------------+-------
 1057646590 | \r   +
            |
(1 row)

and another row:

> select fisn, fbody from docs_tmp where fisn in (566385307);
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed database context to 'Test'., Server: test_server, Process: ,Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 5703, Msg state: 1, Msg: Changed language setting to us_english., Server: test_server, Process: , Line: 1, Level: 0
NOTICE:  tds_fdw: Query executed correctly
NOTICE:  tds_fdw: Getting results
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed database context to 'Test'., Server: test_server, Process: ,Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 5703, Msg state: 1, Msg: Changed language setting to us_english., Server: test_server, Process: , Line: 1, Level: 0
   fisn    |   fbody
-----------+-----------
 566385307 | \r       +
           | CHECK:0\r+
           |
(1 row)

But it fails if we want to get two rows:

> select fisn, fbody from docs_tmp where fisn in (1057646590, 566385307);
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed database context to 'Test'., Server: test_server, Process: , Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 5703, Msg state: 1, Msg: Changed language setting to us_english., Server: test_server, Process: , Line: 1, Level: 0
NOTICE:  tds_fdw: Query executed correctly
NOTICE:  tds_fdw: Getting results
ERROR:  DB-Library error: DB #: 2403, DB Msg: Some character(s) could not be converted into client's character set.  Unconverted bytes were changed to question marks ('?'), OS #: 0, OS Msg: Success, Level: 4

I suppose it is caused by the special characters in the fBODY column, \r in my case. And it looks like a bug.

Besides that, I tried to replace it with the STRING_ESCAPE function, but looks like it is not possible to use functions in foreign tables:

> DROP FOREIGN TABLE docs_tmp;
CREATE FOREIGN TABLE docs_tmp (
    fisn int OPTIONS (column_name 'fISN'),
    fbody varchar(4000) OPTIONS (column_name 'STRING_ESCAPE(''fBODY'', ''json'')'))
SERVER mssql_test
OPTIONS(schema_name 'dbo', table_name 'DOCS');
DROP FOREIGN TABLE
CREATE FOREIGN TABLE

> select fisn, fbody from docs_tmp where fisn in (1057646590, 566385307);
NOTICE:  DB-Library notice: Msg #: 5701, Msg state: 2, Msg: Changed database context to 'Test'., Server: test_server, Process: , Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 5703, Msg state: 1, Msg: Changed language setting to us_english., Server: test_server, Process: , Line: 1, Level: 0
NOTICE:  DB-Library notice: Msg #: 207, Msg state: 1, Msg: Invalid column name 'STRING_ESCAPE('fBODY', 'json')'., Server: test_server, Process: , Line: 1, Level: 16
ERROR:  DB-Library error: DB #: 20018, DB Msg: General SQL Server error: Check messages from the SQL Server, OS #: -1, OS Msg: , Level: 16

I also tried to use query, but looks like pushdown on the WHERE clause doesn't work in that case and query execution time was infinite, so I'm not able to use this option:

DROP FOREIGN TABLE docs_tmp;
CREATE FOREIGN TABLE docs_tmp (
    fisn int OPTIONS (column_name 'fISN'),
    fbody varchar(4000) OPTIONS (column_name 'fBODY'))
SERVER mssql_test
OPTIONS(query 'SELECT fISN, fBODY FROM dbo.DOCS');