Segfault-Inc / Multicorn

Data Access Library
https://multicorn.org/
PostgreSQL License
701 stars 145 forks source link

sqlalchemy_fdw pyodbc error from PG9.5 to SQLServer 2008 #133

Closed jkort closed 5 years ago

jkort commented 8 years ago

Hello, trying to connect via fdw from a PG 9.5 database under Ubuntu 14.04 to a remote SQLServer 2008R2SP1 database with hundreds of tables and views over a VPN. I have no control over the SQLServer db; it belongs to an external party.

I was successful in getting FreeTDS to work (tested with queries under tsql), able to query my remote SQLServer database using pyodbc, and pyodbc with sqlalchemy under python 2.7.6. So I feel good about my odbc.ini, odbcinst.ini and freetds.conf files.

To be clear - I am inexperienced with postgres, python, freetds and all of these tools.

When I try to use sqlalchemy_fdw, that is where I'm getting stuck. I think my CREATE SERVER statement is ok, as I can get connection errors back if I deliberately get the password wrong. Here is what I have tried (with dummy strings for sensitive information):

drop schema XYZrpt cascade;
create schema if not exists XYZrpt;

DROP SERVER IF EXISTS XYZrpt_svr CASCADE;

create server XYZrpt_svr foreign data wrapper multicorn 
   options (
    wrapper 'multicorn.sqlalchemyfdw.SqlAlchemyFdw', 
    db_url 'mssql+pyodbc://XXXXXXXX:YYYYYYYY@MedidataVPN'
    );

import foreign schema "ABC.dbo" from SERVER XYZrpt_svr into XYZrpt;

where ABC is a database name on the SQLServer. The dbo schema has perhaps hundreds of tables, and this ran for over 12 minutes before I got this error back:

ERROR:  Error in python: TypeError
DETAIL:  %d format: a number is required, not str
********** Error **********

ERROR: Error in python: TypeError
SQL state: XX000
Detail: %d format: a number is required, not str

Next I tried to limit to just a single table that I know is in that ABC database, dbo.Audits, by using this IMPORT FOREIGN SCHEMA statement:

import foreign schema "ABC.dbo" LIMIT TO ("Audits") from SERVER XYZrpt_svr into XYZrpt;

and that returned this error:

ERROR:  Error in python: UnsupportedCompilationError
DETAIL:  Compiler <sqlalchemy.dialects.postgresql.base.PGTypeCompiler object at 0x7fab8e7e5350> can't render element of type <class 'sqlalchemy.dialects.mssql.base.TINYINT'>
********** Error **********

ERROR: Error in python: UnsupportedCompilationError
SQL state: XX000
Detail: Compiler <sqlalchemy.dialects.postgresql.base.PGTypeCompiler object at 0x7fab8e7e5350> can't render element of type <class 'sqlalchemy.dialects.mssql.base.TINYINT'>

there is a TinyInt field in that SQL Server table.

Any suggestions on what I should do differently would be greatly appreciated! If I can provide more information to help determine the cause of the error, please let me know.

Best Regards, -Jack

rastkok commented 8 years ago

Hello, I was able to reproduce second error (Compiler can't render element of type) when importing specific table that contains TinyInt field. Please test if this patch fixes this problem. Patch maps MSSQL TinyInt type to PostgreSQL SMALLINT type.

Best Regards, Rastko

jkort commented 8 years ago

Thank you Rastko! I carefully made the changes you indicated in sqlalchemyfdw.py, and tried that same LIMITed IMPORT FOREIGN SCHEMA again. I no longer get the TinyInt error, but I get this now:

ERROR:  type "nvarchar" does not exist
LINE 3: "Property" NVARCHAR(50),
                   ^
QUERY:  CREATE FOREIGN TABLE "XYZrpt"."Audits" (
"AuditID" INTEGER,
"Property" NVARCHAR(50),
"Value" NVARCHAR(2048),
"Readable" NVARCHAR(2200),
"AuditUserID" INTEGER,
"AuditTime" DATETIME,
"ObjectID" INTEGER,
"Guid" CHAR(36),
"ServerSyncDate" DATETIME,
"ObjectTypeID" SMALLINT,
"AuditSubCategoryID" INTEGER,
"DatabaseTime" DATETIME

 ) SERVER "XYZrpt_svr" 
 OPTIONS (
primary_key 'AuditID',
schema 'XYZRpt.dbo',
tablename 'Audits'
)
CONTEXT:  importing foreign table "Audits"
********** Error **********

ERROR: type "nvarchar" does not exist
SQL state: 42704
Context: importing foreign table "Audits"
rastkok commented 8 years ago

There is no NVARCHAR type in PostgreSQL, so you should change NVARCHAR to VARCHAR in your foreign table declaration.

Best Regards, Rastko

jkort commented 8 years ago

Forgive my ignorance - could I do something similar to the CONVERSION_MAP line you added to sqlalchemyfdw.py (mssql_dialect.TINYINT: SMALLINT) but substitute NVARCHAR and VARCHAR ?

Thanks for your help! -Jack

jkort commented 8 years ago

Adding on to Rastko's patch to sqlalchemyfdw.py, I included VARCHAR, TIMESTAMP here:

from sqlalchemy.dialects.postgresql.base import ( ARRAY, ischema_names, PGDialect, NUMERIC, SMALLINT, VARCHAR, TIMESTAMP)

then added the last 2 lines to his modified conversion map:

CONVERSION_MAP = { oracle_dialect.NUMBER: NUMERIC, mssql_dialect.TINYINT: SMALLINT, mssql_dialect.NVARCHAR: VARCHAR, mssql_dialect.DATETIME: TIMESTAMP }

and now this seems to work:

import foreign schema "ABC.dbo" LIMIT TO ("Audits") from SERVER XYZrpt_svr into XYZrpt;

although I have to quote my table name and column names.

Is what I've done proper? Not really comfortable making these changes!

Thanks! -Jack

jkort commented 8 years ago

As a follow-on to my last comment - after my changes, I tried to import the full foreign schema again (removed the LIMIT TO clause) and still, after about 10 minutes I get that error as before:

ERROR: Error in python: TypeError DETAIL: %d format: a number is required, not str \ Error **

ERROR: Error in python: TypeError SQL state: XX000 Detail: %d format: a number is required, not str

rastkok commented 8 years ago

Can you please check if there is more info in postgres log?

-Rastko

jkort commented 8 years ago

Hi Rastko, this is what I see in the postgres main log:

2016-02-26 13:21:40 PST ERROR: Error in python: TypeError 2016-02-26 13:21:40 PST DETAIL: Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/multicorn-1.3.2-py2.7-linux-x86_64.egg/multicorn/sqlalchemyfdw.py", line 503, in import_schema
    type_name=c.type.compile(dialect)))

  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/type_api.py", line 468, in compile
    return dialect.type_compiler.process(self)

  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 261, in process
    return type_._compiler_dispatch(self, **kw)

  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 81, in _compiler_dispatch
    return meth(self, **kw)

  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 2546, in visit_VARBINARY
    return "VARBINARY" + (type_.length and "(%d)" % type_.length or "")

TypeError: %d format: a number is required, not str

2016-02-26 13:21:40 PST STATEMENT: import foreign schema "XYZRpt.dbo" from SERVER XYZrpt_svr into XYZrpt;

rastkok commented 8 years ago

Which version of sqlalchemy are you using? Could you try importing table with varbinary(max) field and see if you get the same error?

-Rastko

rdunklau commented 8 years ago

Thank you to both of you for figuring the needed types conversions. I think I was a bit overoptimistic in the ability to match different types from different databases.

The way its done now is quite simplistic, maybe there should be an additional way to control the conversions.

varbinary should probably be mapped to a bytea object.

jkort commented 8 years ago

Hello Rastko and Ronan,

As with the patches above, I added a VARBINARY to BYTEA mapping:

from sqlalchemy.dialects.postgresql.base import ( ARRAY, ischema_names, PGDialect, NUMERIC, SMALLINT, VARCHAR, TIMESTAMP, BYTEA)

CONVERSION_MAP = { oracle_dialect.NUMBER: NUMERIC, mssql_dialect.TINYINT: SMALLINT, mssql_dialect.NVARCHAR: VARCHAR, mssql_dialect.DATETIME: TIMESTAMP, mssql_dialect.VARBINARY: BYTEA }

I restarted postgres and tried to IMPORT FOREIGN SCHEMA again without any LIMIT; after many minutes I got this error:

ERROR: Error in python: AttributeError DETAIL: 'BIT' object has no attribute 'varying' \ Error **

ERROR: Error in python: AttributeError SQL state: XX000 Detail: 'BIT' object has no attribute 'varying'

In the postgres main log I see:

import foreign schema "XYZRpt.dbo"  from SERVER XYZrpt_svr into XYZrpt;

2016-02-29 11:53:16 PST ERROR: Error in python: AttributeError 2016-02-29 11:53:16 PST DETAIL: Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/multicorn-1.3.2-py2.7-linux-x86_64.egg/multicorn/sqlalchemyfdw.py", line 504, in import_schema
    type_name=c.type.compile(dialect)))

  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/type_api.py", line 468, in compile
    return dialect.type_compiler.process(self)

  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/compiler.py", line 261, in process
    return type_._compiler_dispatch(self, **kw)

  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/sql/visitors.py", line 81, in _compiler_dispatch
    return meth(self, **kw)

  File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/dialects/postgresql/base.py", line 1818, in visit_BIT
    if type_.varying:

AttributeError: 'BIT' object has no attribute 'varying'

Rastko, to your question - I'm not sure how I check my version of sqlalchemy, but I will look into it. I just downloaded last week, so I would hope it is current.

Thanks! -Jack

rastkok commented 8 years ago

Hi Jack,

  1. MSSQLs BIT type should be mapped to PSQLs BOOLEAN type.
  2. To get sqlalchemy version on Ubuntu execute: dpkg -s python-sqlalchemy | grep Version

Regards, -Rastko