Open pob8888 opened 10 months ago
Facing a similar issue. My set up is slightly different, but the symptom seems to be the same.
I'm running in a docker container, so I didn't do a fresh build locally. My Dockerfile is very basic:
FROM dimitri/pgloader:latest
I have a subset of tables I want to migrate from sqlserver (2016) -> postgres. The source and target DBs are hosted in docker containers as well.
This example only has a single table, as it reproduces the error consistently (my ms.load file can be seen in the log below). I'm setting a fetch limit to avoid hitting out of heap errors.
I run pgloader with -d and -v to see what information i could get:
# uname -a
Linux 01d1a0a03d1c 5.10.102.1-microsoft-standard-WSL2 #1 SMP Wed Mar 2 00:30:59 UTC 2022 x86_64 GNU/Linux
# pgloader --version
pgloader version "3.6.7~devel"
compiled with SBCL 2.1.1.debian
# pgloader -d -v ms.load
pgloader version 3.6.7~devel
compiled with SBCL 2.1.1.debian
sb-impl::*default-external-format* :UTF-8
tmpdir: #P"/tmp/pgloader/"
2024-01-24T23:30:27.000000Z NOTICE Starting pgloader, log system is ready.
2024-01-24T23:30:27.010000Z INFO Starting monitor
2024-01-24T23:30:27.010000Z LOG pgloader version "3.6.7~devel"
2024-01-24T23:30:27.020000Z INFO Parsed command:
load database
from mssql://<user>@<host>/<src_db>
into pgsql://<user>@<host>/<dst_db>
including only table names like 'address' in schema 'application'
WITH prefetch rows = 10000
;
2024-01-24T23:30:27.090000Z DEBUG CONNECTED TO #<PGLOADER.PGSQL:PGSQL-CONNECTION pgsql://<user>@<host>/<dst_db> {1007E429A3}>
2024-01-24T23:30:27.090000Z DEBUG SET client_encoding TO 'utf8'
2024-01-24T23:30:27.090000Z DEBUG SET application_name TO 'pgloader'
2024-01-24T23:30:27.100000Z LOG Migrating from #<MSSQL-CONNECTION mssql://<user>@<host>/<src_db> {1007E41EE3}>
2024-01-24T23:30:27.100000Z LOG Migrating into #<PGSQL-CONNECTION pgsql://<user>@<host>/<dst_db> {1007E429A3}>
Max connections reached, increase value of TDS_MAX_CONN
2024-01-24T23:30:27.120000Z SQL MSSQL: sending query: -- params: dbname
-- table-type-name
-- including
-- filter-list-to-where-clause including
-- excluding
-- filter-list-to-where-clause excluding
select c.TABLE_SCHEMA,
c.TABLE_NAME,
c.COLUMN_NAME,
c.DATA_TYPE,
CASE
WHEN c.COLUMN_DEFAULT LIKE '((%' AND c.COLUMN_DEFAULT LIKE '%))' THEN
CASE
WHEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4) = 'newid()' THEN 'GENERATE_UUID'
WHEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4) LIKE 'convert(%varchar%,getdate(),%)' THEN 'today'
WHEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4) = 'getdate()' THEN 'CURRENT_TIMESTAMP'
WHEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4) LIKE '''%''' THEN SUBSTRING(c.COLUMN_DEFAULT,4,len(c.COLUMN_DEFAULT)-6)
ELSE SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4)
END
WHEN c.COLUMN_DEFAULT LIKE '(%' AND c.COLUMN_DEFAULT LIKE '%)' THEN
CASE
WHEN SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2) = 'newid()' THEN 'GENERATE_UUID'
WHEN SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2) LIKE 'convert(%varchar%,getdate(),%)' THEN 'today'
WHEN SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2) = 'getdate()' THEN 'CURRENT_TIMESTAMP'
WHEN SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2) LIKE '''%''' THEN SUBSTRING(c.COLUMN_DEFAULT,3,len(c.COLUMN_DEFAULT)-4)
ELSE SUBSTRING(c.COLUMN_DEFAULT,2,len(c.COLUMN_DEFAULT)-2)
END
ELSE c.COLUMN_DEFAULT
END,
c.IS_NULLABLE,
COLUMNPROPERTY(object_id(c.TABLE_NAME), c.COLUMN_NAME, 'IsIdentity'),
c.CHARACTER_MAXIMUM_LENGTH,
c.NUMERIC_PRECISION,
c.NUMERIC_PRECISION_RADIX,
c.NUMERIC_SCALE,
c.DATETIME_PRECISION,
c.CHARACTER_SET_NAME,
c.COLLATION_NAME
from INFORMATION_SCHEMA.COLUMNS c
join INFORMATION_SCHEMA.TABLES t
on c.TABLE_SCHEMA = t.TABLE_SCHEMA
and c.TABLE_NAME = t.TABLE_NAME
where c.TABLE_CATALOG = 'axelerator_hannover'
and t.TABLE_TYPE = 'BASE TABLE'
and ((c.table_schema = 'application' and c.table_name LIKE 'address'))
order by c.table_schema, c.table_name, c.ordinal_position;
2024-01-24T23:30:27.150000Z SQL MSSQL: sending query: -- params: including
-- filter-list-to-where-clause including
-- excluding
-- filter-list-to-where-clause excluding
select schema_name(schema_id) as SchemaName,
o.name as TableName,
REPLACE(i.name, '.', '_') as IndexName,
co.[name] as ColumnName,
i.is_unique,
i.is_primary_key,
i.filter_definition
from sys.indexes i
join sys.objects o on i.object_id = o.object_id
join sys.index_columns ic on ic.object_id = i.object_id
and ic.index_id = i.index_id
join sys.columns co on co.object_id = i.object_id
and co.column_id = ic.column_id
where schema_name(schema_id) not in ('dto', 'sys')
and ((schema_name(schema_id) = 'application' and o.name LIKE 'address'))
order by SchemaName,
o.[name],
i.[name],
ic.is_included_column,
ic.key_ordinal;
2024-01-24T23:30:27.170000Z SQL MSSQL: sending query: -- params: dbname
-- including
-- filter-list-to-where-clause including
-- excluding
-- filter-list-to-where-clause excluding
SELECT
REPLACE(KCU1.CONSTRAINT_NAME, '.', '_') AS 'CONSTRAINT_NAME'
, KCU1.TABLE_SCHEMA AS 'TABLE_SCHEMA'
, KCU1.TABLE_NAME AS 'TABLE_NAME'
, KCU1.COLUMN_NAME AS 'COLUMN_NAME'
, KCU2.TABLE_SCHEMA AS 'UNIQUE_TABLE_SCHEMA'
, KCU2.TABLE_NAME AS 'UNIQUE_TABLE_NAME'
, KCU2.COLUMN_NAME AS 'UNIQUE_COLUMN_NAME'
, RC.UPDATE_RULE AS 'UPDATE_RULE'
, RC.DELETE_RULE AS 'DELETE_RULE'
FROM INFORMATION_SCHEMA.REFERENTIAL_CONSTRAINTS RC
JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1
ON KCU1.CONSTRAINT_CATALOG = RC.CONSTRAINT_CATALOG
AND KCU1.CONSTRAINT_SCHEMA = RC.CONSTRAINT_SCHEMA
AND KCU1.CONSTRAINT_NAME = RC.CONSTRAINT_NAME
JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU2
ON KCU2.CONSTRAINT_CATALOG = RC.UNIQUE_CONSTRAINT_CATALOG
AND KCU2.CONSTRAINT_SCHEMA = RC.UNIQUE_CONSTRAINT_SCHEMA
AND KCU2.CONSTRAINT_NAME = RC.UNIQUE_CONSTRAINT_NAME
WHERE KCU1.ORDINAL_POSITION = KCU2.ORDINAL_POSITION
AND KCU1.TABLE_CATALOG = 'axelerator_hannover'
AND KCU1.CONSTRAINT_CATALOG = 'axelerator_hannover'
AND KCU1.CONSTRAINT_SCHEMA NOT IN ('dto', 'sys')
AND KCU1.TABLE_SCHEMA NOT IN ('dto', 'sys')
AND KCU2.TABLE_SCHEMA NOT IN ('dto', 'sys')
and ((kcu1.table_schema = 'application' and kcu1.table_name LIKE 'address'))
ORDER BY KCU1.CONSTRAINT_NAME, KCU1.ORDINAL_POSITION;
2024-01-24T23:30:27.180000Z ERROR MSSQL ERROR: %dbsqlexec fail
2024-01-24T23:30:27.180000Z LOG You might need to review the FreeTDS protocol version in your freetds.conf file, see http://www.freetds.org/userguide/choosingtdsprotocol.htm
2024-01-24T23:30:27.180000Z LOG report summary reset
table name errors read imported bytes total time read write
----------------- --------- --------- --------- --------- -------------- --------- ---------
fetch meta data 0 0 0 0.000s
----------------- --------- --------- --------- --------- -------------- --------- ---------
----------------- --------- --------- --------- --------- -------------- --------- ---------
2024-01-24T23:30:27.190000Z INFO Stopping monitor
#
I'm not an expert on sqlserver, but I believe there is an issue in the last query using kcu1 rather than KCU1 as the table alias. Running a simple query (just via SQL Server Management Studio):
select *
from INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1
ORDER BY KCU1.CONSTRAINT_NAME
<returns rows>
select *
from INFORMATION_SCHEMA.KEY_COLUMN_USAGE KCU1
WHERE kcu1.table_name LIKE 'address'
Msg 4104, Level 16, State 1, Line 41
The multi-part identifier "kcu1.table_name" could not be bound.
Removing the including only table names like 'address' in schema 'application'
from the ms.load file removes this issue and my migration runs successfully.
[x] pgloader --version
[x] did you test a fresh compile from the source tree?
yes
[x] did you search for other similar issues?
yes - issue #1153 similar, but was not resolved/progressed
[x] how can I reproduce the bug?
I am expecting it to NOT load the table schema1.films_staging I am also expecting to see schema1.films_me, schema2.films, schema2.filmes_staging being copied
Instead, it is copying only schema1.films_staging - it is as though the exclude is inverted.
Ultimately, I am trying to achieve something like this config:
successfully copy all tables from schema1 and schema2, excluding any tables in schema1 or schema2 that match the string *_staging
Further explanation:
src database:
destination database before running pgloader:
run pgloader my.loader
destination database after running pgloader