dimitri / pgloader

Migrate to PostgreSQL in a single command!
http://pgloader.io
Other
5.45k stars 548 forks source link

Pgloader 3.6.2 - "on error resume next" functionality doesn't work for Pgloader. #1557

Open bhanase opened 9 months ago

bhanase commented 9 months ago

Hi Team, While migrating Postgres.v11–v14, we found the following query issue with the internal PGloader 3.3.2 tool. As a result, we upgraded to PGLoader 3.6.2.

Following that, we encountered issues when loading from CSV files using the PGloader, and the job immediately ceased executing as a result of the error (contains ""). We attempted to proceed with the procedure by using the on error resume next command, but PGLoader version 3.6.2 is not allowing this command to function as intended.

While we can process the entire CSV file after an error occurs in PGloader version 3.3.2 with Postgres-v11, this feature is not supported in PGloader version 3.6.2 with Postgres-v14.

column def.adsrc does not exist QUERY: select nspname, relname, c.oid, attname, t.oid::regtype as type, case when atttypmod > 0 then atttypmod - 4 else null end as typmod, attnotnull, case when atthasdef then def.adsrc end as default from pg_class c join pg_namespace n on n.oid = c.relnamespace left join pg_attribute a on c.oid = a.attrelid join pg_type t on t.oid = a.atttypid and attnum > 0 left join pg_attrdef def on a.attrelid = def.adrelid and a.attnum = def.adnum

 where nspname !~ '^pg_' and n.nspname <> 'information_schema'
       and relkind = 'r'
       and ((n.nspname = 'tdw_stg' and c.relname ~ '^src_archive$'))

order by nspname, relname, attnum 2023-10-11T02:48:46.147000Z FATAL Failed to prepare target PostgreSQL table. 2023-10-11T02:48:46.147000Z FATAL Database error 42703: column def.adsrc does not exist QUERY:

The code below is for the on error continue next in PGloader load.csv, but it does not support the below WITH instructions when loading the CSV with the sample field ("OBC ZONE5- AREAN "N" ", ""MUSAFFAH INDUSTRIAL AREA,M 24Ï¿½""), specifically (""). Instead, it stops while loading the data.

LOAD CSV FROM 'csvfile' WITH ENCODING encoding HAVING FIELDS ( tabledefn ) INTO loadurl TARGET TABLE schemaname.tablename WITH skip header = 1, fields terminated by ',', on error resume next, fields optionally enclosed by '"', batch rows = 200;

Sample Data 2,951855394,1786020,0,0,67,"09/17/2023 07:50:57","09/17/2023 08:50:57","10/16/2023 13:45:36",211,21,52.469799,-1.999604,13173,1 23284.15,0,""MUSAFFAH INDUSTRIAL AREA,M 24Ï¿½"","","B68 0LU","WEST MIDLANDS","OLDBURY","UK",1,0,0,13173,0,"09/17","0023355_WR11KZC","0","0",23 339,0,0,0,0,3000,25,24,10,10,425,299,30,6,0,0,"","UNITED KINGDOM",52,-2,"09/17/2023 07:50:57","09/17/2023 08:50:57",0,"","NCID" ,,,,0,"10/16/2023 14:17:11","GSL1-NAGM",,

Error while processing: ERROR non whitespace after quoted data #<CSV-READER LINE-IDX:18411 CHARACTER-LINE-IDX:148 CHARACTER -IDX:3822750 "" {100935F1E3}> M