Open TimPossiblee opened 2 weeks ago
Description of the issue:
Loading multiple CSVs (Local to Snowflake) from a directory using single: true always produces an error because it tries to upload the same temporary file multiple times, while also deleting it.
single: true
Sling version (sling --version): 1.2.20
sling --version
Operating System (linux, mac, windows): linux
linux
mac
windows
Replication Configuration:
source: local target: snowflake env: SLING_STREAM_URL_COLUMN: true SAMPLE_SIZE: 0 defaults: mode: full-refresh single: true source_options: format: csv header: true # delimiter: '\t' columns: "*": string target_options: column_casing: target streams: "file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/*stock_vehicle_*.txt": object: CE_STG.PROG_STOCK_VEHICLE
-d
2024-11-08 07:49:10 DBG opened "file" connection (conn-file-4d5) 2024-11-08 07:49:10 INF Sling Replication [1 streams] | local -> snowflake 2024-11-08 07:49:10 INF [1 / 1] running stream file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/*stock_vehicle_*.txt 2024-11-08 07:49:10 DBG Sling version: 1.2.20 (linux amd64) 2024-11-08 07:49:10 DBG type is file-db 2024-11-08 07:49:10 DBG using: {"columns":[{"name":"*","type":"string"}],"mode":"full-refresh","transforms":null} 2024-11-08 07:49:10 DBG using source options: {"trim_space":false,"empty_as_null":true,"header":true,"fields_per_rec":-1,"compression":"auto","format":"csv","null_if":"NULL","datetime_format":"AUTO","skip_blank_lines":false,"max_decimals":-1} 2024-11-08 07:49:10 DBG using target options: {"datetime_format":"auto","file_max_rows":0,"file_max_bytes":0,"max_decimals":-1,"use_bulk":true,"add_new_columns":true,"adjust_column_type":false,"column_casing":"target"} 2024-11-08 07:49:10 DBG opened "snowflake" connection (conn-snowflake-QsU) 2024-11-08 07:49:10 INF connecting to target database (snowflake) 2024-11-08 07:49:10 INF reading from source file system (file) 2024-11-08 07:49:10 DBG opened "file" connection (conn-file-EuF) 2024-11-08 07:49:10 DBG reading single datastream from file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/*stock_vehicle_*.txt [format=csv] 2024-11-08 07:49:10 DBG merging csv readers of 6 files [concurrency=3] from file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/*stock_vehicle_*.txt 2024-11-08 07:49:10 DBG processing reader from file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/stock_vehicle_20241107043759.txt 2024-11-08 07:49:10 DBG processing reader from file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/CAI/stock_vehicle_20241107043759.txt 2024-11-08 07:49:10 DBG processing reader from file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/EFI/stock_vehicle_20241107043759.txt 2024-11-08 07:49:10 DBG processing reader from file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/MMI/stock_vehicle_20241107043759.txt 2024-11-08 07:49:10 DBG processing reader from file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/FAI/stock_vehicle_20241107043759.txt 2024-11-08 07:49:10 DBG processing reader from file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/PAI/stock_vehicle_20241107043759.txt 2024-11-08 07:49:10 DBG delimiter auto-detected: "\t" 2024-11-08 07:49:10 DBG casting column 'monthdayyear' as 'string' 2024-11-08 07:49:10 DBG casting column 'company' as 'string' 2024-11-08 07:49:10 DBG casting column 'company_name' as 'string' 2024-11-08 07:49:10 DBG casting column 'modelcode' as 'string' 2024-11-08 07:49:10 DBG casting column 'vehicletype' as 'string' 2024-11-08 07:49:10 DBG casting column 'vehiclegroup' as 'string' 2024-11-08 07:49:10 DBG casting column 'currency' as 'string' 2024-11-08 07:49:10 DBG casting column 'stockunits' as 'string' 2024-11-08 07:49:10 DBG casting column 'stockvalue' as 'string' 2024-11-08 07:49:10 DBG casting column 'openorders' as 'string' 2024-11-08 07:49:10 DBG casting column 'opencontract' as 'string' 2024-11-08 07:49:10 DBG casting column 'purchasedate' as 'string' 2024-11-08 07:49:10 DBG casting column 'demostock' as 'string' 2024-11-08 07:49:10 DBG casting column 'commissionnumber' as 'string' 2024-11-08 07:49:10 DBG casting column 'chassisnumber' as 'string' 2024-11-08 07:49:10 DBG casting column 'description' as 'string' 2024-11-08 07:49:10 DBG casting column 'brand' as 'string' 2024-11-08 07:49:10 DBG casting column 'fuel_type' as 'string' 2024-11-08 07:49:10 DBG casting column 'col' as 'string' 2024-11-08 07:49:10 INF writing to target database [mode: full-refresh] 2024-11-08 07:49:10 DBG drop table if exists "CE_STG"."PROG_STOCK_VEHICLE_TMP" 2024-11-08 07:49:10 DBG table "CE_STG"."PROG_STOCK_VEHICLE_TMP" dropped 2024-11-08 07:49:10 DBG create transient table "CE_STG"."PROG_STOCK_VEHICLE_TMP" ("MONTHDAYYEAR" varchar, "COMPANY" varchar, "COMPANY_NAME" varchar, "MODELCODE" varchar, "VEHICLETYPE" varchar, "VEHICLEGROUP" varchar, "CURRENCY" varchar, "STOCKUNITS" varchar, "STOCKVALUE" varchar, "OPENORDERS" varchar, "OPENCONTRACT" varchar, "PURCHASEDATE" varchar, "DEMOSTOCK" varchar, "COMMISSIONNUMBER" varchar, "CHASSISNUMBER" varchar, "DESCRIPTION" varchar, "BRAND" varchar, "FUEL_TYPE" varchar, "COL" varchar, "_SLING_LOADED_AT" timestamp_tz, "_SLING_STREAM_URL" varchar) 2024-11-08 07:49:11 INF streaming data 2024-11-08 07:49:11 DBG USE SCHEMA CE_STG 2024-11-08 07:49:11 DBG opened "file" connection (conn-file-jAJ) 2024-11-08 07:49:11 DBG writing to file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178 [fileRowLimit=0 fileBytesLimit=0 compression=zstd concurrency=7 useBufferedStream=false fileFormat=csv singleFile=true] 2024-11-08 07:49:11 DBG REMOVE @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 2024-11-08 07:49:11 DBG adding new column: custorderdate 2024-11-08 07:49:11 DBG alter table "CE_STG"."PROG_STOCK_VEHICLE_TMP" add column "CUSTORDERDATE" varchar 2024-11-08 07:49:12 DBG adding new column: dealerorderdate 2024-11-08 07:49:12 DBG alter table "CE_STG"."PROG_STOCK_VEHICLE_TMP" add column "DEALERORDERDATE" varchar 2024-11-08 07:49:12 DBG adding new column: dealerstock 2024-11-08 07:49:12 DBG alter table "CE_STG"."PROG_STOCK_VEHICLE_TMP" add column "DEALERSTOCK" varchar 2024-11-08 07:49:12 DBG PUT 'file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst' @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 PARALLEL=8 AUTO_COMPRESS=FALSE 2024-11-08 07:49:12 DBG PUT 'file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst' @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 PARALLEL=8 AUTO_COMPRESS=FALSE 2024-11-08 07:49:12 DBG REMOVE @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 2024-11-08 07:49:12 DBG REMOVE @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 2024-11-08 07:49:12 DBG drop table if exists "CE_STG"."PROG_STOCK_VEHICLE_TMP" 2024-11-08 07:49:13 DBG table "CE_STG"."PROG_STOCK_VEHICLE_TMP" dropped 2024-11-08 07:49:13 DBG closed "snowflake" connection (conn-snowflake-QsU) 2024-11-08 07:49:13 INF execution failed --- database_snowflake.go:792 func5 --- --- database_snowflake.go:780 func4 --- ~ Error copying to Snowflake Stage: sling_staging --- database_snowflake.go:931 StagePUT --- ~ could not PUT file file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst --- database.go:1166 Query --- ~ Error with StreamRows --- database.go:1179 QueryContext --- ~ SQL Error for: PUT 'file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst' @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 PARALLEL=8 AUTO_COMPRESS=FALSE --- database.go:814 StreamRowsContext --- ~ Error executing query --- transaction.go:115 QueryContext --- 264006: file does not exist: [/home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst] context canceled --- task_run.go:97 func1 --- --- task_run.go:403 runFileToDB --- --- task_run_write.go:331 WriteToDb --- ~ could not write to database --- database_snowflake.go:459 BulkImportFlow --- ~ could not insert into "CE_STG"."PROG_STOCK_VEHICLE_TMP". --- database_snowflake.go:887 CopyViaStage --- --- database_snowflake.go:792 func5 --- --- database_snowflake.go:780 func4 --- ~ Error copying to Snowflake Stage: sling_staging --- database_snowflake.go:931 StagePUT --- ~ could not PUT file file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst --- database.go:1166 Query --- ~ Error with StreamRows --- database.go:1179 QueryContext --- ~ SQL Error for: PUT 'file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst' @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 PARALLEL=8 AUTO_COMPRESS=FALSE --- database.go:814 StreamRowsContext --- ~ Error executing query --- transaction.go:115 QueryContext --- 264006: file does not exist: [/home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst] context canceled 2024-11-08 07:49:13 INF Sling Replication Completed in 2s | local -> snowflake | 0 Successes | 1 Failures fatal: --- proc.go:271 main --- --- sling_cli.go:458 main --- --- sling_cli.go:494 cliInit --- --- cli.go:286 CliProcess --- ~ failure running replication (see docs @ https://docs.slingdata.io/sling-cli) --- sling_run.go:202 processRun --- --------------------------- file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/progression/*stock_vehicle_*.txt --------------------------- --- proc.go:271 main --- --- sling_cli.go:458 main --- --- sling_cli.go:494 cliInit --- --- cli.go:286 CliProcess --- --- sling_run.go:200 processRun --- --- sling_run.go:494 runReplication --- --- sling_run.go:396 runTask --- --- task_run.go:138 Execute --- --- database_snowflake.go:792 func5 --- --- database_snowflake.go:780 func4 --- ~ Error copying to Snowflake Stage: sling_staging --- database_snowflake.go:931 StagePUT --- ~ could not PUT file file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst --- database.go:1166 Query --- ~ Error with StreamRows --- database.go:1179 QueryContext --- ~ SQL Error for: PUT 'file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst' @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 PARALLEL=8 AUTO_COMPRESS=FALSE --- database.go:814 StreamRowsContext --- ~ Error executing query --- transaction.go:115 QueryContext --- 264006: file does not exist: [/home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst] --- task_run.go:97 func1 --- --- task_run.go:403 runFileToDB --- --- task_run_write.go:331 WriteToDb --- ~ could not write to database --- database_snowflake.go:459 BulkImportFlow --- ~ could not insert into "CE_STG"."PROG_STOCK_VEHICLE_TMP". --- database_snowflake.go:887 CopyViaStage --- --- database_snowflake.go:792 func5 --- --- database_snowflake.go:780 func4 --- ~ Error copying to Snowflake Stage: sling_staging --- database_snowflake.go:931 StagePUT --- ~ could not PUT file file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst --- database.go:1166 Query --- ~ Error with StreamRows --- database.go:1179 QueryContext --- ~ SQL Error for: PUT 'file:///home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst' @CE_STG.sling_staging/"CE_STG"."PROG_STOCK_VEHICLE_TMP"/2024-11-08T074911.179 PARALLEL=8 AUTO_COMPRESS=FALSE --- database.go:814 StreamRowsContext --- ~ Error executing query --- transaction.go:115 QueryContext --- 264006: file does not exist: [/home/tim/gitbox/eds-warehouse.git/v2/orchestrate/sling/tmp/c07fbe89/snowflake/put/2024-11-08T074911.178.zst]
Issue Description
Description of the issue:
Loading multiple CSVs (Local to Snowflake) from a directory using
single: true
always produces an error because it tries to upload the same temporary file multiple times, while also deleting it.Sling version (
sling --version
): 1.2.20Operating System (
linux
,mac
,windows
): linuxReplication Configuration:
-d
):