When integrating sling into our orchestration, there is currently a risk of race conditions and data corruption, because the directory name for the temporary files contains only the process time.
Multiple streams or invocations of sling can therefore end up with the same directory name and all write into the same file.
Adding a random string to each directory name might be one way to make sure everything stays separate.
Here is an example of the same file being uploaded to Snowflake into three different tables because all three streams had the same directory name:
PUT 'file:///tmp/snowflake/put/2024-10-14T070736.758/part.01.0001.csv.zst' @CE_STG.sling_staging/"CE_STG"."Z_MIK_INVOICES_VEHICLES_HR_TMP"/2024-10-14T070736.758 PARALLEL=8 AUTO_COMPRESS=FALSE
PUT 'file:///tmp/snowflake/put/2024-10-14T070736.758/part.01.0001.csv.zst' @CE_STG.sling_staging/"CE_STG"."Z_MIK_VEHICLE_REGISTERED_RS_TMP"/2024-10-14T070736.759 PARALLEL=8 AUTO_COMPRESS=FALSE
PUT 'file:///tmp/snowflake/put/2024-10-14T070736.758/part.01.0001.csv.zst' @CE_STG.sling_staging/"CE_STG"."Z_MIK_VEHICLE_REGISTERED_ACA_TMP"/2024-10-14T070736.758 PARALLEL=8 AUTO_COMPRESS=FALSE
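To illustrate the random-string idea, here is a minimal sketch in Python. It is not sling's actual code: the function name `make_stream_temp_dir` and the `<timestamp>_<random hex>` layout are assumptions made for this example. It keeps the millisecond timestamp, appends a random suffix, and relies on directory creation failing if the name is already taken, so two streams starting in the same millisecond cannot share a directory.

```python
import secrets
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def make_stream_temp_dir(base: Path) -> Path:
    """Create a unique per-stream directory named <timestamp>_<random hex>.

    Hypothetical helper for illustration only; not part of sling.
    """
    base.mkdir(parents=True, exist_ok=True)
    now = datetime.now(timezone.utc)
    # Millisecond-resolution timestamp, similar in shape to the paths above.
    stamp = now.strftime("%Y-%m-%dT%H%M%S.") + f"{now.microsecond // 1000:03d}"
    while True:
        candidate = base / f"{stamp}_{secrets.token_hex(4)}"
        try:
            # exist_ok=False: creation fails if another stream already
            # claimed this exact name, so a directory is never shared.
            candidate.mkdir(exist_ok=False)
            return candidate
        except FileExistsError:
            # Extremely unlikely collision: draw a new random suffix.
            continue


if __name__ == "__main__":
    # Demo in a throwaway base directory instead of /tmp/snowflake/put.
    with tempfile.TemporaryDirectory() as tmp:
        base_dir = Path(tmp) / "snowflake" / "put"
        for _ in range(3):
            print(make_stream_temp_dir(base_dir))
```

Even when all three calls land in the same millisecond, each stream gets its own directory, so a `part.01.0001.csv.zst` written by one stream cannot be picked up by another stream's PUT.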