slingdata-io / sling-cli

Sling is a CLI tool that extracts data from a source storage/database and loads it in a target storage/database.
https://docs.slingdata.io
GNU General Public License v3.0

File destination - have a skip option for same file name instead of replace/delete #282

Closed: dduong1603 closed this issue 1 month ago

dduong1603 commented 2 months ago

Right now, Sling deletes the file and replaces it with a new one. This changes the file's update time and can affect downstream workflows. Can we get a skip/ignore-existing option instead?

flarco commented 2 months ago

Interesting. So, basically, if the destination file/object exists, do nothing?

flarco commented 2 months ago

In your situation, is the stream incremental? Is the source a database or a file?

dduong1603 commented 2 months ago

The stream is not incremental, and the source is a file (we're trying to do a file-to-file transfer). We want to write that same file to multiple destinations. If something fails midway, we want to retry the process but skip the destinations that already have the file. We could probably hack something together in our process to skip the CLI call for those particular destinations, but thought it would be nice to have a flag in the CLI call or as part of the tgt-options 😄
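For illustration, the process-side workaround described above could look roughly like the following. This is only a sketch under assumptions: it uses local paths and plain file copies rather than Sling itself, and the function name copy_skip_existing is hypothetical, not part of any Sling API.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List


def copy_skip_existing(source: Path, destinations: List[Path]) -> Dict[str, List[Path]]:
    """Copy `source` to each destination, leaving already-present files untouched.

    Hypothetical helper for illustration; not Sling's implementation.
    """
    result: Dict[str, List[Path]] = {"copied": [], "skipped": []}
    for dest in destinations:
        if dest.exists():
            # The file is already there: do nothing, so its
            # modification time stays as it was.
            result["skipped"].append(dest)
            continue
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source, dest)
        result["copied"].append(dest)
    return result


if __name__ == "__main__":
    # Small demo: one destination already holds the file, the other does not.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        src = root / "data.csv"
        src.write_text("a,b\n1,2\n")
        existing = root / "dest_a" / "data.csv"
        existing.parent.mkdir()
        existing.write_text("already here\n")
        missing = root / "dest_b" / "data.csv"
        print(copy_skip_existing(src, [existing, missing]))
```

Re-running the same call after a partial failure only writes to the destinations that are still missing the file, which is the retry behavior asked for here.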

dduong1603 commented 2 months ago

I guess this also applies to multiple streams within a single destination/replication: since they all run within the same CLI call, we'd want to skip any file/stream whose destination name already exists.

flarco commented 2 months ago

Cool, will add an ignore_existing key in target-options.

dduong1603 commented 1 month ago

Any update on this, @flarco?

flarco commented 1 month ago

Should be completed this week for the upcoming release #289 (likely this weekend).

flarco commented 1 month ago

Done (https://github.com/slingdata-io/sling-cli/pull/289/commits/cc6b5d1a7586e9649bf036ba8245037a195d39f9). Feel free to compile the binary and test.