datafuselabs / databend

Data, Analytics & AI. Modern alternative to Snowflake. Cost-effective and simple for massive-scale analytics. https://databend.com
https://docs.databend.com
Other
7.31k stars 704 forks

chore: change load_file_metadata_expire_hours default from 24*7 to 12 hours #15514

Closed BohuTANG closed 2 weeks ago

BohuTANG commented 2 weeks ago

I hereby agree to the terms of the CLA available at: https://docs.databend.com/dev/policies/cla/

Summary

Tests

Type of change



what-the-diff[bot] commented 2 weeks ago

PR Summary

sundy-li commented 2 weeks ago

Docs need to be updated:

COPY INTO ensures idempotence by automatically tracking and preventing the reloading of files for a default period of 7 days. This can be customized using the load_file_metadata_expire_hours setting to control the expiration time for file metadata.
This parameter defaults to False, meaning COPY INTO will skip duplicate files when copying data. If True, duplicate files will not be skipped.
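
To make the quoted behaviour concrete, here is a minimal sketch of how expiry-based duplicate skipping can work. It is an illustration only, not Databend's implementation: the `LoadedFiles` type, the `should_load` method, the hour-based clock, and the name `force` for the boolean parameter the docs describe are all assumptions made for this example.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the record of files already loaded.
struct LoadedFiles {
    /// File path -> hour (since an arbitrary epoch) at which it was last loaded.
    loaded_at_hour: HashMap<String, u64>,
    /// Plays the role of `load_file_metadata_expire_hours`.
    expire_hours: u64,
}

impl LoadedFiles {
    fn new(expire_hours: u64) -> Self {
        Self { loaded_at_hour: HashMap::new(), expire_hours }
    }

    /// Returns true if the file should be loaded, and records the load.
    /// `force` (assumed name) disables duplicate skipping when true.
    fn should_load(&mut self, path: &str, now_hour: u64, force: bool) -> bool {
        // A file counts as a duplicate only while its metadata has not expired.
        let still_tracked = match self.loaded_at_hour.get(path) {
            Some(&loaded) => now_hour.saturating_sub(loaded) < self.expire_hours,
            None => false,
        };
        if still_tracked && !force {
            return false;
        }
        self.loaded_at_hour.insert(path.to_string(), now_hour);
        true
    }
}

fn main() {
    let mut files = LoadedFiles::new(24);
    assert!(files.should_load("data/a.csv", 0, false)); // first load
    assert!(!files.should_load("data/a.csv", 5, false)); // duplicate, skipped
    assert!(files.should_load("data/a.csv", 6, true)); // forced reload
    assert!(files.should_load("data/a.csv", 40, false)); // metadata expired
}
```

In this sketch a shorter expiry means less metadata to keep around, at the cost of a shorter window in which an accidental reload is caught.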
BohuTANG commented 2 weeks ago

> docs need to be updated
>
> COPY INTO ensures idempotence by automatically tracking and preventing the reloading of files for a default period of 7 days. This can be customized using the load_file_metadata_expire_hours setting to control the expiration time for file metadata.
> This parameter defaults to False meaning COPY INTO will skip duplicate files when copying data. If True, duplicate files will not be skipped.

~Will update after this PR is merged.~ Docs PR: https://github.com/datafuselabs/databend-docs/pull/788

TCeason commented 2 weeks ago

("load_file_metadata_expire_hours", DefaultSettingValue {
    value: UserSettingValue::UInt64(24),
    desc: "Sets the hours that the metadata of files you load data from with COPY INTO will expire in.",
    mode: SettingMode::Both,
    range: Some(SettingRange::Numeric(0..=u64::MAX)),
}),

The value is in hours, so using u64::MAX as the upper bound of the range may not be very reasonable.
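
A rough sketch of the concern, under stated assumptions: if the hour count is ever converted to a smaller unit such as seconds, values near u64::MAX overflow, and a bounded range rejects them up front. The constant `MAX_EXPIRE_HOURS` (one year here), the helper names, and the conversion to seconds are hypothetical and are not taken from the Databend code.

```rust
use std::ops::RangeInclusive;

const SECONDS_PER_HOUR: u64 = 3600;

/// Hypothetical upper bound of one year, chosen only for illustration.
const MAX_EXPIRE_HOURS: u64 = 24 * 365;

/// Converts hours to seconds, returning None when the result overflows u64.
fn expire_seconds(hours: u64) -> Option<u64> {
    hours.checked_mul(SECONDS_PER_HOUR)
}

/// Accepts the value only if it lies inside the allowed range.
fn validate(hours: u64, range: &RangeInclusive<u64>) -> Result<u64, String> {
    if range.contains(&hours) {
        Ok(hours)
    } else {
        Err(format!(
            "value {} is outside the allowed range {}..={}",
            hours,
            range.start(),
            range.end()
        ))
    }
}

fn main() {
    // A sensible value converts without trouble.
    assert_eq!(expire_seconds(24), Some(86_400));
    // u64::MAX hours cannot be expressed in seconds within a u64.
    assert_eq!(expire_seconds(u64::MAX), None);

    let bounded = 0..=MAX_EXPIRE_HOURS;
    assert!(validate(24, &bounded).is_ok());
    assert!(validate(u64::MAX, &bounded).is_err());
}
```

Whichever bound is chosen, the point is that every value the range accepts should still be meaningful once it is used as a duration.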