apache / doris

Apache Doris is an easy-to-use, high performance and unified analytics database.
https://doris.apache.org
Apache License 2.0

[Feature] Support hive meta cache TTL #17947

Open morningman opened 1 year ago

morningman commented 1 year ago

Description

Currently, if a user modifies files on HDFS directly rather than through Hive, Doris does not notice the changes and the user gets wrong data.

One approach is to refresh the catalog at a fixed rate, but this has to refresh the whole catalog and invalidates all caches.

So I suggest adding a TTL (time-to-live) config for the file cache, so that stale file info is invalidated automatically once it expires.

Use case

  1. Create a catalog with the new property:
    create catalog hive(
    ...,
    "file_meta_cache_ttl_second" = "60"
    );

The file cache entries of all tables in this catalog will then expire after 60 seconds. You can also set file_meta_cache_ttl_second to 0 to disable the file cache.
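
To make the intended semantics concrete, here is a minimal Python sketch of a TTL-based file meta cache. It is not Doris's implementation: the class name `TtlFileMetaCache`, the `loader` callback, and the injectable `clock` are hypothetical names chosen for illustration. It also assumes that the TTL is measured from the time an entry was loaded, which the proposal does not spell out. A TTL of 0 bypasses the cache, as described above.

```python
import time
from typing import Callable, Dict, Generic, Tuple, TypeVar

V = TypeVar("V")


class TtlFileMetaCache(Generic[V]):
    """Hypothetical per-catalog cache of file listings with a time-to-live."""

    def __init__(
        self,
        ttl_seconds: int,
        loader: Callable[[str], V],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        if ttl_seconds < 0:
            raise ValueError("ttl_seconds must be >= 0")
        self._ttl = ttl_seconds  # 0 disables caching entirely
        self._loader = loader    # stand-in for listing files under a path
        self._clock = clock      # injectable so expiry can be tested
        self._entries: Dict[str, Tuple[float, V]] = {}

    def get(self, path: str) -> V:
        # With TTL 0, always go to the underlying storage.
        if self._ttl == 0:
            return self._loader(path)
        now = self._clock()
        hit = self._entries.get(path)
        # Serve the cached value only while it is younger than the TTL
        # (assumption: age is counted from load time, not last access).
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = self._loader(path)
        self._entries[path] = (now, value)
        return value
```

With `ttl_seconds=60`, a file changed directly on HDFS is picked up at most 60 seconds after its listing was cached, without refreshing the whole catalog. Other tables' entries stay cached until their own TTL runs out.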

Related issues

No response

lexluo09 commented 1 year ago

I'm interested in this feature. Could you assign it to me? Thank you

dutyu commented 1 year ago

> I'm interested in this feature. Could you assign it to me? Thank you

Hello, is there any progress on this feature?

lexluo09 commented 1 year ago

Hi, I'm ready to submit it now.