datavane / datavines

Know your data better! Datavines is a next-gen data observability platform that supports metadata management and data quality.
https://datavane.github.io/datavines-website/
Apache License 2.0
430 stars 143 forks

[Bug] [datavines-connector] Hive data source fetches tables across all databases #350

Closed 954031894 closed 7 months ago

954031894 commented 7 months ago

Search before asking

What happened

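When adding a Hive data source through the Web UI and filling in one specific database, the fetch returns table information for all of Hive. I looked through the PRs for this problem and found a similar issue, but that fix covered sqlserver and pg only.

Hive Version: 2.3.7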

DataVines Version

dev

DataVines Config

Running Command

Error Exception

Engine Type

No response

Java Version

jdk1.8

Screenshots

This data source collects Hive's dim-layer database, but the information for all databases was pulled.

[Screenshot: iShot_2024-02-01_10 14 03]

Are you willing to submit PR?

xxzuo commented 7 months ago

In the code, the Hive connector does not override the getMetadataTables method:

    protected ResultSet getMetadataTables(DatabaseMetaData metaData, String catalog, String schema) throws SQLException {
        return metaData.getTables(catalog, schema, null, TABLE_TYPES);
    }

So the call is effectively:

    protected ResultSet getMetadataTables(DatabaseMetaData metaData, String catalog, String schema) throws SQLException {
        // catalog = "database", schema = null
        return metaData.getTables("database", null, null, TABLE_TYPES);
    }

For Hive, this is equivalent to not specifying a database at all, so the method needs to be overridden in the Hive connector.
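A minimal sketch of what such an override might look like. It assumes that the Hive JDBC driver treats a database as a JDBC schema and does not filter on the catalog argument, which would explain the behavior described in this issue. The class name `HiveMetadataSketch`, the `TABLE_TYPES` value, and the fallback from `schema` to `catalog` are all hypothetical and not taken from the DataVines code base.

```java
import java.sql.DatabaseMetaData;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical stand-in for the Hive connector's metadata class (name is an assumption).
class HiveMetadataSketch {

    // Assumed value; the real TABLE_TYPES constant lives in the connector's base class.
    protected static final String[] TABLE_TYPES = {"TABLE", "VIEW"};

    // Pass the database name as the schema pattern instead of the catalog,
    // so the lookup is scoped to one database under the assumption above.
    protected ResultSet getMetadataTables(DatabaseMetaData metaData, String catalog, String schema)
            throws SQLException {
        // In this issue the database name arrives in `catalog` and `schema` is null,
        // so fall back to `catalog` when no schema is given (an assumption).
        String database = (schema != null) ? schema : catalog;
        return metaData.getTables(null, database, null, TABLE_TYPES);
    }
}
```

Note that the second argument of `DatabaseMetaData.getTables` is a pattern, so a database name containing `_` or `%` may match more than one database unless it is escaped.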

954031894 commented 7 months ago

OK, got it. I tried it on my side and it works without problems.