datavane / datavines

Know your data better! Datavines is a next-generation data observability platform that supports metadata management and data quality.
https://datavane.github.io/datavines-website/
Apache License 2.0
dataobservability dataprofile dataquality datascience doris metadata spark

Datavines

EN doc CN doc

Data quality ensures the accuracy of data during integration and processing, and it is a core component of DataOps. DataVines is an easy-to-use data quality service platform that supports multiple metrics.

Architecture Design

DataVinesArchitecture

Install

Requires Maven 3.6.1 or later.

$ mvn clean package -Prelease -DskipTests
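
Because the build requires Maven 3.6.1 or later, it can help to check the installed version first. The following is a minimal sketch, not part of the project: the `version_ge` helper is a hypothetical name introduced here, and it assumes a `sort` that supports `-V` (version sort, as in GNU coreutils) and that the first line of `mvn -v` has the usual `Apache Maven <version> ...` form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, as in GNU coreutils.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

required="3.6.1"

# The first line of `mvn -v` normally looks like "Apache Maven 3.8.6 (...)";
# the third whitespace-separated field is taken as the version.
found=""
if command -v mvn >/dev/null 2>&1; then
  found="$(mvn -v 2>/dev/null | head -n 1 | awk '{print $3}')"
fi

if [ -n "$found" ] && version_ge "$found" "$required"; then
  echo "Maven $found satisfies the minimum of $required."
else
  echo "Maven $required or later is required (found: ${found:-none})."
fi
```

If the check passes, the `mvn clean package` command above can be run as-is.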

Features

Data Catalog

Data Catalog

Data Quality

Data Quality

Data Profile

Data Catalog

Plug-in Design

The platform is built on a plug-in design, and several of its modules can be extended with user-defined plug-ins.

Multiple Execute Modes

Job script

Easy Deployment & High Availability

Environmental Dependency

  1. Java runtime environment: JDK 8
  2. If the data volume is small, or you only want to verify functionality, you can use the JDBC engine
  3. If you want to run DataVines on Spark, make sure Spark is installed on your server
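
The dependencies listed above can be checked with a short script before starting. This is only a sketch under stated assumptions: `is_java8` is a hypothetical helper introduced here, it treats "jdk8" as meaning a version string that starts with `1.8`, and it takes the presence of `spark-submit` on the `PATH` as a sign that Spark is installed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when a version string such as "1.8.0_292" denotes Java 8.
is_java8() {
  case "$1" in
    1.8|1.8.*) return 0 ;;
    *) return 1 ;;
  esac
}

# `java -version` writes to stderr; the version is the quoted token on its first line.
java_version=""
if command -v java >/dev/null 2>&1; then
  java_version="$(java -version 2>&1 | head -n 1 | awk -F '"' '{print $2}')"
fi

if is_java8 "$java_version"; then
  echo "Java: $java_version (JDK 8)"
else
  echo "Java: expected JDK 8, found ${java_version:-none}"
fi

# Assumption: Spark counts as installed when spark-submit is on the PATH.
if command -v spark-submit >/dev/null 2>&1; then
  echo "Engine: Spark is available"
else
  echo "Engine: spark-submit not found; the JDBC engine suits small data volumes"
fi
```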

Quick Start

See the project documentation for more information.

Development

See the project documentation for more information.

Contribution

PRs Welcome

You can submit any ideas as pull requests or as GitHub issues.

If you're new to posting issues, please read How To Ask Questions The Smart Way (note that this guide does not provide support services for this project) and How to Report Bugs Effectively before posting. Well-written bug reports help us help you!

Thank you to all the people who already contributed to Datavines!

contrib graph

License

Datavines is licensed under the Apache License 2.0. It relies on several third-party components whose open source licenses are either Apache License 2.0 or compatible with it. In addition, Datavines directly references or modifies some code from Apache DolphinScheduler, SeaTunnel and Dubbo, all of which are licensed under the Apache License 2.0. Thanks to these projects for their contributions.

Social Media

wechat-qrcode

Contact Author

wechat-author-qrcode

Donation

wechat-donation-qrcode