awslabs / deequ

Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apache License 2.0
3.29k stars 538 forks

Are you interested in integrating deequ with DataSphere Studio? #184

Closed (wushengyeyouya closed 4 years ago)

wushengyeyouya commented 4 years ago

Deequ is a great project in the field of data quality, and I think integrating deequ with DataSphere Studio would be a good way to strengthen data development capabilities. What is DataSphere Studio? DataSphere Studio is a one-stop data application development and management portal open-sourced by WeBank. It covers the entire data application development process: data exchange, desensitization and cleaning, analysis and mining, quality inspection, visual display, scheduled execution, and data output. GitHub address: https://github.com/WeBankFinTech/DataSphereStudio. Are you interested?

sscdotopen commented 4 years ago

Hi, thank you for your interest. Our goal is to maintain deequ as a lean library with as few dependencies as possible (mostly Apache Spark). We would be happy to help with integrating deequ, but we would be reluctant to take an external dependency on a visualisation framework.