The premier open source Data Quality solution
GNU Lesser General Public License v3.0

DataCleaner

The premier Open Source Data Quality solution.

DataCleaner is a Data Quality toolkit that allows you to profile, correct, and enrich your data. People use it for ad-hoc analysis and recurring cleansing, as well as a Swiss Army knife in matching and Master Data Management solutions.

Where to go for end-user information?

Please visit the DataCleaner community website, https://datacleaner.github.io, for downloads, news, documentation, etc.

Visit our Gitter chat channel, https://gitter.im/datacleaner/community, to ask questions or join discussions.

GitHub Markdown pages and issues are intended for developers and technical topics only.

Module structure

The main application modules are:

Code style and formatting

In the root of the project you can find 'Formatter-[IDE].xml' files which enable you to import the code formatting rules of the project into your IDE.

Continuous Integration

There's a public build of DataCleaner that can be found on Travis CI:

https://travis-ci.org/datacleaner/DataCleaner

License

Licensed under the GNU Lesser General Public License; see http://www.gnu.org/licenses/lgpl.txt