HoloClean / holoclean

A Machine Learning System for Data Enrichment.
http://www.holoclean.io
Apache License 2.0

Example that does not use postgres #95

Closed: m1sta closed this issue 5 years ago

m1sta commented 5 years ago

Hi,

Would it be possible for someone to write a simple example that uses HoloClean with a basic csv, parquet file, or sqlite database instead of postgres? A HoloClean example without the postgres dependency, and even better, one that initially just uses pytorch-cpu, would make HoloClean feel accessible to a much larger user group.

minafarid commented 5 years ago

Hello and thank you for your interest,

HoloClean accepts CSV files as input, but we rely on a DBMS (postgres in this case) to store and query the data and to perform various analytics that cannot be done on plain CSV or parquet files, which only store data. If you are not familiar with setting up postgres, I would suggest running it in a Docker container. Hopefully, this makes it easier for you to try HoloClean out.
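
To illustrate the distinction drawn above between a file that only stores data and a DBMS that can also query it, here is a minimal sketch. It is not HoloClean code and does not use HoloClean's API. It loads a toy CSV into an in-memory SQLite database (chosen only so the example is self-contained; HoloClean itself relies on postgres) and runs a grouped count over value pairs. The table name `cells`, the columns, the data, and the helper functions are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical toy data standing in for a CSV file on disk.
CSV_TEXT = """city,state
Chicago,IL
Chicago,IL
Chicago,IN
Madison,WI
"""


def load_csv(conn, text):
    """Load CSV rows into a table so they can be queried with SQL."""
    rows = list(csv.DictReader(io.StringIO(text)))
    conn.execute("CREATE TABLE cells (city TEXT, state TEXT)")
    conn.executemany("INSERT INTO cells VALUES (:city, :state)", rows)
    return len(rows)


def pair_counts(conn):
    """Count how often each (city, state) pair appears in the table."""
    query = """
        SELECT city, state, COUNT(*) AS n
        FROM cells
        GROUP BY city, state
        ORDER BY city, n DESC, state
    """
    return conn.execute(query).fetchall()


# SQLite is an assumption for illustration only, not a supported backend.
conn = sqlite3.connect(":memory:")
n_loaded = load_csv(conn, CSV_TEXT)
counts = pair_counts(conn)
for city, state, n in counts:
    print(city, state, n)
```

In this toy data the rare pair (Chicago, IN) stands out against the more frequent (Chicago, IL). Aggregates of this kind are easy to express once the data sits in a database, whereas a plain CSV file offers no query layer at all.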