HoloClean / holoclean

A Machine Learning System for Data Enrichment.
http://www.holoclean.io
Apache License 2.0

Confused active attributes returned if not running detect_errors before generate domain #91

Closed zaqthss closed 4 years ago

zaqthss commented 5 years ago

If hc.setup_domain() is called without first calling hc.detect_errors(detectors), the active_attributes and the dk cells used for domain generation are read directly from the Postgres database (the table named by AuxTables.dk_cells.name). This can fail with an error such as "no field of name XXX".

We should make it explicit that, when using HoloClean on a dataset, the user must either run error detection or load the dk cells from a specific relation.
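One way to make that requirement explicit is a guard that fails early with a clear message. The following is only a rough sketch: `Session`, `load_dk_cells`, `dk_cells_ready`, and `DomainSetupError` are hypothetical names invented for illustration and are not part of the HoloClean code base.

```python
class DomainSetupError(RuntimeError):
    """Raised when domain setup is attempted before dk cells exist for this session."""


class Session:
    """Hypothetical stand-in for a HoloClean session (not the real class)."""

    def __init__(self):
        # Tracks whether dk cells were produced or loaded in *this* session.
        self.dk_cells_ready = False

    def detect_errors(self, detectors):
        # Real error detection would write the dk cells table; this sketch
        # only records that the step ran.
        self.dk_cells_ready = True

    def load_dk_cells(self, relation):
        # Hypothetical alternative: load dk cells from a named relation.
        self.dk_cells_ready = True

    def setup_domain(self):
        # Refuse to silently read whatever dk cells table is in the database.
        if not self.dk_cells_ready:
            raise DomainSetupError(
                "Run detect_errors() or load dk cells from a relation "
                "before calling setup_domain()."
            )
        return "domain ready"
```

With a guard like this, calling `setup_domain()` on a fresh session raises `DomainSetupError` instead of failing later with a missing-field error.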

minafarid commented 5 years ago

An alternative is to try to read active_attributes and fall back to all attributes in the dataset if no dk_cells table is available.
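A minimal sketch of that fallback, assuming the dk cells have already been fetched as (tuple id, attribute) pairs, or as `None` when no dk_cells table exists. The function name `resolve_active_attributes` and its parameters are hypothetical and do not come from the HoloClean code base.

```python
def resolve_active_attributes(dk_cells, all_attributes):
    """Return the attributes to use for domain generation.

    dk_cells: iterable of (tuple_id, attribute) pairs, or None when no
              dk_cells table is available (assumed representation).
    all_attributes: every attribute name in the loaded dataset.
    """
    if dk_cells is None:
        # No dk_cells table: fall back to all attributes in the dataset.
        return sorted(all_attributes)
    # Otherwise the active attributes are those that have at least one dk cell.
    return sorted({attr for _, attr in dk_cells})


# Usage: the table is missing, so every attribute is treated as active.
print(resolve_active_attributes(None, ["zip", "city", "state"]))
# Usage: the table exists, so only flagged attributes are active.
print(resolve_active_attributes([(0, "city"), (1, "zip"), (2, "city")],
                                ["zip", "city", "state"]))
```

This sketch only covers the missing-table case. It does not handle a dk_cells table that exists but names attributes absent from the current dataset.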