HoloClean / holoclean

A Machine Learning System for Data Enrichment.
http://www.holoclean.io
Apache License 2.0
514 stars, 129 forks

Fusion: initialize init_value as majority and domain as values across sources #21

Closed: richardwu closed this 5 years ago

richardwu commented 5 years ago

Note: Please review only the last commit until #18 is merged.

This PR changes how cell_domain is populated for fusion datasets (datasets with more than one row for a given entity). For each entity-attribute pair, init_value is set to the majority value across sources, and the domain is built from the values reported by all sources.
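The idea can be illustrated with a minimal sketch. This is not the code from this PR: the function name `build_cell_domain`, the `(entity, attribute, value)` tuple layout, and the dictionary output are all hypothetical, chosen only to show majority voting and domain construction per entity-attribute pair. It also assumes that missing values are skipped and that ties go to the value seen first, neither of which is stated in the PR.

```python
from collections import Counter, defaultdict


def build_cell_domain(observations):
    """Hypothetical helper: observations is an iterable of
    (entity, attribute, value) tuples, one per source row."""
    # Group every value reported for the same entity-attribute pair.
    values_by_cell = defaultdict(list)
    for entity, attribute, value in observations:
        if value is not None:  # assumption: missing values are ignored
            values_by_cell[(entity, attribute)].append(value)

    cell_domain = {}
    for cell, values in values_by_cell.items():
        counts = Counter(values)
        # Majority value across sources; ties go to the value seen first.
        init_value, _ = counts.most_common(1)[0]
        cell_domain[cell] = {
            "init_value": init_value,
            # Domain: every distinct value reported by any source
            # (sorted only to make the output deterministic).
            "domain": sorted(counts),
        }
    return cell_domain


# Three sources describe entity e1, one source describes e2.
observations = [
    ("e1", "city", "Chicago"),
    ("e1", "city", "Chicago"),
    ("e1", "city", "Evanston"),
    ("e2", "city", "Boston"),
]
cell_domain = build_cell_domain(observations)
```

Here the cell for `("e1", "city")` starts at "Chicago" because two of the three sources agree on it, while its domain keeps both "Chicago" and "Evanston" as candidates.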