HoloClean / holoclean

A Machine Learning System for Data Enrichment.
http://www.holoclean.io
Apache License 2.0
514 stars, 129 forks

Created separate column for init values (1 or more) and current value (singular value, old 'init_value') #30

Closed: richardwu closed this 5 years ago

richardwu commented 5 years ago

This will allow us to specify multiple initial values for a given cell that we could use in a special MultiInitFeaturizer or MultiOccurFeaturizer.

The old init_value is now current_value. All featurizers have been changed to reference current_value and renamed accordingly (e.g. InitFeaturizer is now CurrentFeaturizer).
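
To illustrate the split described above, here is a minimal sketch of a cell that carries a list of initial values alongside a single current value. The `Cell` record and the two feature functions are hypothetical names made up for this example; they are not HoloClean's actual schema or featurizer API, and they only show the kind of signal a MultiInitFeaturizer could produce.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Cell:
    # Hypothetical record for illustration only, not HoloClean's real table layout.
    attribute: str
    init_values: List[str]  # one or more initial values observed for the cell
    current_value: str      # the single current value (the old init_value)
    domain: List[str]       # candidate values considered for this cell


def multi_init_features(cell: Cell) -> List[float]:
    """One indicator per domain value: 1.0 if it is among the initial values."""
    init = set(cell.init_values)
    return [1.0 if value in init else 0.0 for value in cell.domain]


def current_features(cell: Cell) -> List[float]:
    """One indicator per domain value: 1.0 if it equals the current value."""
    return [1.0 if value == cell.current_value else 0.0 for value in cell.domain]


cell = Cell(
    attribute="city",
    init_values=["Chicago", "Chicgo"],
    current_value="Chicago",
    domain=["Chicago", "Chicgo", "Boston"],
)
print(multi_init_features(cell))
print(current_features(cell))
```

With a single initial value the two functions would return the same vector, which is why the old single-column layout did not need to distinguish them.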

Also fixed a bug in InitSimFeaturizer where it wasn't computing the similarity metrics correctly between the init_value and values in the domain.
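
For context, the intended behaviour is roughly one similarity score per domain value, each computed against the cell's initial value. The sketch below is an assumption-laden stand-in: `init_sim_features` is a hypothetical helper, and it uses the standard library's `difflib.SequenceMatcher` ratio rather than whatever metric InitSimFeaturizer actually uses.

```python
from difflib import SequenceMatcher
from typing import List


def init_sim_features(init_value: str, domain: List[str]) -> List[float]:
    """Score each domain value by its string similarity to the initial value.

    Hypothetical helper: difflib's ratio (a float in [0, 1]) stands in for
    the real similarity metric, which this sketch does not try to reproduce.
    """
    return [SequenceMatcher(None, init_value, value).ratio() for value in domain]


# A misspelled initial value should score closer to its intended spelling
# than to an unrelated domain value.
sims = init_sim_features("Chicgo", ["Chicago", "Boston"])
print(sims)
```

The point of the check is that each score pairs the initial value with exactly one domain value, so the output has the same length and order as the domain.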

richardwu commented 5 years ago

Closing this to combine with #32.