HoloClean / holoclean

A Machine Learning System for Data Enrichment.
http://www.holoclean.io
Apache License 2.0

pos_values df in memory #102

Open fatangare opened 5 years ago

fatangare commented 5 years ago

Hi,

pos_values is used in the get_infer_dataframes() method in repair.py, but that method reads the values with a SQL query rather than from the DF. Since pos_values is large and the DF is never read after it is created, we can drop the DF and free the memory it occupies.
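To illustrate the idea, here is a minimal sketch and not the actual HoloClean code. It assumes a hypothetical `RepairSketch` class, uses `sqlite3` as a stand-in for the real database, and uses a plain list of tuples as a stand-in for the DF. The table schema (`vid`, `val`), the method names `store_pos_values` and `get_infer_rows`, and the `keep_in_memory` flag are all made up for the example.

```python
import sqlite3


class RepairSketch:
    """Hypothetical stand-in for the repair step; not HoloClean's actual class."""

    def __init__(self, conn):
        self.conn = conn
        self.pos_values = None  # in-memory copy (a DF in the real code)

    def store_pos_values(self, rows, keep_in_memory=False):
        # Persist the candidate values to the database (assumed schema).
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS pos_values (vid INTEGER, val TEXT)"
        )
        self.conn.executemany("INSERT INTO pos_values VALUES (?, ?)", rows)
        self.conn.commit()
        # Keep the in-memory copy only if a caller explicitly asks for it.
        self.pos_values = rows if keep_in_memory else None

    def get_infer_rows(self):
        # Reads go through SQL, so the in-memory copy is never consulted.
        cur = self.conn.execute(
            "SELECT vid, val FROM pos_values ORDER BY vid, val"
        )
        return cur.fetchall()


conn = sqlite3.connect(":memory:")
repair = RepairSketch(conn)
repair.store_pos_values([(1, "a"), (1, "b"), (2, "c")])
rows = repair.get_infer_rows()
```

Dropping the attribute only frees the memory if nothing else still holds a reference to the same object, so the real patch would need to check for other users of the DF.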

I can add a patch and send a pull request, if that is fine with you.