scattermatrix
convert categorical variables to binary numeric features (indicator/dummy variables)
repeated values can cause bias because they carry overstated weight; R: !duplicated() keeps only the first occurrence
missing values: remove the row, substitute a specific value, interpolate, forward/backward fill, impute
visualizing outliers: scatterplot matrix, pairs plot
treatment of outliers: censor, trim, interpolate, substitute
scaling of numeric variables: treat outliers before scaling, e.g. z-value scaling
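A minimal sketch of the steps above, in plain Python with only the standard library. The toy records, the helper names (drop_duplicates, forward_fill, one_hot, censor, z_scale) and the censoring bounds are all assumptions made for illustration; they do not come from any of the linked tools. Z-value scaling here means z = (x - mean) / sd, using the population standard deviation.

```python
from statistics import mean, pstdev

# Hypothetical toy records; None marks a missing value.
rows = [
    {"color": "red",  "size": 1.0},
    {"color": "blue", "size": None},
    {"color": "red",  "size": 1.0},    # exact duplicate of the first row
    {"color": "blue", "size": 3.0},
    {"color": "red",  "size": 100.0},  # outlier
]

def drop_duplicates(rows):
    """Keep the first occurrence of each row (similar in spirit to R's !duplicated())."""
    seen, out = set(), []
    for r in rows:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            out.append(r)
    return out

def forward_fill(values):
    """Replace each None with the most recent non-missing value."""
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    return out

def one_hot(values):
    """Turn a categorical column into 0/1 indicator columns, one per level."""
    levels = sorted(set(values))
    return {lvl: [1 if v == lvl else 0 for v in values] for lvl in levels}

def censor(values, lo, hi):
    """Censor (clip) values to the interval [lo, hi]."""
    return [min(max(v, lo), hi) for v in values]

def z_scale(values):
    """Scale to zero mean and unit (population) standard deviation."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

clean = drop_duplicates(rows)                    # repeated row removed
sizes = forward_fill([r["size"] for r in clean]) # missing value filled forward
sizes = censor(sizes, 0.0, 5.0)                  # bounds are an assumption for this toy data
dummies = one_hot([r["color"] for r in clean])   # indicator/dummy variables
scaled = z_scale(sizes)                          # outliers treated before scaling
```

Censoring comes before z_scale because a single extreme value inflates both the mean and the standard deviation, which compresses every other scaled value toward zero.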
http://datascience.codata.org/articles/10.5334/dsj-2015-002/
https://dzone.com/articles/how-to-rock-data-quality-checks-in-the-data-lake
http://bubbles.databrewery.org/
http://www.stiivi.com/about.html
http://www.bigdataeverywhere.com/files/denver/BDE_Data_Governance_KAMREDDY.pdf