boxuancui / DataExplorer

Automate Data Exploration and Treatment
http://boxuancui.github.io/DataExplorer/

changing plot_missing threshold classification #98

Closed prvst closed 5 years ago

prvst commented 5 years ago

Is it possible to change the plot_missing threshold classification? I work with data sets in which 2% missing is already bad, but when I plot them using plot_missing, it labels them as good. The plots are not for me, so this might confuse some people.

boxuancui commented 5 years ago

As a temporary solution, you can hack the profile_missing and plot_missing functions by tweaking the threshold.
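
For readers who would rather not edit the package source, a similar classification can be approximated outside the package. The sketch below uses base R only and is not the package's internal code: the band names, the 2% cut-off, and the `missing_profile` data frame are assumptions chosen to match the use case described above.

```r
# Share of missing values per column, computed with base R only.
pct_missing <- colMeans(is.na(airquality))

# Hypothetical bands: anything above 2% missing counts as "Bad".
# cut() uses right-closed intervals by default, so exactly 2% is still "Good".
band <- cut(
  pct_missing,
  breaks = c(-Inf, 0.02, 1),
  labels = c("Good", "Bad")
)

# Assemble a small profile table (the name `missing_profile` is illustrative).
missing_profile <- data.frame(
  feature = names(pct_missing),
  pct_missing = unname(pct_missing),
  band = band
)

print(missing_profile)
```

With the built-in airquality data, only Ozone and Solar.R contain missing values, so those are the two columns that land in the "Bad" band under this cut-off.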

prvst commented 5 years ago

Thanks. Nice job with this package, btw.

boxuancui commented 5 years ago

@prvst I still plan to put something in place for end users to customize the threshold. However, I do not have a good idea at this moment. I am re-opening, so that it doesn't disappear off the radar.

prvst commented 5 years ago

cool, thanks for the quick reply

boxuancui commented 5 years ago

@prvst With the latest develop branch, you can customize the bands through the group argument.

Example:

plot_missing(airquality, group = list("B1" = 0, "B2" = 0.06, "B3" = 1))
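
Applied to the 2% case from the original question, the call might look like the sketch below. It assumes an installed DataExplorer version that includes the group argument, and that each value in the list is the upper bound of its band, as the example above suggests. The band names and the `bands` variable are illustrative, not defaults of the package.

```r
# Hypothetical bands for a data set where more than 2% missing is already bad.
# Assumption: each value is the upper bound (as a fraction) of its band.
bands <- list("Good" = 0.02, "Bad" = 1)

# Only plot when DataExplorer is available, so the sketch fails gracefully.
if (requireNamespace("DataExplorer", quietly = TRUE)) {
  print(DataExplorer::plot_missing(airquality, group = bands))
} else {
  message("DataExplorer is not installed; skipping the plot.")
}
```

Keeping the cut-offs in a named list makes it easy to reuse the same bands across several data sets, so that every plot shared with others uses the same classification.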