carloscinelli / benford.analysis

Tools that make it easier to use Benford’s law for data validation and forensic analytics.
Improve default plot order and legend placement #32

Open carloscinelli opened 5 years ago

carloscinelli commented 5 years ago

It might not be clear to users what the chi-squared difference refers to. Maybe put both plots next to each other, and improve the description of the plot.

Also, think about a better default legend placement.

carloscinelli commented 5 years ago

@rafaelslins I think the changes look good!

Some thoughts:

carloscinelli commented 5 years ago

@rafaelslins some problems in the current implementation

carloscinelli commented 5 years ago

I'm inclined to think this looks cleaner for the bounds

image

Versus the current version

image

carloscinelli commented 5 years ago

For the legend, we could aim for something like this: https://stackoverflow.com/questions/10389967/common-legend-for-multiple-plots-in-r
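One way the linked approach could look in base graphics is sketched below. It assumes a hypothetical helper, `plot_with_common_legend()`, which is not part of benford.analysis; the panel titles and legend labels are also placeholders. The idea is to let `layout()` reserve a short strip under the two panels and draw a single legend there. The example data are the first digits of the powers of 2, which are computed in the snippet rather than taken from the package.

```r
# Expected first-digit frequencies under Benford's law: log10(1 + 1/d)
benford_expected <- function(digits = 1:9) log10(1 + 1 / digits)

# Hypothetical helper (not part of benford.analysis): two panels that
# share one legend drawn in a strip below them.
plot_with_common_legend <- function(observed) {
  expected <- benford_expected()

  # Panels 1 and 2 sit side by side; panel 3 spans the bottom row.
  layout(matrix(c(1, 2, 3, 3), nrow = 2, byrow = TRUE), heights = c(4, 1))
  on.exit(layout(1))  # give the device back as a single panel

  # Panel 1: observed frequencies as bars, expected frequencies as a line
  mids <- barplot(observed, names.arg = 1:9, col = "lightblue",
                  ylim = c(0, max(observed, expected) * 1.1),
                  main = "Digits distribution",
                  xlab = "Digits", ylab = "Relative frequency")
  lines(mids, expected, col = "red", lwd = 2)

  # Panel 2: observed minus expected, with a reference line at zero
  diffs <- observed - expected
  barplot(diffs, names.arg = 1:9, col = "lightblue",
          main = "Observed minus expected", xlab = "Digits")
  abline(h = 0, col = "red", lwd = 2)

  # Panel 3: an empty plot that only holds the shared legend
  old_mar <- par(mar = c(0, 0, 0, 0))
  plot.new()
  legend("center", legend = c("Data", "Benford"),
         col = c("lightblue", "red"), pch = c(15, NA), lty = c(NA, 1),
         lwd = 2, horiz = TRUE, bty = "n")
  par(old_mar)

  invisible(diffs)
}

# Example data: first digits of 2^1, ..., 2^200
x <- 2^(1:200)
first_digit <- floor(x / 10^floor(log10(x)))
observed <- as.numeric(table(factor(first_digit, levels = 1:9))) / length(x)

diffs <- plot_with_common_legend(observed)
```

Because this sketch relies on `layout()`, it takes over the whole device and would not combine with a grid set by the caller through `par(mfrow)`.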

carloscinelli commented 5 years ago

@rafaelslins Rafael, some bugs and other problems were introduced in this PR. I'm going to list some of them.

Other comments:

Let's do these changes in a branch, and only merge when things are well tested and working correctly.

carloscinelli commented 5 years ago

The argument multiple = F is still not working.

Example:

    library(benford.analysis)
    data("census.2009")
    cs <- benford(census.2009$pop.2009[census.2009$pop.2009 > 10])
    plot(cs, select = "digits", multiple = F)

rafaelslins commented 5 years ago

I think part of the problem with multiple = F is due to the incompatibility between the layout() function (mainly used to place the legends) and the par() function.

?layout: "These functions are totally incompatible with the other mechanisms for arranging plots on a device: par(mfrow), par(mfcol) and split.screen."

rafaelslins commented 5 years ago

I'm thinking of a solution using just the par() function.

rafaelslins commented 5 years ago

I'm trying to make this work:

    par(mfrow = c(2, 1))
    bfd.cp <- benford(corporate.payment$Amount)
    plot(bfd.cp, select = "digits", multiple = F)
    plot(bfd.cp, select = "chi squared", multiple = F)

rafaelslins commented 5 years ago

I have made many failed attempts to get this to work:

    par(mfrow = c(2, 1))
    plot(bfd.cp, select = "digits", multiple = F)            # plots
    plot(bfd.cp, select = "rootogram digits", multiple = F)  # plots

I implemented a simple (temporary) solution that returns the desired result:

plot(bfd.cp, select=c("digits", "rootogram digits"), multiple=T, mfrow = c(2,1))

carloscinelli commented 5 years ago

@rafaelslins you can draw the legend of each single plot inside the plot itself, so this should not be a problem.

But for now, focus on having all the individual plot functions implemented and working correctly as autonomous functions that are easy to use and customize on their own, without resorting to the generic plot.
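A minimal sketch of that idea, assuming a hypothetical standalone function `plot_digits_sketch()` (not the package's actual implementation; the argument names and legend labels are placeholders). The function draws its legend inside the plot region and never calls `layout()` or changes `par()`, so the caller stays free to arrange several plots with `par(mfrow)`.

```r
# Expected first-digit frequencies under Benford's law: log10(1 + 1/d)
benford_expected <- function(digits = 1:9) log10(1 + 1 / digits)

# Hypothetical standalone plot function (not part of benford.analysis).
# It leaves the device layout alone and puts the legend inside the plot.
plot_digits_sketch <- function(observed, main = "Digits distribution",
                               legend_pos = "topright", ...) {
  expected <- benford_expected()
  mids <- barplot(observed, names.arg = 1:9, col = "lightblue",
                  ylim = c(0, max(observed, expected) * 1.1),
                  main = main, xlab = "Digits",
                  ylab = "Relative frequency", ...)
  lines(mids, expected, col = "red", lwd = 2)
  legend(legend_pos, legend = c("Data", "Benford"),
         col = c("lightblue", "red"), pch = c(15, NA), lty = c(NA, 1),
         lwd = 2, bty = "n")
  invisible(mids)
}

# Example data: first digits of 2^1, ..., 2^200
x <- 2^(1:200)
first_digit <- floor(x / 10^floor(log10(x)))
observed <- as.numeric(table(factor(first_digit, levels = 1:9))) / length(x)

# The caller controls the arrangement; the function does not interfere.
old_par <- par(mfrow = c(2, 1))
plot_digits_sketch(observed, main = "First panel")
plot_digits_sketch(observed, main = "Second panel", legend_pos = "top")
grid_after <- par("mfrow")  # still the 2 x 1 grid set by the caller
par(old_par)
```

Exposing `legend_pos` and passing `...` through to `barplot()` is one way to keep such a function easy to customize without going through the generic plot.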