mwelz / GenericML

R implementation of Generic Machine Learning Inference (Chernozhukov, Demirer, Duflo and Fernández-Val, 2020).
GNU General Public License v3.0
64 stars 14 forks source link

Clear separation of computation and printing/plotting #21

Closed aalfons closed 2 years ago

aalfons commented 2 years ago

Functions get_BLP(), get_GATES(), and get_CLAN() have an argument plot, which produces a plot and by default is TRUE.

In R, it is frowned upon to compute results and produce a plot at the same time. Computing results should always be separated from printing/plotting the results. We will not be able to publish in JSS with the current implementation, and likely there will be comments at this at useR as well. So ideally, we should fix this before useR.

There are two ways to fix this, which I'll explain in separate replies to this issue.

aalfons commented 2 years ago

Option 1

Remove argument plot from functions get_BLP(), get_GATES(), and get_CLAN(), and write separate functions plot_BLP(), plot_GATES(), and plot_CLAN(). Those should be generic functions. For example, plot_BLP() should have one method for objects of class "GenericML", and the default method should expect a matrix as returned by get_BLP().

This is the quicker and easier fix.

aalfons commented 2 years ago

Option 2

I'm going to use get_BLP() as an example, the other functions can be fixed accordingly.

Instead of returning a matrix, get_BLP() should return an object of class "BLP". This object consists of the following components:

  1. Estimate: vector containing the point estimates.
  2. Confidence interval: matrix with columns "lower" and "upper" for the lower and upper confidence bounds, respectively.
  3. p-Value: vector containing the p-values.

If argument plot = TRUE, an additional component plot is added, that contains the "ggplot" object of the plot.

The print() method for objects of class "BLP" then does the following:

  1. Combine the estimates, confidence intervals, and p-values into a matrix, and prints them using function printCoefmat() such that the significance stars are displayed as well.
  2. If a plot is stored in the object, it prints the plot to the graphics device.

This way, extracting (computing) the BLP's is clearly separated from printing/plotting them. For instance, if one assigns the BLP's to an object, the plot will not be created. The plot will only be shown in the graphics device when the object is printed to the console. Furthermore, we'd get a nicer way to print the results since we can use printCoefmat(). It will also allow users to take the "ggplot" object and customize it in some (limited) ways, for example, by adding different axis labels, flipping the axes, and so on.

aalfons commented 2 years ago

I'm in favor of option 2, since it's cleaner from on object-oriented programming point of view. I think this cleaner implementation will also have advantages for writing the summary() method, which should likely produce output for BLP, GATES, and CLAN.

mwelz commented 2 years ago

Thank you for raising this issue, I agree that this should be addressed soon with Option 2. As a minor deviation from your suggestions, get_BLP() would return an object of class get_BLP, because the class name BLP is already used for objects returned by BLP(). The same holds analogously for the class names GATES and CLAN.

aalfons commented 2 years ago

I'm not sure about "get_BLP" as the name for the class, it would be weird to have "get" in a class name.

aalfons commented 2 years ago

If this is the same output that we want to show in the summary() method for "GenericML" objects, we could call the classes "summary_BLP", etc.

mwelz commented 2 years ago

Thanks for the suggestion. Then let's go for "summary_BLP" etc. since it would be inconsistent to have the function BLP() return an object whose class is not named "BLP".

aalfons commented 2 years ago

Yes. Since the summary() method will likely call get_BLP(), get_GATES(), and get_CLAN() to show the results of BLP, GATES, and CLAN in the same format, I think "summary_BLP" (and so on) makes sense.