hansenlab / minfi

Devel repository for minfi
58 stars 70 forks source link

Make functions into formal S4 generics and methods? #143

Closed PeteHaitch closed 6 years ago

PeteHaitch commented 6 years ago

E.g., preprocessRaw() is a plain function. Any advantage to this?

kasperdanielhansen commented 6 years ago

No.

While S4 classes are amazing, methods are much less useful. What you get out of having a method is dispatching, but you make it much harder to debug and follow the logic. My rule of thumb is never make a function into a generic unless you have at least 3 classes on which you dispatch.

PeteHaitch commented 6 years ago

Agreed, preprocessRaw() was a bad example. Unsure exactly why I made this issue, but I'll use it to highlight how I made use of some internal S4 generics/methods to support matrix and DelayedMatrix inputs. I'll illustrate the basic recipe with preprocessSwan():

  1. Define internal generic

https://github.com/hansenlab/minfi/blob/f742e9100f63a3f984ce683f93a1fcc72d64edd6/R/preprocessSwan.R#L103-L106

  1. Top-level function that extracts bits and pieces from RGSet/MSet/etc. and then passes these to internal method based on whether bits and pieces are matrix or DelayedMatrix. Finally, assembles the returned object.

https://github.com/hansenlab/minfi/blob/f742e9100f63a3f984ce683f93a1fcc72d64edd6/R/preprocessSwan.R#L183-L233

2a. Internal S4 method for matrix input(s)

https://github.com/hansenlab/minfi/blob/f742e9100f63a3f984ce683f93a1fcc72d64edd6/R/preprocessSwan.R#L111-L127

2b. Internal S4 method for DelayedMatrix input(s). It constucts RealizationSink instances (if necessary) and then uses blockApply()/blockMapply()/blockApplyWithRealization()/blockMapplyWithRealization() to apply the method from (2a) to matrix elements(s) extracted from the DelayedMatrix input(s).

https://github.com/hansenlab/minfi/blob/f742e9100f63a3f984ce683f93a1fcc72d64edd6/R/preprocessSwan.R#L129-L179

kasperdanielhansen commented 6 years ago

Yes, I have seen this.

My POV on this aspect is that it gives the code a lot of internal beauty in this case, especially when you write it. All this dispatching becomes harder to grok when you look at it 6 months later (but ok, this example is not bad). Worse, it makes debugging using debug() much harder and profiling harder to read. You could get the same effect by

preprocessSWAN = function(x) {
   if(TEST FOR MATRIX)
      do matrix code
   if(TEST FOR DELAYED MATRIX)
      do DelayedMatrix code
}

This is much uglier, but IMO more functional. It took me a long time to reach this conclusion ... S4 methods has a certain beauty to them,