eco-data-science / eco-data-science-old-site

website, wiki and issues for the group
http://eco-data-science.github.io

Advantages of object oriented coding in R? #31

Open DanOvando opened 7 years ago

DanOvando commented 7 years ago

Hi gang,

Some code from a few folks that I've been using relies on object-oriented coding in R, defining S4 classes for objects. For example, you have an object of class fish called myfish that has slots myfish@common_name, myfish@max_length, etc.

It all looks very snazzy, but I'm a little unclear as to the relative benefits of this vs. say a list, where now I define a list called myfish with elements myfish$common_name, myfish$max_length, etc.
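To make the comparison concrete, here is a minimal sketch of the two approaches side by side. The class name and slot names follow the example above; the slot types and the values are made up for illustration. One visible difference is that the S4 object refuses a value of the wrong type, while the list accepts anything.

```r
library(methods)

# Hypothetical S4 class; slot types are an assumption for illustration
setClass("fish", slots = c(common_name = "character", max_length = "numeric"))

# Same (made-up) data stored as an S4 object and as a plain list
myfish_s4   <- new("fish", common_name = "example fish", max_length = 100)
myfish_list <- list(common_name = "example fish", max_length = 100)

myfish_s4@max_length    # slot access with @
myfish_list$max_length  # element access with $

# A list silently accepts a value of the wrong type
myfish_list$max_length <- "big"

# The S4 slot is declared numeric, so the same assignment raises an error
res <- tryCatch({
  myfish_s4@max_length <- "big"
  "accepted"
}, error = function(e) "rejected")
res
```

Whether that extra strictness is worth the extra ceremony probably depends on how many people and functions end up touching the object.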

The list format is more in line with the standard way we program in R, and is more amenable to the tidyverse, so I'm wondering what the main advantages of the object-oriented approach are?

Any thoughts?

grantmcdermott commented 7 years ago

@DanOvando Interesting question. I don't know for certain, but I'll hazard a guess...

The main advantages to defining your own classes -- particularly S4 objects -- are that they can/will impose arbitrarily strict functionality and checks on your objects. This can be useful, for example, if you require a special series of tests or statistical checks, or if you want your output to look a specific way. (Think tibble versus a normal dataframe... Or a coeftest object from the lmtest package, which is really just a matrix but formatted to nicely represent regression coefficients and their associated statistical parameters.)
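As a rough sketch of what "checks" and "output that looks a specific way" could mean for an S4 class: the validity rule and the print format below are invented for illustration and are not taken from any particular package.

```r
library(methods)

# Hypothetical class with a validity check (the rule itself is an assumption)
setClass(
  "fish",
  slots = c(common_name = "character", max_length = "numeric"),
  validity = function(object) {
    len <- object@max_length
    if (length(len) != 1 || is.na(len) || len <= 0) {
      return("max_length must be a single positive number")
    }
    TRUE
  }
)

# Custom display: a show() method controls how the object prints
setMethod("show", "fish", function(object) {
  cat("<fish> ", object@common_name,
      " (max length: ", object@max_length, ")\n", sep = "")
})

ok <- new("fish", common_name = "example fish", max_length = 100)
ok  # printed via the show() method above

# An invalid object is rejected at construction time
bad <- tryCatch(
  new("fish", common_name = "example fish", max_length = -5),
  error = function(e) conditionMessage(e)
)
bad
```

The check runs when the object is created with `new()`, and can be re-run at any time with `validObject()`.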

Why would you need to do this for your hypothetical "myfish" class? It's hard to say without more information, but one possibility is that the authors wanted to make sure that these objects contained a full set of predefined slots, which in turn might be important for some kind of statistical or spatial analysis...

oharac commented 7 years ago

I haven't used object-oriented coding in R, but from what I recall from C++ and another object-oriented language I used long ago, it could be thought of as a specific data structure (like a list with pre-defined named elements), for which you can define functions to interact with that data in a specific way.
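In S4 terms, "functions that interact with the data in a specific way" would be generics and methods. A minimal sketch, assuming a hypothetical generic called `describe` (the name, the class definition, and the values are all made up for illustration):

```r
library(methods)

# Hypothetical class, matching the myfish example from the original post
setClass("fish", slots = c(common_name = "character", max_length = "numeric"))

# Hypothetical generic: declares the function name and its dispatch argument
setGeneric("describe", function(x, ...) standardGeneric("describe"))

# Method: the behaviour of describe() when x is a fish
setMethod("describe", "fish", function(x, ...) {
  paste0(x@common_name, " grows to ", x@max_length)
})

myfish <- new("fish", common_name = "example fish", max_length = 100)
describe(myfish)

# A plain list with the same elements has no describe() method, so the call fails
res <- tryCatch(
  describe(list(common_name = "example fish", max_length = 100)),
  error = function(e) "no method for list"
)
res
```

The function only works on objects it was written for, which is the kind of guarantee that matters more when other people are using the data.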

Some possible advantages, which seem more for other users than necessarily for yourself, unless you're going to use this data all over the place: