HenrikBengtsson / R.oo

R package: R.oo - R Object-Oriented Programming with or without References
https://cran.r-project.org/package=R.oo
20 stars 1 forks source link

First constructed object bug #14

Open shay-zakov opened 8 years ago

shay-zakov commented 8 years ago

Hi,

I believe I've found a bug in R.oo. The attached script constructs an object b0 of class ClassB (extending ClassA), and then prints its data field. For some reason, the value of this field is not properly set. Then, it constructs an independent object b1. Now, somehow an argument given to the constructor of b1 is assigned to both fields in b0 and in b1. There are other weird things going on - see comments in the code. The code was tested using R 3.2.3 and R.oo 1.20.0.

Best,

Shay Zakov

Megic.txt

Want to back this issue? Post a bounty on it! We accept bounties via Bountysource.

HenrikBengtsson commented 8 years ago

Without diving into all details what's going on, I'd say that the number thing that causes confusion here is that a method named get<Name>() defines a "virtual" field <name>. Thus, with a getFoo() method defined, you can as an alternative to calling getFoo(x) also call x$foo without field foo actually having to exists. To clarify, if you do x$foo on a virtual field, then method getFoo(x) is called. There's an analogue assignment rule, i.e. x$foo <- value will call setFoo(x, value) if it exists. Especially the latter can be very handy since you can validate value to prevent, say, negative numbers to be assigned.

In your case you have a true internal field/variable data and then you define getData() with in turn defined a virtual field data. This probably explains the odd behavior.

If you rename you fields/variables to, say, have a period in front you avoid this conflict, e.g. this$.data (and this$.phrase just in case). I've attached an adjusted script. Output is something like:

> message("*** b0:")
*** b0:

> b0 <- ClassB(phrase = "Oh, there it is")

> print(b0$.phrase)
[1] "Oh, there it is"

> print(b0$.data)
[1] "!"

> print(getData(b0))
getData() of ClassB called
[1] "!"

> print(b0$data)
getData() of ClassB called
[1] "!"

> # [1] "Now, where did I put it?!"
>
> message("*** b1:")
*** b1:

> b1 <- ClassB(phrase = "Oh, there it is")

> print(b1$.phrase)
[1] "Oh, there it is"

> print(b1$.data)
[1] "Oh, there it is!"

> print(b0$.phrase)
[1] "Oh, there it is"

> print(b0$.data)
[1] "Oh, there it is!"

> print(getData(b0))
getData() of ClassB called
[1] "Oh, there it is!"

> print(b0$data)
getData() of ClassB called
[1] "Oh, there it is!"

Hope this clarifies it.

Megic2.R.txt

HenrikBengtsson commented 8 years ago

I missed the following:

Yes, the first time ever the constructor is called a static instance of the object is created. Unfortunately, S3 does not have the concept of constructors and formal class definitions so we cannot really create this automatically prior to this. The programmer can for it manually using getStaticInstance(<Constructor>), which is effectively the same as <Contructor>().

So, yes, you are correct. The field data (or .data in my adjusted example) because a static field of the static class instance. This is because no such is created when the object is created (by the constructor function) and it the concludes this will be a static field instead. A static field is shared by all instances of that class. (One and a half decade after designing R.oo::Object, I'd say that I haven't used static fields that much and we could probably live without them).

How to avoid .data becoming a static field? Just "define" it to be NULL as in:

    this <- extend(Object(), "ClassA", .phrase = phrase, .data = NULL)

I've attached two scripts:

A minor comment: You rarely need to call return(x) in R; R will return the last value, so just use x. People using R will be more used to that.

shay-zakov commented 8 years ago

Hi Henrik,

Thanks for the quick response, that was really helpful!

So now I realize all non-static fields of a class must be specified in its constructor, within the extend expression. Is that right? If so, it might worth stressing it in the documentation (unless it is already there and I've missed it). What I still fail to understand is why the first time a ClassB instance is constructed in our example, the static data field is not assigned with the argument phrase, while in the second time it does. Of course, we can hack this issue as you suggested by generating the static object prior to the first "real" object construction (that was in fact more or less what I did in my own code when encountering this issue), but that is really counter intuitive (I guess very few people would suspect this can affect the functionality of their code). If this is the best/only workaround, I think this should be suggested as a best practice in the package examples (i.e. executing a getStaticInstance call at the end of each class definition).

But I still think something more essential is wrong. First, if you take my example and construct b0 and b1 with the ClassA constructor instead of ClassB, you get the desired behavior. Why is this case different? ClassB just delegates all implementation to ClassA, including the constructor, and you would expect both variants of the code to work the same. Second, even in your fix in Magic2b, try running it first as written, and then run it again after resetting the environment and commenting out line 21 (this$.data <- paste0(this$.data, "!")). You would expect the only difference between the two runs would be an additional "!" in some prints. Nevertheless, before commenting this line .data gets the <phrase> + ! assignment, while after commenting the line it is NULL.

As for your last comment, I'm aware of that return can be made implicit. I admit I'm a little pedantic, but I personally prefer adding it explicitly for readability concerns - no reason tormenting non-native R programmers like me (: