ggobi / plumbr

Mutable dynamic data structures for R
18 stars 2 forks source link

subsetting a mutaframe can trigger new listeners #5

Open yihui opened 12 years ago

yihui commented 12 years ago

I have been wondering for a long while why the number of listeners keeps on increasing when the grand tour is running (qtour() in cranvas), and today I figured out the reason: simply subsetting a mutaframe can add new listeners on it. Here is an example:

library(plumbr)
library(qtbase)
library(objectSignals)
mf_listeners = function(x) listeners(plumbr:::changed(x))

mf = mutaframe(x = rnorm(5), y = rnorm(5))
add_listener(mf, function(i, j) {
  timestamp()
  str(mf_listeners(mf)) # print str() of mf listeners
})

# use a timer to change mf$x every 2 seconds
tm = qtimer(2000, function() {
  mf$x = rnorm(5)
})
tm$start()

## there was only one listener as expected, but the line below will add a new one

mf[1, ] # this will add a listener on mf

Because all plotting functions use [] frequently, the number of listeners can increase rapidly in a tour, which makes the tour slower and slower.

Is this new listener expected? I see it comes from [.mutaframe which uses filter_proxy().

Thanks!

hadley commented 12 years ago

I think the new listener is expected because there the subset data frame needs some way to be notified if there are changes in the parent so that it can keep up to date. Maybe @lawremi has ideas?

yihui commented 12 years ago

I think we need an additional argument like drop.listeners = TRUE/FALSE when subsetting a mutaframe; there has to be a way to inhibit listeners on subsets.

lawremi commented 12 years ago

Sorry, I haven't had time to look into this, with the Bioc release. Are you saying that listeners on the original mutaframe are "copied over" to the new subset mutaframe?

On Fri, Mar 30, 2012 at 12:40 PM, Yihui Xie < reply@reply.github.com

wrote:

I think we need an additional argument like drop.listeners = TRUE/FALSE when subsetting a mutaframe; there has to be a way to inhibit listeners on subsets.


Reply to this email directly or view it on GitHub: https://github.com/ggobi/plumbr/issues/5#issuecomment-4849537

yihui commented 12 years ago

They are not copied over. It is just that a new listener is added on the original mutaframe. See this example:

library(plumbr)

mf = mutaframe(x = rnorm(5), y = rnorm(5))
add_listener(mf, function(i, j) {
  print(plumbr:::changed(mf))
})

mf[1, 1] = rnorm(1)

mf2 = mf[1, ] # this will add a listener on mf

mf[1, 1] = rnorm(1)

and here are the results:

> mf = mutaframe(x = rnorm(5), y = rnorm(5))
> add_listener(mf, function(i, j) {
+   print(plumbr:::changed(mf))
+ })
> 
> mf[1, 1] = rnorm(1)
Signal(i = , j = ) with 1 listeners
> 
> mf2 = mf[1, ] # this will add a listener on mf
> 
> mf[1, 1] = rnorm(1)
Signal(i = , j = ) with 2 listeners

After I subset mf to create mf2, there is an additional listener on mf:

> objectSignals::listeners(plumbr:::changed(mf))
$`1`
function (i, j, ...) 
function (i, j) 
{
    print(plumbr:::changed(mf))
}(i = i, j = j)(i, j)

$`2`
function (event_i, event_j, ...) 
function (event_i, event_j) 
{
    if (shape_changed(event_i, event_j)) {
        notify_listeners(child, NULL, NULL)
    }
    else {
        new_i <- match(event_i, i)
        incl <- !is.na(new_i)
        notify_listeners(child, new_i[incl], event_j[incl])
    }
}(event_i = event_i, event_j = event_j)(i, j)
lawremi commented 12 years ago

As Hadley said, this is by design. Subsetting a mutaframe is going to create a filter that proxies the original mutaframe. All changes to the original are implicitly propagated to the subset (there is no copying of the data). The plan was to have a materialize() function that would copy a mutaframe. An important realization is that mutaframe is not simply a data.frame with reference semantics. It is designed for building a graphics pipeline.

There is a complication though that just occurred to me: there is no easy way to delete the proxy, since the parent has a reference. And this is a problem for all garbage collected environments (including Java and C#) when it comes to signals. The listening object will never be deleted until there are no other references to the emitting object. GTK+ and Qt do not have this problem, as they effectively have weak references. R also has the motion of weak references, internally, but they don't work at the R level, since there is no way to synchronize with the GC. I guess we will see if this becomes a performance issue. We could have proxies hold on to their listeners, and have some function like snip() that would remove all of the listeners.

Getting to your point, yes, I could see how the materialize() strategy is not ideal, given the above: a proxy would always be created. We could add such an argument (and this idea is discussed in the README), but in the big picture, we wanted to have much of the data.frame API automatically create such proxies. Things like transform() and aggregate() and split() come to mind. I am not sure if we want to pepper the API with a "copy" argument. Another idea would to have a cast like value() that would create a "valueframe" object, that wraps the original mutaframe, but uses different semantics. So you could do value(mf)[1,].

This would not be too different though than just doing an as.data.frame() on it, save the initial copy. My guess is that this is what you really want to do, at least for now. If it turns out to be super common for people to want to extract with '[' and make a copy at the same time, we could have an "as.data.frame" argument to '[', but that would be an optimization.

Michael

On Fri, Mar 30, 2012 at 10:32 PM, Yihui Xie < reply@reply.github.com

wrote:

They are not copied over. It is just that a new listener is added on the original mutaframe. See this example:

library(plumbr)

mf = mutaframe(x = rnorm(5), y = rnorm(5))
add_listener(mf, function(i, j) {
 print(plumbr:::changed(mf))
})

mf[1, 1] = rnorm(1)

mf2 = mf[1, ] # this will add a listener on mf

mf[1, 1] = rnorm(1)

and here are the results:

> mf = mutaframe(x = rnorm(5), y = rnorm(5))
> add_listener(mf, function(i, j) {
+   print(plumbr:::changed(mf))
+ })
>
> mf[1, 1] = rnorm(1)
Signal(i = , j = ) with 1 listeners
>
> mf2 = mf[1, ] # this will add a listener on mf
>
> mf[1, 1] = rnorm(1)
Signal(i = , j = ) with 2 listeners

After I subset mf to create mf2, there is an additional listener on mf:

> objectSignals::listeners(plumbr:::changed(mf))
$`1`
function (i, j, ...)
function (i, j)
{
   print(plumbr:::changed(mf))
}(i = i, j = j)(i, j)

$`2`
function (event_i, event_j, ...)
function (event_i, event_j)
{
   if (shape_changed(event_i, event_j)) {
       notify_listeners(child, NULL, NULL)
   }
   else {
       new_i <- match(event_i, i)
       incl <- !is.na(new_i)
       notify_listeners(child, new_i[incl], event_j[incl])
   }
}(event_i = event_i, event_j = event_j)(i, j)

Reply to this email directly or view it on GitHub: https://github.com/ggobi/plumbr/issues/5#issuecomment-4856137

yihui commented 12 years ago

Thanks for the detailed discussion. For now I think I will simply use the as.data.frame() approach.