JuliaInterop / RCall.jl

Call R from Julia

deal with non-standard evaluation #42

Open randy3k opened 9 years ago

randy3k commented 9 years ago

@simonbyrne Thanks for the IJulia hooks. The implementation makes RCall.jl more usable.

I see that Julia objects are now implicitly converted to Sexp objects. As a result, we are seeing crazy axis labels (in graphics.ipynb), because no R names are associated with the numbers: the values are passed directly to the plot function rather than being bound to a name that R could use as a label.

> X = linspace(0,pi,10)
> rprint(RCall.lang(:plot, sexp(X), sexp(sin(X))))
plot(c(0, 0.349065850398866, 0.698131700797732, 1.0471975511966, 
1.39626340159546, 1.74532925199433, 2.0943951023932, 2.44346095279206, 
2.79252680319093, 3.14159265358979), c(0, 0.342020143325669, 
0.642787609686539, 0.866025403784439, 0.984807753012208, 0.984807753012208, 
0.866025403784439, 0.642787609686539, 0.342020143325669, 1.22464679914735e-16
))

To avoid this, we could pass the Julia objects to R and then plot with the R objects.

X = linspace(0,pi,10)
globalEnv[:X] = X
globalEnv[:Y] = sin(X)
rcall(:plot, :X, :Y)

Actually, I am thinking of another way to convert Julia objects. For example, for each rcall call we create a new environment via new.env(); if an argument is not a Sexp, we convert it to an R object and store it in the new environment. Perhaps something like

X = linspace(0,pi,10)
E = rcall(symbol("new.env"))
E[:X] = X
E[:Y] = sin(X)
rcall(:plot, :X, :Y, env=E)

PS: in graphics.jl, rprint is used to "print" the rcall. Actually, it is not necessary.

simonbyrne commented 9 years ago

I don't think there's going to be an easy solution that will work for all functions which utilise non-standard evaluation. For plot, I think it might be easier to create a macro that automatically generates labels:

macro rplot(x,y,args...)
    :(rcall(:plot,$x,$y,xlab=$(string(x)),ylab=$(string(y)),$args...))
end
julia> macroexpand(:(@rplot(x,sin(x))))
:(rcall(:plot,x,sin(x),xlab="x",ylab="sin(x)",()...))
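
The labelling idea can be sketched in plain Julia, without RCall. The macro name `@labelled` below is hypothetical and is not part of RCall; it only shows how a macro can capture the source text of an argument alongside its value, which is the information an `@rplot`-style macro would use for axis labels.

```julia
# Hypothetical helper, not part of RCall: pairs the source text of an
# expression with its evaluated value.
macro labelled(ex)
    label = string(ex)          # source text, computed at macro-expansion time
    :(($label, $(esc(ex))))     # (label, value) tuple, evaluated in the caller's scope
end

x = 0.5
label, value = @labelled sin(x)
println(label)
println(value)
```

Here `label` holds the text of the expression as written, and `value` holds the result of evaluating it in the caller's scope.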

randy3k commented 9 years ago

There is also a related issue when calling lm. (PS: please check out current master for DataFrame conversion)

using RCall
using RDatasets
mtcars = dataset("datasets", "mtcars");
rprint(rcall(:lm, "Disp~MPG", data=mtcars))

It returns

julia> rprint(rcall(:lm, "Disp~MPG", data=mtcars))

Call:
lm(formula = "Disp~MPG", data = structure(list(Model = c("Mazda RX4", 
"Mazda RX4 Wag", "Datsun 710", "Hornet 4 Drive", "Hornet Sportabout", 
"Valiant", "Duster 360", "Merc 240D", "Merc 230", "Merc 280", 
.
.
.
4L, 4L, 4L, 4L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 3L, 3L, 3L, 
3L, 3L, 4L, 5L, 5L, 5L, 5L, 5L, 4L), Carb = c(4L, 4L, 1L, 1L, 
2L, 1L, 4L, 2L, 2L, 4L, 4L, 3L, 3L, 3L, 4L, 4L, 4L, 1L, 2L, 1L, 
1L, 2L, 2L, 4L, 2L, 1L, 2L, 2L, 4L, 6L, 8L, 2L)), .Names = c("Model", 
"MPG", "Cyl", "Disp", "HP", "DRat", "WT", "QSec", "VS", "AM", 
"Gear", "Carb"), class = "data.frame", row.names = c(NA, 32L)))

Coefficients:
(Intercept)          MPG  
     580.88       -17.43  

simonbyrne commented 9 years ago

Eesh, that's messy.

randy3k commented 9 years ago

This may be a better way to resolve the lazy evaluation problem. It uses R_mkEVPROMISE to create a promise object first.

using RCall
RCall.rgui_start()

macro promise(x)
    sym = Expr(:quote, symbol(string(x)))
    quote
        RCall.preserve(sexp(ccall((:R_mkEVPROMISE,libR),Ptr{Void},(Ptr{Void},Ptr{Void}), sexp($sym), sexp($x))))
    end
end

x = linspace(0,pi,10)
rcall(:plot, (@promise x), (@promise sin(x)))

EDIT: This workaround correctly handles substitute but not match.call(). Therefore, it doesn't solve the lm issue.

reval("""
       f <- function(x) substitute(x)
       g <- function(x) match.call()
""");
rprint(rcall(:f, @promise x))  
# x
rprint(rcall(:g, @promise x))
# g(x = c(0, 0.111111111111111, 0.222222222222222, 0.333333333333333, 
# 0.444444444444444, 0.555555555555556, 0.666666666666667, 0.777777777777778, 
# 0.888888888888889, 1))

simonbyrne commented 9 years ago

Nice! That's really cool.

Do you know what the difference is between R_mkEVPROMISE and R_mkEVPROMISE_NR?

randy3k commented 9 years ago

It is used to disable reference counting so that gc() will collect the corresponding memory. By default, R uses a different way to do reference counting, so it is not relevant unless R is compiled with SWITCH_TO_REFCNT.

randy3k commented 9 years ago

I am trying a @lazy macro in the hope of handling non-standard evaluation. The idea is to copy the Julia objects to R in a sandbox environment and then evaluate the function call in that environment.

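macro lazy(expr)
    blk = Expr(:block)      # statements of the generated code
    cleanup = Any[]         # statements that reset the sandbox bindings afterwards
    # create the sandbox environment from rGlobalEnv
    push!(blk.args,:(env = newEnvironment(rGlobalEnv)))
    (expr.head == :call && expr.args[1] == :rcall) || error("expect rcall(f, args...)")
    args = copy(expr.args)
    shift!(args)            # drop the leading `rcall`
    for (i,a) in enumerate(args)
        if typeof(a) == Symbol
            # positional Julia variable: copy it into the sandbox and pass its name to R
            push!(blk.args,:(env[$(QuoteNode(a))] = sexp($(esc(a)))))
            args[i] = QuoteNode(a)
            push!(cleanup,:(env[$(QuoteNode(a))] = rNilValue))
        elseif typeof(a) == Expr && a.head == :kw && typeof(a.args[2]) == Symbol
            # keyword argument whose value is a Julia variable: same treatment for the value
            value = a.args[2]
            push!(blk.args,:(env[$(QuoteNode(value))] = sexp($(esc(value)))))
            args[i].args[2] = QuoteNode(value)
            push!(cleanup,:(env[$(QuoteNode(value))] = rNilValue))
        else
            # anything else is evaluated in the caller's scope as usual
            args[i] = :($(esc(a)))
        end
    end
    # build rlang_p(args...) and evaluate the resulting R call inside the sandbox
    ret_call = :(ret = reval(rlang_p(),env))
    append!(ret_call.args[2].args[2].args, args)
    push!(blk.args, ret_call)
    append!(blk.args, cleanup)
    push!(blk.args,:(ret))
    blk
end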

With this macro, I get this

julia> using RCall, RDatasets

julia> rmtcars = dataset("datasets", "mtcars");

julia> @lazy rcall(:lm, reval("as.formula('HP~MPG')"), data=rmtcars)
RCall.RObject{RCall.VecSxp}

Call:
lm(formula = HP ~ MPG, data = rmtcars)

Coefficients:
(Intercept)          MPG  
     324.08        -8.83  
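
The argument-classification step of @lazy can be looked at in isolation. The sketch below is plain Julia without RCall, written for current Julia syntax rather than the older syntax used above, and `@argnames` is a hypothetical name. It only reports which arguments of a call are bare symbols (positional, or the value of a keyword argument), that is, the ones @lazy would copy into the sandbox environment; it does not evaluate anything.

```julia
# Hypothetical helper, not part of RCall: lists the bare-symbol arguments of a call.
macro argnames(call)
    (call isa Expr && call.head == :call) || error("expected a function call")
    syms = Symbol[]
    for a in call.args[2:end]                   # skip the function name
        if a isa Symbol
            push!(syms, a)                      # positional variable, e.g. x
        elseif a isa Expr && a.head == :kw && a.args[2] isa Symbol
            push!(syms, a.args[2])              # keyword value, e.g. df in data = df
        end                                     # other expressions are ignored
    end
    Expr(:vect, map(QuoteNode, syms)...)        # expands to a literal vector of symbols
end

# f and df need not exist: the macro only inspects the syntax of the call.
captured = @argnames f(x, sin(x), data = df)
println(captured)
```

In this example `sin(x)` is skipped because it is an expression rather than a bare symbol, which mirrors the `else` branch of @lazy where such arguments are simply evaluated on the Julia side.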

simonbyrne commented 8 years ago

I think we should be able to do something like that inside the @R_str macro.

My only question is whether this breaks other functions which use this (e.g. update).

randy3k commented 8 years ago

I think the purposes of @R_str and the @lazy macro are a little bit different.

@R_str parses a string into expressions. We need to evaluate those expressions in the global environment to make sure that expressions like x <- 1 or a[3] <- 2 return the expected results. If we create a sandbox environment to evaluate x <- 1, it only changes a local variable x, and the value of the global variable x is left untouched.

On the other hand, @lazy takes only function calls as input. It does not touch any R global variables, which means that @lazy calls can be separated and sandboxed.