johnmchambers / XRJulia

XR-style Interface to Julia (from "Extending R")

Error: Julia expression is too large #14

Open xiaojiemao opened 6 years ago

xiaojiemao commented 6 years ago

Hi, I found that once the matrix size is even moderately large, XRJulia gives an error:

library("XRJulia")
ev <- RJulia()                      # start a Julia evaluator
juliaCommand("using Base.LinAlg")   # Julia 0.6-era linear algebra
n = 500; p = 700; r = 3
U = matrix(rnorm(n*r), n, r)
V = matrix(rnorm(p*r), p, r)
X = U %*% t(V)                      # 500 x 700 rank-3 matrix
Xm = juliaSend(X)                   # send the matrix to Julia as a proxy
juliaCall("svdfact", Xm)

Julia error: syntax: expression too large
Error: C stack usage  12608550 is too close to the limit

It seems that this can be worked around by evaluating everything in the Julia environment:

ev$Command("using Base.LinAlg")
ev$Command("n = 500; p = 700; r = 3")
ev$Command("U = randn(n, r); V = randn(p, r); X = U*V'")
ev$Eval("svdfact(X)")

But is this the only way to resolve the issue? I guess this would be very inconvenient if the large-scale input to Julia functions is computed by other R functions. Thanks for your help!

scottlcarter79 commented 6 years ago

Hi, I am having the same issue: trying to evaluate a Julia call with a matrix of 100K x 4 integers triggers this problem. I love the XRJulia package and would like to start using it heavily in my development work, but this issue is blocking me. Thanks

johnmchambers commented 5 years ago

The immediate problem is that any object sent directly to Julia uses JSON representation, which will be bulky and inefficient for large arrays. It's unfortunate the Julia parser isn't more robust to this, but in any case, for large enough arrays the translation will be too slow.
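As a rough, hypothetical illustration of the scale (using jsonlite here only to gauge the size of the serialized text, which is not necessarily the serializer XRJulia uses), the 500 x 700 matrix from the example above turns into millions of characters that Julia must parse as one expression:

# Hypothetical size check, just for illustration
X <- matrix(rnorm(500 * 700), 500, 700)
json <- jsonlite::toJSON(as.vector(X))
nchar(json)        # millions of characters of text, parsed as a single expression
object.size(X)     # about 2.8 MB as a native double matrix, for comparison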

The fix is to use another intermediate form to transfer the data. For arrays, a binary file should be a simple choice. Both R and Julia know about reading/writing those and interpreting the result.

In R, writeBin(object, file) will write the object. In Julia, from reading the documentation, if y is an array allocated with the desired type and dimensions and stream is a stream as returned by open(file), then read!(stream, y) will fill y with the array that was written out. I'll wrap this into a function for XRJulia. Meanwhile, you should be able to work around the issue directly by writing into a file whose name you know.
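A minimal sketch of that manual workaround, adapted to the example above; the temporary file path, the Julia variable name Xj, and the exact Julia reading expression are choices made here for illustration, not a fixed XRJulia interface:

# Transfer X through a binary file instead of JSON (sketch).
# R writes the doubles column-major in native byte order; Julia's read!
# fills a preallocated Float64 matrix of matching shape with those bytes.
tmp <- tempfile(fileext = ".bin")
writeBin(as.vector(X), tmp)
juliaCommand(sprintf('Xj = read!(open("%s"), zeros(%d, %d))',
                     normalizePath(tmp, winslash = "/", mustWork = TRUE),
                     nrow(X), ncol(X)))
juliaEval("svdfact(Xj)")    # same computation as before, now done in Julia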

Eventually this might be done automatically for large vectors/arrays.

There is also a general principle that large objects should be passed infrequently and used through proxy objects, rather than appearing as arguments in function calls (section 13.6 of Extending R).

scottlcarter79 commented 5 years ago

Regarding the proxy objects idea: I tried using ev$Send(big_array), but got the same error as when passing big_array through a proxy function call.

johnmchambers commented 5 years ago

Of course. It still has to go through JSON. The point is that once you can send the large object, you should do it once and then refer to the proxy object in later calls.
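To make the pattern concrete, a short sketch of the intended usage once the transfer itself succeeds (juliaSend and juliaCall are XRJulia functions; svdfact and sum are ordinary Julia functions):

Xm  <- juliaSend(X)               # one (potentially expensive) transfer; Xm is a proxy
s   <- juliaCall("svdfact", Xm)   # later calls pass only the proxy reference
tot <- juliaCall("sum", Xm)       # no re-serialization of X for this call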

johnmchambers commented 5 years ago

The version of XRJulia now on GitHub (version 0.7.9) uses binary reads for numeric or integer vectors larger than 1e4 bytes, including when they are part of a structure.

Reading a 1000 by 1000 matrix seems essentially instantaneous.
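For reference, a sketch of how the original example should now go through, assuming the package is installed from the GitHub repository named at the top of this page:

# devtools::install_github("johnmchambers/XRJulia")   # assumed install route for 0.7.9
library(XRJulia)
X  <- matrix(rnorm(1000 * 1000), 1000, 1000)
Xm <- juliaSend(X)            # large numeric data now travels in binary form
juliaCall("size", Xm)         # confirm the matrix arrived with the right dimensions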

This version will likely just be on GitHub for a while. It would be good to add support for character data and to have a similar arrangement for sending data back to R from Julia.

scottlcarter79 commented 5 years ago

Thanks! I'll give this a shot.