rstats-go / proposal

Proposal document about ergo
MIT License

Other go-R efforts #3

Open dpastoor opened 6 years ago

dpastoor commented 6 years ago

I figured starting an issue noting other Go-R efforts might be helpful, if only to alert those people and possibly bring them into the fold.

dpastoor commented 6 years ago

@dareid has been working on a go Rserve client at https://github.com/senseyeio/roger

romainfrancois commented 6 years ago

Thanks @dpastoor

dareid commented 6 years ago

Hi @romainfrancois , @dpastoor. Thanks for pinging me.

I am also aware @glycerine was working on https://github.com/go-binder/rmq, but I'm not sure how much progress was made in that project.

glycerine commented 6 years ago

Hi @dareid, thanks for the mention. What is your question? rmq has been done for ages, and works great. That is, for its scope. What are you trying to do?

glycerine commented 6 years ago

Oh, sorry. My repo was set to private. Hmm, I don't recall why I did that, but I flipped that bit and now it's public.

See https://github.com/glycerine/rmq, that's the actual URL, discussing the finished state. I'm not sure who owns the go-binder account, but it was from an earlier fork of rmq, and it's not my account.

So I just read your rstats-go proposal front page. rmq takes care of most of it. Translation from Go structures to R is automatic, and vice versa. rmq also provides a websocket server so that one can call R code from, well, anywhere.

romainfrancois commented 6 years ago

Thanks. I’ll have a look at rmq.

romainfrancois commented 6 years ago

So, @glycerine I don't think our scopes are identical, and both projects can probably coexist.

As I understand it, rmq uses client/server protocols, whereas ergo aims to mesh R and Go in the same place, so that they can share data while avoiding copies as much as possible, e.g. an R numeric vector would be seen by the Go code as a Go slice, and perhaps down the line, with some ALTREP 🎩, Go arrays or slices can be handed to R, etc.

Besides (probably something minor) I could not get rmq to install on my mac.

What we are after with ergo is the ability to write a package with a go directory somewhere in the hierarchy, maybe src/go or perhaps inst/go with idiomatic go code that perhaps uses R specific structs and interfaces, and then just call it, similar to .C or .Call, or similar to e.g. Rcpp attributes.

kylebarron commented 6 years ago

@romainfrancois in case you hadn't seen, a Go implementation of Apache Arrow was added to the apache/arrow repository this week. Maybe that could be a good way to transfer data efficiently between R and Go if you finish your R bindings?

romainfrancois commented 6 years ago

Yes, I've seen the announcement. For now, I am dealing with each project independently.

glycerine commented 6 years ago

> As I understand it, rmq uses client/server protocols,

I don't think you looked very carefully, if you came to that conclusion.

> whereas ergo aims to mesh R and Go in the same place, so that they can share data while avoiding copies as much as possible, e.g. an R numeric vector would be seen by the Go code as a Go slice, and perhaps down the line, with some ALTREP 🎩, Go arrays or slices can be handed to R, etc.

rmq provides embedding. In addition, on top of that, it provides websocket based client/server.

rmq uses msgpack to serialize data. It uses Go's and R's reflection capabilities to do that conversion. While you are discussing a proxying approach rather than a copying one, I did not find proxying particularly viable.
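
For readers who have not used Go reflection for this kind of conversion, the sketch below turns a struct into a map keyed by field name, which is roughly the shape of an R named list. It is not taken from rmq; `structToNamedList` and the `Fit` type are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// structToNamedList walks the exported fields of a struct (or pointer to
// struct) and returns them as a map keyed by field name.
func structToNamedList(v interface{}) (map[string]interface{}, error) {
	rv := reflect.ValueOf(v)
	if rv.Kind() == reflect.Ptr {
		if rv.IsNil() {
			return nil, fmt.Errorf("nil pointer")
		}
		rv = rv.Elem()
	}
	if rv.Kind() != reflect.Struct {
		return nil, fmt.Errorf("expected a struct, got %s", rv.Kind())
	}
	rt := rv.Type()
	out := make(map[string]interface{}, rv.NumField())
	for i := 0; i < rv.NumField(); i++ {
		f := rt.Field(i)
		if f.PkgPath != "" { // unexported field: not accessible via Interface()
			continue
		}
		out[f.Name] = rv.Field(i).Interface()
	}
	return out, nil
}

// Fit is a hypothetical result type used only for this example.
type Fit struct {
	Coef  []float64
	Label string
	n     int // unexported, skipped by the conversion
}

func main() {
	m, err := structToNamedList(Fit{Coef: []float64{0.5, 1.5}, Label: "ols", n: 2})
	if err != nil {
		panic(err)
	}
	fmt.Println(len(m), m["Label"])
}
```

A full converter would also recurse into nested structs, slices, and maps; this sketch stops at the top level.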

> Besides (probably something minor) I could not get rmq to install on my mac.

I can't help you if you don't provide details. Please file an issue or give more detail here.

glycerine commented 6 years ago

> What we are after with ergo is the ability to write a package with a go directory somewhere in the hierarchy, maybe src/go or perhaps inst/go with idiomatic go code that perhaps uses R specific structs and interfaces, and then just call it, similar to .C or .Call, or similar to e.g. Rcpp attributes.

rmq is itself both a library and a demonstration of that library providing layers on top of the base services (of data exchange between Go and R). It builds into a .so shared library whose functions are then called with .Call, and can be required inside R.

Example of .Call https://github.com/glycerine/rmq/blob/master/R/rmq.R#L149

For instance, here is a function, written in Go, callable directly from R. That go function happens to provide a webserver, but that is just an example.

https://github.com/glycerine/rmq/blob/master/src/rmq/rmq.go#L113

func ListenAndServe(addr_ C.SEXP, handler_ C.SEXP, rho_ C.SEXP) C.SEXP {

    addr, err := getAddr(addr_)

    if err != nil {
        C.ReportErrorToR_NoReturn(C.CString(err.Error()))
        return C.R_NilValue
    }
...

and here is how conversion of strings is done, as a second example.

func getAddr(addr_ C.SEXP) (*net.TCPAddr, error) {

    if C.TYPEOF(addr_) != C.STRSXP {
        return nil, fmt.Errorf("getAddr() error: addr is not a string STRXSP; instead it is type %d. addr argument must be a string of form 'ip:port'\n", C.TYPEOF(addr_))
    }

    caddr := C.R_CHAR(C.STRING_ELT(addr_, 0))
    addr := C.GoString(caddr)

    tcpAddr, err := net.ResolveTCPAddr("tcp", addr)
...

-- https://github.com/glycerine/rmq/blob/master/src/rmq/rmq.go#L41

glycerine commented 6 years ago

@romainfrancois

I wanted to make sure to convey that I have huge respect for your contributions in the R world. I add this text to make clear that I'm not trying to deter you, but rather to applaud and enhance your work on using R and Go together.

That is, I fully support your efforts to do more Go and R integration!

I provide these details and explanation around rmq in hopes that you can leverage the work already done, so you don't need to repeat it or rediscover the issues overcome already (like unix signal handling within Go code embedded in R, ouch that was tricky).

Some kind of serialization is needed, whether a copying approach or proxying approach is used. And likely a combination of the two would end up being most useful. I've found msgpack2 to be a very nice format, efficient and compact. And I have written extended versions to enhance its integration with Go code via go:generate's codegen capabilities.
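
To illustrate how compact the format is, here is a hand-rolled sketch that encodes a numeric vector as a MessagePack array of float 64 values. It is not code from rmq or greenpack, and `encodeFloat64Array` is a hypothetical name. The type bytes used (`0x90` fixarray, `0xdc` array 16, `0xdd` array 32, `0xcb` float 64) come from the MessagePack specification.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// encodeFloat64Array encodes xs as a MessagePack array whose elements are
// all float 64 values (big-endian IEEE 754, each prefixed with 0xcb).
func encodeFloat64Array(xs []float64) ([]byte, error) {
	n := len(xs)
	buf := make([]byte, 0, 5+9*n)
	switch {
	case n <= 15:
		buf = append(buf, 0x90|byte(n)) // fixarray: length in the low 4 bits
	case n <= math.MaxUint16:
		buf = append(buf, 0xdc, byte(n>>8), byte(n)) // array 16
	case uint64(n) <= math.MaxUint32:
		buf = append(buf, 0xdd, byte(n>>24), byte(n>>16), byte(n>>8), byte(n)) // array 32
	default:
		return nil, fmt.Errorf("array too long for MessagePack: %d elements", n)
	}
	var tmp [8]byte
	for _, x := range xs {
		binary.BigEndian.PutUint64(tmp[:], math.Float64bits(x))
		buf = append(buf, 0xcb) // float 64 marker
		buf = append(buf, tmp[:]...)
	}
	return buf, nil
}

func main() {
	b, err := encodeFloat64Array([]float64{1.5})
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", b) // 91 cb 3f f8 00 00 00 00 00 00
}
```

Each double costs nine bytes on the wire, plus one to five bytes of array header, which is the copying side of the copy-versus-proxy trade-off discussed above.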

In particular, see my https://github.com/glycerine/greenpack repo for enhanced msgpack2 serialization that fully handles Go's interfaces, pointers, and de-duplicating circular graphs for serialization. It does this via a codegen approach that is really great for catching usage error early (using the static type checking that Go is terrific at).

A great next step in the direction to enhance the integration of R and Go would be to enhance the R side of the rmq reflection code to handle decoding (or, in fact, proxying) of Go's interfaces, as encoded by greenpack (which is just a set of conventions in msgpack2). And perhaps, if you are ambitious, and so inclined, to explore how to convert S3/S4 objects to something usable from Go.

For reference, the code that converts an R sexp to a Go interface{} is here https://github.com/glycerine/rmq/blob/master/src/rmq/rmq.go#L815

// SexpToIface() does the heavy lifting of converting from
// an R value to a Go value. Initially just a subroutine
// of the internal encodeRIntoMsgpack(), it is also useful
// on its own for doing things like embedding R inside Go.
//
// Currently VECSXP, REALSXP, INTSXP, RAWSXP, and STRSXP
// are supported. In other words, we decode: lists,
// numeric vectors, integer vectors, raw byte vectors,
// string vectors, and recursively defined list elements.
// If list elements are named, the named list is turned
// into a map in Go.
func SexpToIface(s C.SEXP) interface{} {
...

romainfrancois commented 6 years ago

Thanks @glycerine. I have had some more time now, and indeed there are things we can (at least) conceptually borrow.

siddharthab commented 6 years ago

Hi @romainfrancois, I just came across ergo (and also rmq through this issue) this morning. I did not know you had made so much progress in the last three months. Based on your efforts in June/July last year, I knew that, in principle, all this was possible, and so I was inspired to extend that work in March/April this year.

I am presenting my work at a local R-Users meetup tomorrow. I look forward to collaborating with you and, if possible, for the two efforts to merge.

I will share my presentation and code soon.

siddharthab commented 6 years ago

My slides are on Google Drive and my code is on GitHub.

It's roughly the same approach as what you are thinking. I basically drop a Go file into the package's src directory, and the author's own Go code can then use the data type adapter functions provided. I still have to clean up the naming convention, organization and documentation.

glycerine commented 6 years ago

@siddharthab Here's the canonical rmq repo: https://github.com/glycerine/rmq

It is a fairly complete solution.

siddharthab commented 6 years ago

Thanks @glycerine. I corrected the link in my slides.