stefan-m-lenz / JuliaConnectoR

A functionally oriented interface for calling Julia from R

Guide for package developers #24

Closed: yjunechoe closed this issue 1 year ago

yjunechoe commented 1 year ago

First off, thanks for this amazing package! I had a very pleasant experience using it to develop my package jlmerclusterperm.

As I was navigating this new territory of using JuliaConnectoR to develop R and Julia simultaneously (and wrapping all that into an R package), I was wondering whether there are any "best practices" of sorts when it comes to using JuliaConnectoR for this purpose. JuliaConnectoR doesn't advertise its package-development-related use cases much, but I think many would benefit from this (especially first timers like myself).

Long story short - it'd be nice to have a "how to" vignette of sorts for Julia-based R package development. This could range from basic tips like "put julia scripts in /inst" to more specific ones (documenting and testing examples, submitting to CRAN, etc. - I'm personally interested in these issues myself). I suppose it could even be in the form of a minimal package demonstrating the how-to's, similar to how the R Packages book does it with its toy packages.

Thanks in advance for considering my selfish request! And I'd also be happy to contribute to this to the best of my abilities.

wleoncio commented 1 year ago

I've been trying to encourage people to consider JuliaConnectoR instead of Rcpp in their R package development, so I'm very much interested in seeing (or contributing to) this as well.

Regarding the location of Julia scripts, from what I've read I understand the exec subfolder is more appropriate than inst. Not really sure where to put calls to juliaCall(), juliaImport(), etc., or even if those are the proper functions to use in that context. All I know is I'm doing something wrong because the performance of my imported Julia functions in R is way below what I get in the Julia REPL.

At the time of writing, JuliaConnectoR has 6 reverse imports. I've checked the source code for each of those packages and all are either importing published Julia packages or writing their Julia functions as strings inside R scripts using juliaEval(), which doesn't look ideal for coding or debugging, and doesn't seem to provide any performance advantages as far as I could tell.

yjunechoe commented 1 year ago

> Regarding the location of Julia scripts, from what I've read I understand the exec subfolder is more appropriate than inst.

Ha! I think this is yet another reason to have a centralized "guide" for developers because I got the inst/ recommendation(?) from the R Packages book which says:

  • exec/: for executable scripts. Unlike files placed in other directories, files in exec/ are automatically flagged as executable. Empirically, to the extent that R packages are shipping scripts for external interpreters, the inst/ directory seems to be preferred location these days.

I could be totally wrong but again it'd be nice to be told I'm wrong in a guide for JuliaConnectoR-based pkg devs😆

wleoncio commented 1 year ago

Yeah, the documentation is conflicting, and I feel like this kind of interaction is new territory even at the top rung of R, where C/C++ and FORTRAN are still the go-to second languages for package development.

Not sure if @stefan-m-lenz has considered adding a "Discussions" tab to this repo, but that may be a good place to exchange ideas about this until enough JuliaConnectoR-based packages are on CRAN for a standard to emerge.

stefan-m-lenz commented 1 year ago

Hi @wleoncio and @yjunechoe, thank you for your interest in the package and for the suggestions! The last two weeks I have been on a vacation without access to a computer, so my response took a while. Sorry for that.

The idea with an example package seems very good.

Maybe we could try to identify questions that should be answered by a guide and collect them? From reading your posts I think one of the first questions is where to put the Julia code. When developing the JuliaConnectoR, I put all my Julia code in inst/Julia and had all the Julia code in a Julia module inside this folder. I think for most use cases it is the best practice to create a separate Julia package and import and use it directly via juliaImport. If this is not an option, keeping the code in a Julia module under inst/Julia could be suggested as standard? I actually wasn't aware that there are several CRAN packages that import the JuliaConnectoR. I will have to check out what these guys have been doing!
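As a rough sketch of the "module in inst/Julia" layout described above (not an official recommendation): the snippet below writes a small Julia module to a temporary file as a stand-in for a file shipped with a package, loads it with `include`, and imports it via a relative module path. The module name `DemoModule`, the function `scaled_sum`, and the package name `mypkg` in the comment are hypothetical, and a working Julia installation is assumed.

```r
library(JuliaConnectoR)

# Stand-in for a file that a package would ship as inst/Julia/DemoModule.jl
# (hypothetical module and function names).
module_file <- file.path(tempdir(), "DemoModule.jl")
writeLines(c(
  "module DemoModule",
  "export scaled_sum",
  "scaled_sum(xs, factor) = factor * sum(xs)",
  "end"
), module_file)

# Inside an installed package, the path would instead be looked up with
# something like system.file("Julia", "DemoModule.jl", package = "mypkg"),
# where "mypkg" is a hypothetical package name.

# Load the module in the Julia session, then import it into R
# using the relative module path (leading dot).
juliaCall("include", module_file)
DemoModule <- juliaImport(".DemoModule")

# The module's functions can now be called like R functions.
result <- DemoModule$scaled_sum(c(1, 2, 3), 2)
print(result)
```

Keeping the Julia code in a real `.jl` file, rather than in strings passed to `juliaEval()`, means it can be edited and debugged with ordinary Julia tooling.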

Thanks for pointing to the Discussions feature of GitHub. I will check it out as well.

wleoncio commented 1 year ago

Nice to know you're available for this, Stefan!

I think suggesting a standard for the location of the .jl files is a good start, even if things change later as we learn about the implications of our recommendations.

Personally, I am mostly concerned about performance (see #21 for an earlier discussion). I'm trying to make a case for Julia as an alternative to C++ for translating slow R code, but apparently I'm still struggling to minimize communication between R and Julia (my Julia translations are as slow as their R counterparts in the package, but much faster from the REPL). So some guidance on that would be very welcome.

yjunechoe commented 1 year ago

@stefan-m-lenz Thank you for considering!

Since I opened the issue, if I may summarize some of our discussion here into more concrete next steps:

1) Example R package using JuliaConnectoR. This can simply demonstrate the best practices in its source code, but can explicitly touch on some other topics you raised too as vignettes, like how/whether to wrap custom Julia code into modules to be called by JuliaConnectoR. I had actually not thought about using modules but this makes sense, as that minimizes the differences between using JuliaConnectoR as an interface to existing libraries vs. custom Julia scripts. As a first step we should determine a good minimal example problem to solve that's simple, interesting, and worthwhile to implement in Julia to be called from R.

2) Tips for improving performance. This could be a vignette to JuliaConnectoR or be explained as part of making the example package, but depending on its scope it might have to be a separate topic about the internals. I haven't faced many performance issues in my experience (of wrapping/extending existing Julia-based solutions), so I'll leave that as outside the scope of this specific issue (especially since this seems to be of interest for folks trying to translate slow R code into Julia, and that topic is probably a beast of its own). Though the general idea of "take something that exists in R, implement it in Julia, and call it in R via JuliaConnectoR" does make a good case study for the example R package.

3) Consider using the GitHub Discussions feature (which I'm new to myself and we can leave this to you to determine as the repo owner)

I am most interested in the first myself (making an example R package), so perhaps I should start a new issue specifically for that? And once the idea for the package is settled the discussion could move into a new repo for the example package itself.

Please feel free to close this issue if/once the appropriate next steps have been ascertained from my vague original request!

stefan-m-lenz commented 1 year ago

Hi @wleoncio and @yjunechoe,

I am truly grateful for your engagement and the insightful discussion. Your interest and contributions have the potential to push the JuliaConnectoR package forward.

Unfortunately, I am currently engaged in other projects which demand most of my attention and cause my replies to be late. However, this does not, in any way, imply that your suggestions are less valued or unimportant.

Given my current commitments, I encourage you to begin making initial contributions towards these next steps. It would be fantastic to see some progress in these areas. As soon as I get the opportunity, I will be more than happy to join in more actively. In the meantime, please feel free to continue the conversation, share your ideas, and make initial steps towards implementation!

Regarding the speed, the JuliaCall package will certainly always be faster as it operates with C bindings and does not rely on TCP. When developing the JuliaConnectoR, I chose TCP because JuliaCall did not work reliably, especially with the fast release cycle of Julia. I am currently not sure what the status of the JuliaCall package is and how well it is maintained. The great advantage of the connection via TCP is that it can be maintained very easily, also across different versions. But this comes at the price of slower communication between R and Julia.
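Since every call from R to Julia travels over the TCP connection, one plausible way to reduce communication overhead is to send data in a single call instead of one call per element. The sketch below only illustrates that pattern; the helper `sqrt_all` is a hypothetical name defined here for the example, a working Julia installation is assumed, and no timings are claimed.

```r
library(JuliaConnectoR)

x <- c(1, 4, 9, 16, 25)

# Pattern with many round trips: one R-to-Julia call per element.
per_element <- vapply(x, function(xi) juliaCall("sqrt", xi), numeric(1))

# Pattern with a single round trip: define a vectorized Julia helper once
# (hypothetical name sqrt_all), then pass the whole vector in one call.
juliaEval("sqrt_all(xs) = sqrt.(xs)")
one_call <- juliaCall("sqrt_all", x)

# Both approaches should give the same values; only the number of
# messages exchanged between R and Julia differs.
print(all.equal(per_element, one_call))
```

The same idea applies to loops: moving the loop body into a Julia function and calling that function once keeps the iteration on the Julia side.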

yjunechoe commented 1 year ago

Thanks for sharing your thoughts!

The main purpose of this issue was to ask whether you would be open to ("third party") contributions for a guide to JuliaConnectoR (namely as a demo package but also via other means). From our discussion so far I take it that you are okay with this. I will go ahead and close this issue as I have gotten the answer I need.

I myself am juggling other projects as well so it may take some time before I give the demo package a try, but hopefully the discussion here will serve as permission/encouragement for others to work on this as well if they had the same idea and come across this issue.

As an aside, I am currently in the process of getting jlmerclusterperm on CRAN. I am documenting the process in detail - I plan to share it publicly somewhere once I'm done, and if you get around to enabling discussions for this repo I'd be happy to contribute there as well.

Cheers!