hrue / r-inla

This is the public repository for the r-inla project
GNU General Public License v2.0

Consider providing R Universe builds for INLA? #31

Closed njtierney closed 1 year ago

njtierney commented 3 years ago

Hi there,

I was wondering if you might be interested in providing R Universe builds for INLA?

The R Universe (https://ropensci.org/r-universe/) would provide CRAN-like binaries of INLA for windows + mac, and linux from source.

Obviously you've already got a setup at https://inla.r-inla-download.org/R/stable, but I just thought this might be of use/interest. Jeroen, the maintainer of the r universe project gave a talk about this recently at an rOpenSci community call: https://ropensci.org/commcalls/may2021-r-universe/

Thanks again for creating and maintaining INLA!

finnlindgren commented 3 years ago

Quite a bit of work would likely be needed to make this work, I think, but it might be a lot easier than getting it to work on CRAN!

hrue commented 3 years ago

CRAN will only work for the R part of the project in any case. R Universe is certainly very interesting!

hrue commented 3 years ago

I'm a bit hesitant to do all the migration work myself though, but I would certainly be happy to support anyone who wants to give it a try

finnlindgren commented 3 years ago

Later this summer (~July) I plan to take another stab at converting the rest of the old ad-hoc documentation system to roxygen. From what I recall of my work on this last year, it was mostly the special interleaved documentation that looked like a challenge, but even that turned out to be something roxygen seemed to support.

From there, only a few minor steps remain to get the package to build cleanly using standard R methods (with any remaining documentation pre-processing/generation more clearly encapsulated), given that a pre-processing step has placed the binaries in the needed place of the tree.

That would just leave the binary build step itself, and the question of whether that can also be "R-ified". With some outside help I think this can be done, if the R build system can tolerate standalone binaries and not just R-linked libraries. I'm currently considering my options for how to deal with the fmesher binaries for the separated fmesher package. The inla binary is more complicated, but the fmesher conversion may be a useful trial/template (but with help we can get it done sooner...).

njtierney commented 3 years ago

OK fantastic! Glad to hear you like the R Universe project idea. It is quite exciting, I think.

I have cloned this repo on my machine and tried to build the rinla folder as I would an R package. This seems to work well, although running devtools::build() doesn't quite work; I imagine there are a lot of intricacies that I am not familiar with. However, the "build and reload" button/shortcut in RStudio seemed to build the package without error.

I'm not sure of the best way to build INLA in an R Universe, so I'll reach out to Jeroen to find out. My initial naïve thought would be to create an organisation called "inla-test" and see if it will build there, but I think there should be a better way.

njtierney commented 3 years ago

Regarding the roxygen building, @finnlindgren - are you talking about this kind of documentation?

https://github.com/hrue/r-inla/blob/devel/rinla/R/agaussian.R#L1-L43

I wonder if Yihui Xie's rd2roxygen package might be able to help?

https://yihui.org/rd2roxygen/

finnlindgren commented 3 years ago

The challenge is that the R package mostly builds fine (with some issues around documentation files), but the binaries currently need building in a separate pre-processing step that places them in the needed place of the R source tree. The "source" package that gets built therefore already contains the pre-built binaries, so the "R binary" build really doesn't do anything other than keep the already built binary files. So to properly encapsulate the whole build in the traditional source/binary R package framework, the binary programme build step needs to be converted into the "R make" type structure.
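
As a rough illustration of what "keeping the already built binary files" means at run time, here is a minimal sketch of how R code might locate an executable that was shipped inside an installed package. The helper name `find_bundled_binary()`, the `bin` subdirectory, and the executable name are assumptions made for this example; they are not taken from the INLA sources.

```r
# Hypothetical helper (not part of INLA): look for a pre-built executable
# shipped inside an installed package, e.g. one copied from inst/bin/ of
# the source tree before the package was built.
find_bundled_binary <- function(pkg, name, subdir = "bin") {
  dir <- system.file(subdir, package = pkg)
  if (!nzchar(dir)) {
    return(NULL)  # package not installed, or it has no such subdirectory
  }
  files <- list.files(dir, recursive = TRUE, full.names = TRUE)
  # Accept the plain name or a Windows-style ".exe" variant.
  hits <- files[basename(files) %in% c(name, paste0(name, ".exe"))]
  if (length(hits) == 0L) NULL else hits[[1L]]
}

# Assumed package and executable names, for illustration only.
bin <- find_bundled_binary("INLA", "inla")
if (is.null(bin)) {
  message("No bundled binary found.")
} else {
  message("Found bundled binary at: ", bin)
}
```

A path found this way could then be handed to `system2()`. An "R make" style build would instead compile the program during package installation, so no pre-processing step would be needed to put the file in place.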

njtierney commented 3 years ago

Interesting! What are the benefits of building it in this way? I've not come across this before, is it due to calling some C code or something low-level like that?

finnlindgren commented 3 years ago

Yes, I used rd2roxygen to do the heavy lifting, but there's also manual work needed for some of the remaining special documentation (it was written before roxygen was useful, and is sometimes interleaved with the code in a way that automated conversion doesn't easily handle. I wrote some grep/sed scripts that could handle some of it, but manual work is also needed.)

finnlindgren commented 3 years ago

Historical reasons. inla was a standalone program before the R-interface was written, and it's still a standalone program. It does now also link to R-lib for some features, but it doesn't have a sufficiently cross-platform build script to handle all the needed external libraries it links with.

finnlindgren commented 3 years ago

So yes, inla is a laaarge c-program using a bunch of low-level computational libraries.

njtierney commented 3 years ago

> Historical reasons. inla was a standalone program before the R-interface was written, and it's still a standalone program. It does now also link to R-lib for some features, but it doesn't have a sufficiently cross-platform build script to handle all the needed external libraries it links with. So yes, inla is a laaarge c-program using a bunch of low-level computational libraries.

Ah I see! My apologies, I didn't realise it was its own standalone program! Very cool.

> Yes, I used rd2roxygen to do the heavy lifting, but there's also manual work needed for some of the remaining special documentation (it was written before roxygen was useful, and is sometimes interleaved with the code in a way that automated conversion doesn't easily handle. I wrote some grep/sed scripts that could handle some of it, but manual work is also needed.)

Ah I see, I can understand that would be a careful task. Are you looking for community contributions in helping with the documentation migration over to roxygen, or do you feel this is something that requires a light touch and better knowledge of the source code?

> I'm a bit hesitant to do all the migration work myself though, but I would certainly be happy to support anyone who wants to give it a try

I don't have very much experience with C programming, so I'm not sure how much help/assistance I can provide, but let me know how I might be able to help. I'm developing an R package that uses inla, and currently I'm not sure how to manage this as I don't think Imports or Remotes fields in the DESCRIPTION file would work, but I think if the binaries get built with R Universe then this should be slightly easier to resolve.

finnlindgren commented 3 years ago

The documentation conversion is just "more of the same" of what I did last year, and would take longer to explain than to do myself. The R part of the package is fine (though automated testing probably reveals some easy-to-fix syntax things etc.), including Imports etc., via roxygen and our old legacy system. It shouldn't be much work to "clean up" the remaining aspects of this, to make it fully devtools-compatible, for example.

It's really the "convert the various build files and C&C++ compiler options into an R-friendly cross-platform compatible build system" that needs more work/help.

For building packages that depend on INLA, take a look at inlabru (https://github.com/inlabru-org/inlabru), which is even on CRAN. The DESCRIPTION file there refers to Additional_repositories:. Not all parts of the R system use that information properly though; it should be more supported than Remotes, but I'm not sure if it is... To be allowed on CRAN, inlabru can only have INLA as Suggests, and all examples/tests that involve inla are protected by checks for whether it's available. Some CRAN systems unfortunately have outdated/broken INLA installations, so I had to turn off all inla tests on CRAN just to be sure it doesn't break for broken CRAN reasons. I have a full test suite running as github actions that does involve substantial inla calls.
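
A minimal sketch of the "protected by checks" pattern for a package that is only in Suggests, assuming nothing about how inlabru itself implements it. The helper names `has_suggested()` and `with_suggested()` are hypothetical; `requireNamespace()` is the base R call doing the work.

```r
# Hypothetical helper: TRUE only if the suggested package can be loaded.
has_suggested <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# Hypothetical helper: evaluate `expr` only when `pkg` is available,
# otherwise return `fallback`. Thanks to lazy evaluation, `expr` is never
# touched when the package is missing.
with_suggested <- function(pkg, expr, fallback = NULL) {
  if (!has_suggested(pkg)) {
    message("Package '", pkg, "' is not installed; skipping.")
    return(fallback)
  }
  expr
}

# Example: only report the INLA version if INLA is actually installed.
inla_version <- with_suggested(
  "INLA",
  as.character(utils::packageVersion("INLA")),
  fallback = NA_character_
)
```

In a testthat suite, `testthat::skip_if_not_installed("INLA")` at the top of a test serves the same purpose.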

Edit: adding the contents of Additional_repositories to the repos option does work for the most part; I have

    r <- c(INLA = "https://inla.r-inla-download.org/R/testing",
           CRAN = "https://cloud.r-project.org/")
    options(repos = r)

in my .Rprofile, just as how the r-universe paths are used, but I'm not sure if Additional_repositories by itself always solves the problem of R finding where to install INLA from as a package dependency.

finnlindgren commented 3 years ago

Addendum: For the github actions testing for inlabru, I don't have anything special setup to install INLA, so presumably it's the Additional_repositories information that does the work there. The workflow file https://github.com/inlabru-org/inlabru/actions/runs/860530068/workflow just makes sure to ask for dependencies to be installed.

njtierney commented 3 years ago

Thanks for that, @finnlindgren !

That's great to know about the Additional_repositories line in the DESCRIPTION file, very neat.

ThierryO commented 2 years ago

Base R functions like install.packages() don't use the Additional_repositories field. See the discussion at https://github.com/r-universe-org/build-source/pull/2. The inlabru workflow installs INLA because remotes::dev_package_deps() handles it.
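
Since install.packages() does not read the field on its own, one workaround is to read it explicitly and pass the result as `repos`. The sketch below assumes a hypothetical helper name, `additional_repos()`, and a throwaway DESCRIPTION file with illustrative contents; it is not how remotes does it internally.

```r
# Hypothetical helper: read the Additional_repositories field from a
# DESCRIPTION file and return its URLs as a character vector.
additional_repos <- function(desc_path) {
  field <- read.dcf(desc_path, fields = "Additional_repositories")[1, 1]
  if (is.na(field)) {
    return(character())  # field not present
  }
  # The field is a comma-separated list of repository URLs.
  urls <- unname(trimws(strsplit(field, ",", fixed = TRUE)[[1]]))
  urls[nzchar(urls)]
}

# Demonstration on a temporary DESCRIPTION file (contents are illustrative).
desc <- tempfile("DESCRIPTION")
writeLines(c(
  "Package: mypkg",
  "Version: 0.0.1",
  "Suggests: INLA",
  "Additional_repositories: https://inla.r-inla-download.org/R/testing"
), desc)

repos <- c(additional_repos(desc), CRAN = "https://cloud.r-project.org/")
# install.packages("INLA", repos = repos)  # not run here: needs network access
```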

Note that @jeroen tried to build INLA too. Maybe you can collaborate on making INLA build on r-universe.org.

finnlindgren commented 2 years ago

That's surprising, since it was CRAN that mandated the use of Additional_repositories (plus the use of Suggests: INLA)! But I presume they only install INLA on some of those systems, and don't automatically install packages from Additional_repositories, only check if they exist there or not.

Something seems to have recently changed in how the github actions deal with it, as I had to update the inlabru checking scripts to make it install INLA on Linux (it still worked on Mac and Windows). I don't recall the details of what I found out when I did those fixes though.

The two main problems with automated INLA building, apart from the practical file reorganisation needed in the repository, are 1) external library dependencies and 2) specific platform dependencies (Linux/Mac/OS) in the build scripts. fmesher is being migrated to a separate package where it will build and link directly into R with Rcpp, but inlaprog is a different beast.