Open eddelbuettel opened 9 years ago
As an addendum: it is pretty easy to go from (highly regularized CRAN) sources to binaries. E.g., I more recently started again to make use of the auto-builder / PPA infrastructure at Canonical / Launchpad (my PPA), as the debian/ directory is typically fairly static (see e.g. RcppArmadillo).
So the advantage over https://launchpad.net/~marutter/+archive/ubuntu/c2d4u would be more/better testing? Don't we already have that in the official CRAN repo?
It would be a reboot / possible re-use of c2d4u -- as well as an extension as Michael never aimed for all of CRAN.
We have, thanks to you, a good handle on sources. We have code and build-depends data in other places (c2d4u, Don's debian-r http://debian-r.debian.net). We have gitbuilder as well.
I think it may be worth a discussion how to put it all on steroids.
Good points. I guess it is not too difficult to do it for the majority of the packages. Then we'll see if it is worth doing it for the ones that fail, how often they'll break, etc.
I'll definitely be happy to discuss this.
If I understand correctly, this would work from install.packages() and thus be available to install binaries of R packages on linux without needing root; particularly convenient for people on machines where they have to otherwise build from source or ask a sysadmin to install a binary(?)
(In reply by email to Dirk Eddelbuettel's comment of Thu, Mar 5, 2015, 8:28 AM, https://github.com/ropensci/unconf/issues/25#issuecomment-77396425.)
@cboettig Well, that is a good question. I think proper deb/rpm/etc. packages are one option, and just plain binary R packages (i.e. a tgz/zip of the installed package) are another option. Both have pros and cons IMO.
@cboettig: No, I did not have rewriting install.packages() in mind. In your and my parlance, it works at the apt-get level, like the previous cran2deb projects from which c2d4u and debian-r derived. If there were volunteers with sufficient chops to do rpm, OS X tgz, ... then we could do those too.
Right now it is simply my quest to convince @gaborcsardi that his GitHub mirror is a match made in heaven for gitbuilder :)
@eddelbuettel What I meant: the options are
1) apt-get
2) install.packages (assuming install.packages still supports binary packages on Linux, I guess it does).
@gaborcsardi Well: 1) gives you proper package management at the OS level covering the depends R does not know about; 2) does not.
But looks like we're having a discussion. Good :)
@eddelbuettel Sure, that's the advantage of 1). But 1) needs root access and 2) does not. That's the advantage of 2).
You are right that 2) might lead to non-loading packages. But you can wrap install.packages such that it checks for the proper Debian/Ubuntu packages and tells the user to ask the admin to install them. If the sysadmin removes the needed system libraries later, the package will still not load, but this is something you can't do much about.
Also, you are right that R does not (properly) know about these deps, but we'll need to know about them to build the packages anyway. So we need some kind of mapping from SystemRequirements to Ubuntu/etc. packages.
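The wrapping idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names (`sysreqs_to_debs`, `deb_installed`, `missing_debs`, `install_checked`) are hypothetical, the two-entry mapping table is a stand-in for a real curated one, and querying `dpkg-query` is just one possible way to test whether a Debian/Ubuntu package is installed.

```r
# Hypothetical mapping from SystemRequirements keywords to Debian/Ubuntu
# package names. A real table would be much larger and curated by hand.
sysreq_map <- c(
  protobuf = "libprotobuf-dev",
  quantlib = "libquantlib0-dev"
)

# Return the packages implied by a free-form SystemRequirements string,
# using case-insensitive keyword matching.
sysreqs_to_debs <- function(sysreqs, map = sysreq_map) {
  hits <- vapply(names(map),
                 function(key) grepl(key, sysreqs, ignore.case = TRUE),
                 logical(1))
  unname(map[hits])
}

# Assumed check: ask dpkg-query for the package status and look for
# "install ok installed". Returns FALSE if dpkg-query is unavailable.
deb_installed <- function(pkg) {
  out <- tryCatch(
    suppressWarnings(system2("dpkg-query",
                             c("-W", "-f", shQuote("${Status}"), pkg),
                             stdout = TRUE, stderr = FALSE)),
    error = function(e) character(0)
  )
  any(grepl("install ok installed", out, fixed = TRUE))
}

# Packages that the mapping asks for but the checker does not find.
missing_debs <- function(sysreqs, is_installed = deb_installed) {
  debs <- sysreqs_to_debs(sysreqs)
  debs[!vapply(debs, is_installed, logical(1))]
}

# Thin wrapper: refuse to install until the external libraries are present.
install_checked <- function(pkg, sysreqs, ...) {
  need <- missing_debs(sysreqs)
  if (length(need) > 0) {
    stop("Please ask your admin to install: ", paste(need, collapse = ", "))
  }
  install.packages(pkg, ...)
}
```

The checker is passed in as an argument so that the same wrapper could be pointed at rpm or another package database without changing the rest.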
But I am not arguing for either 1) or 2). Or 3). :)
Right now it is simply my quest to convince @gaborcsardi that his GitHub mirror is a match made in heaven for gitbuilder :)
Hah! You can convince me about that. But IMHO it is still the case that most Linux users install from source, and binaries are much more useful for windows and OSX users.
Btw. do you have a link about what gitbuilder actually does? Do we need to put debian/ directories in the packages?
Btw. do you have a link about what gitbuilder actually does?
Never mind, found it above.
Point taken on root access, but also consider that there is so much automated and scripted use these days where we consume .deb files anyway: Travis CI, Docker, ...
@eddelbuettel Agree about Docker. (It would be great to have a poll about how many people actually use Docker.)
[Btw. how about putting polls on r-project.org? Also asking @hadley here. It would be great to know (roughly) how many people install from source on OSX/Windows/Linux, how many use Docker, etc.... Source and binary installs can maybe be estimated from the CRAN download logs, but those also contain automated stuff, and it is hard to infer actual users, etc.]
As for Travis, it'd be better not to use debs, actually, because that needs root access (and sudo), which means slow check times and no caching. No question, debs are way easier, and if you need extra software, then you need sudo anyway, but almost all packages don't need extra system software.
[I doubt polls will fly on the homepage. RStudio users are 70% windows, 20% mac and 10% linux - I suspect R users in general weight a bit more heavily towards linux, but I doubt by a huge amount.]
Disagree on Travis. I now build things I need more often via my PPA and I have the Travis times to prove that it is faster to install r-cran-$foo as a deb than from source. (But that is so obvious that you may have meant something else; this is what I meant.)
@hadley Useful numbers, thanks.
But one could also argue that 100% of R developers using Travis use Ubuntu 12.04 there along with a number of .deb packages, just as 100% of Rocker users use Debian testing and .deb packages in their container.
What I mean is that if we finally manage not to use sudo (and thus apt-get) on Travis, then we can use the new Docker based Travis, and Travis caching. Which will speed things up. Most probably.
(Re the off-topic Travis tangent: I see. And per @craigcitro, that is coming. But no matter what the base image, when you want to add anything it is faster to add a binary that is prebuilt.)
How do R packages specify their non-R dependencies? I maintain Linuxbrew, the port of Homebrew to Linux. It could be useful for installing those non-R dependencies. It does not require root access.
@sjackman: Every approach to this problem I am aware of does a local mapping. What one distribution calls libpostgresql-dev is libpg-dev somewhere, pg$VERSION-dev somewhere else and so on. So limited usefulness or portability of one solution to another. Also toolchains differ etc pp so this is hard to solve universally.
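A local mapping of this kind might look like the sketch below. Everything in it is illustrative: the package names are simply the ones used in the comment above, the distribution labels (`distro_a`, ...) are placeholders, and `resolve_external` is a hypothetical helper, not part of any existing tool.

```r
# One canonical key per external library, one entry per distribution.
# Names are the illustrative ones from the discussion, not verified
# package names; "%s" marks a name that embeds a version number.
external_names <- list(
  postgresql = c(distro_a = "libpostgresql-dev",
                 distro_b = "libpg-dev",
                 distro_c = "pg%s-dev")
)

# Look up the local package name for a canonical key on a given distribution.
# Returns NA when no mapping is known for that combination.
resolve_external <- function(key, distro, version = NULL,
                             table = external_names) {
  entry <- table[[key]]
  if (is.null(entry) || !distro %in% names(entry)) {
    return(NA_character_)
  }
  name <- entry[[distro]]
  if (grepl("%s", name, fixed = TRUE)) {
    if (is.null(version)) stop("version needed for ", key, " on ", distro)
    name <- sprintf(name, version)
  }
  name
}

resolve_external("postgresql", "distro_b")
resolve_external("postgresql", "distro_c", version = "9.4")
```

Every new distribution means another hand-maintained entry per library, which is the portability problem described above.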
And if I may, the "does not require root access" argument is a bit of a straw man. If we were happy to install below $HOME I wouldn't need it either. But we aim for use in installations via the system tools -- as done in Docker, Travis, derived distributions etc pp -- and that is a tad more involved.
Science/HPC users often don't have root access. No one at my institution does. We install in $HOME.
Nothing wrong with homebrew, and I appreciate all the work you are doing there.
But I happen to not be motivated by that use case. I am admin of my HPC systems, and have supported HPC use for long enough to know that that is not an isolated case either. Both usages exist, and we simply serve different users, or even just different machine pools of the same users. And by all means if what we do here is of use to you, do feel free to use it. As I said above, I doubt it will be all that easy to generalize. But we can talk more at the unconference.
My experience too is that linux admins are moving to vm/container solutions to provide their users with an unrestricted yet isolated environment to do their work. Nobody likes old fashioned user-role security with bureaucratic policies on how to beg the admin to install some system software.
So with the future in mind, I also think the "no root access" is not a very important use case.
When an R package depends on a system library, how is that library installed? Is it up to the user to use the native package manager to install those dependencies?
@sjackman Yes, except on windows, where static libs are included in the binary. At least in most cases.
@sjackman: Please define "system library". I would agree on "up to the user" but not on "native package manager" as not all OSs have one worth its salt. See "R Installation and Administration" for what R (Core) has to say about the R context.
In general, you can't assume anything, which is why this is hard. Two orthogonal approaches:
By "system library" I meant a non-R library that is required by an R package. By "native package manager" I meant apt/yum.
I'm surprised that on Mac OS I've never seen an install.packages build fail due to a missing system library. Have I just been lucky?
1) "system" commonly refers to the OS, so to me a "system library" is libc. 2) The term I would use is "external library" to stress that it is not part of / comprised by the R package. 3) Just try something a tad further from the mainstream. RProtoBuf or RQuantLib are examples among those I maintain; RSymphony would be one by Kurt.
On an APT system, the difference between a system library such as glibc and an external library such as protobuf is pretty small. I like the term external library all the same.
Awesome. Thanks for the example.
> install.packages("RProtoBuf")
…
checking google/protobuf/stubs/common.h usability... no
checking google/protobuf/stubs/common.h presence... no
checking for google/protobuf/stubs/common.h... no
configure: error: ERROR: ProtoBuf headers required; use '-Iincludedir' in CXXFLAGS for unusual locations.
ERROR: configuration failed for package ‘RProtoBuf’
* removing ‘/usr/local/Cellar/r/3.1.3/R.framework/Versions/3.1/Resources/library/RProtoBuf’
Warning in install.packages :
installation of package ‘RProtoBuf’ had non-zero exit status
Every approach to this problem I am aware of does a local mapping. What one distribution calls libpostgresql-dev is libpg-dev somewhere, pg$VERSION-dev somewhere else and so on. So limited usefulness or portability of one solution to another. Also toolchains differ etc pp so this hard to solve universally.
Is there any particular solution to this problem that's popular or widely used?
R could install the necessary external library dependencies such as protobuf transparently to the user if it were integrated with a portable package manager, such as Homebrew. Note Homebrew handles Mac OS and Linux, but not (yet?) Windows.
On an APT system, the difference between a system library such as glibc and an external library such as protobuf is pretty small. I like the term external library all the same.
I disagree, and strongly for that matter:
… configure error such as the one you showed above for RProtoBuf, I'd be a rich man.
… install.packages() still does not know about it.
In sum, this simply is a hard problem and I do not think there are any easy outs or answers. But I look forward to you trying to convince me otherwise in person in two days ;-)
I've created a very thin R package over the Homebrew command line client brew.
https://github.com/sjackman/homebrewr
See also https://github.com/ropensci/unconf/issues/34
> brew_install("protobuf")
==> Downloading https://homebrew.bintray.com/bottles/protobuf-2.6.1.yosemite.bottle.1.tar.gz
==> Pouring protobuf-2.6.1.yosemite.bottle.1.tar.gz
==> Caveats
Editor support and examples have been installed to:
/usr/local/Cellar/protobuf/2.6.1/share/doc/protobuf
==> Summary
🍺 /usr/local/Cellar/protobuf/2.6.1: 81 files, 7.1M
brew_install("hello")
brew_remove("hello")
brew_update()
brew_upgrade()
Following the discussion, here is a first sketch of what the mappings of system requirements could look like: https://github.com/metacran/sysreqs Please comment or fix if your use case is not covered or I did something stupid. Also, happy to give you direct write access to the repo.
If you want to take a look at all SystemRequirements fields for all versions of all CRAN packages (updated regularly), here is a quick way:
http://crandb-dev.r-pkg.org:8080/-/sysreqs
Output is JSON, so you might need a JSON browser extension, or parse with jsonlite, etc.
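For a single package, the same field can also be read locally from its DESCRIPTION file with base R's `read.dcf()`. A small sketch: the DESCRIPTION text below is made up for illustration, and `get_sysreqs` is a hypothetical helper name.

```r
# Read the SystemRequirements field from a DESCRIPTION file.
# Returns NA if the package declares no such field.
get_sysreqs <- function(description_file) {
  dcf <- read.dcf(description_file, fields = "SystemRequirements")
  unname(dcf[1, "SystemRequirements"])
}

# Made-up DESCRIPTION written to a temporary file for demonstration.
desc <- tempfile()
writeLines(c("Package: examplepkg",
             "Version: 0.1.0",
             "SystemRequirements: ProtoBuf libraries and compiler"),
           desc)

get_sysreqs(desc)
```

The field is free-form text, so anything downstream still has to interpret it, for instance with a keyword mapping.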
Thanks so much for this.
@gaborcsardi has done a wicked job with the CRAN mirroring at the GitHub MetaCRAN repo. There are some lofty plans somewhere to provide more binaries than just for Windoze. Debian has gitbuilder which can use a git repo as backend, and distro-specific files just end up in a branch.
Maybe we can start some experimentation towards using this for Debian and/or Ubuntu. Benefits would be