Open ajbouh opened 3 years ago
I think this looks like a reasonable feature, but would like to know more.
`OS` and `TARGET_OS` are both whatever OS you're on? Could you lay out the use case and alternatives more clearly? When is this necessary? What other reasons are there for doing this?
Yes, this requires having a Linux compiler available on your Mac and setting those environment variables for `CXX_TYPE`, etc. accordingly. I've had a reasonably good experience using `zig cc` for this.
Also yes, the expectation is that TARGET_OS would default to the same as OS.
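The defaulting rule described here can be sketched in shell, for example in a wrapper script that later calls make. This is only an illustration: the actual patch lives in the makefiles, `HOST_OS` is a hypothetical helper name, and the sketch assumes OS names follow `uname -s`.

```shell
#!/bin/sh
# Sketch: default TARGET_OS to the host OS unless the caller already set it.
# HOST_OS is a hypothetical helper variable, not part of CmdStan.
HOST_OS="$(uname -s)"                 # e.g. Darwin on macOS, Linux on Linux
TARGET_OS="${TARGET_OS:-$HOST_OS}"    # keep a caller-supplied value, else fall back
echo "host=$HOST_OS target=$TARGET_OS"
```

With nothing set, both values are the same, so a native build behaves as before; only a caller that exports `TARGET_OS` gets a different target.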
The ideal scenario would be to statically link libtbb (supposedly possible with `big_iron.inc`) into the built program, so you don't have to move around more than one artifact and worry about their relative paths to each other, but that would be a larger change.
As an aside, before arriving at this patch I experimented with using Bazel to build cmdstan and all its dependencies (boost, stan_math, etc.), but I had a lot of difficulty navigating the impact of more granular compilation on c++ static initializer ordering. The compiled program would seem to run but then segfault because either tbb or its stan counterparts weren't initialized properly.
@rok-cesnovar @wds15 and @syclik - thoughts here?
The ideal scenario would be to statically link libtbb (supposedly possible with big_iron.inc) into the built program
is this feasible, given the configurability we have w/r/t running a Stan executable? e.g., env vars STAN_MPI and STAN_OPENCL?
My initial reactions: cool and yikes.
Early on we had cross compiling available. It's really hard to maintain using make. It's not that it's not feasible to do... in the future, someone may forget to check to make sure the cross compiling ability is available.
@ajbouh: 2 questions come to mind
In principle, there's nothing wrong with what you're proposing, but given that this is the only ask for cross compiling (in my memory) in ~7 years, I'd lean toward not adding more indirection to the makefiles and making them more complicated.
Based on the diff, it looks like if there was a way to set this properly, we'd be ok? tbb_os=$(TBB_OS)
Is that right?
Statically linking the TBB is strongly discouraged by the TBB folks. So I would not ever do that. Technically it is possible, but not recommended at all from Intel.
W.r.t. cross-compilation: sounds cool, but I have no clue about it as I have never done that.
This is a timely issue! I am currently attempting to build CmdStan as an artifact for use in the Julia ecosystem here. Stan has existing wrappers in Julia but providing CmdStan as an artifact in Julia's package manager is quite desirable (so users can just `]add CmdStan` and things will work out of the box).
BinaryBuilder, which is used to build artifacts for Julia, is a cross compilation environment based on Alpine Linux. The goal is to use a single script to build artifacts for every supported Julia platform and provide those binaries/artifacts as painlessly as possible to Julia users.
I obviously have the option to use the prebuilts and provide them as artifacts. But that's less desirable than cross compilation support if possible! Just thought I'd add our use case as support for cross compilation.
@Wimmerer - just to let you know that Stan.jl dev Brian Parbhu and I are working on the CmdStan install process, but focussing on the pre-builts. happy to discuss your use case.
Based on my tests these are the only places that need to change, though you do need to set a lot of environment variables.
The primary problem is that OS is used in two ways right now:
If you wanted, you could sort of invert the patch and let me specify my own path to stanc, and expose tbb_os. I don't remember if there would need to be more.
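To make the two roles concrete, here is a minimal shell sketch that derives a target-side value from the target OS only, while host-side tools such as stanc would be chosen from the host OS. The function name `tbb_os_for` and the exact mapping values are assumptions for illustration, not taken from the patch.

```shell
#!/bin/sh
# Sketch: map a target OS name (uname -s style) to a tbb_os value.
# tbb_os_for and the mapping values are assumptions for illustration.
tbb_os_for() {
  case "$1" in
    Darwin)     echo macos ;;
    Linux)      echo linux ;;
    Windows_NT) echo windows ;;
    *)          echo "unknown target OS: $1" >&2; return 1 ;;
  esac
}

# The host OS picks tools that run during the build; the target OS picks
# properties of the compiled files.
HOST_OS="$(uname -s)"
TARGET_OS="${TARGET_OS:-$HOST_OS}"
```

A cross-build would then call `tbb_os_for "$TARGET_OS"` rather than deriving the value from the host.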
FWIW, I think `zig cc` makes cross compilation much more achievable than it has historically been. No need for a separate system root or anything like that.
@mitzimorris the end goal is simply to support Julia's Stan wrappers without forcing users to install Stan separately. Rather than requiring users to specify a path to a CmdStan or stanc install, users would simply `]add Stan`, which would install `Stan.jl`. `Stan.jl` would depend on an artifact package called `CmdStan_jll`. This package essentially just unpacks platform-specific tarballs built by our build scripts.
We generate that package using a build script in the JuliaPackaging/Yggdrasil repository. This functions much like a normal package manager, except there's a serious commitment to being cross platform. This means we try to have build scripts that support the operating systems Windows, macOS, Linux, and FreeBSD; the architectures i686, x86_64, aarch64, and a few others like PowerPC; as well as musl libc. The idea is that if you want to use a Julia package which depends on binaries (and additional source directories and executables), installation should just work.
In an ideal world I would have the following binary packages:
I'm missing dependencies, but not all of them have to be JLLs; Stanmath_jll could use its own internal versions of TBB and sundials, for instance. By building these ourselves we gain compatibility with all (fingers crossed) platforms Julia supports, compatibility with weird string ABIs and architectures, and dead simple installation.
`Stan.jl` can also allow users to point to their own installation of course, if they have a customized version, but for most users automatically installing `CmdStan_jll` is good enough. I'm going to try to apply the patch above, to see if we can get most of the way there by just patching the makefile on our end. TBB seems to be the biggest headache when simply building CmdStan, so I might work towards the dependency graph above, where I first build TBB, then Stanmath, etc.
If we can get cross compilation working for Julia, prebuilt binaries for all the supported platforms should just fall right out, which could also help you all provide binaries for less common platforms.
Rather than requiring users to specify a path to a CmdStan or stanc install
understood, this is a total headache, but CmdStan shoulders the build burden so that the wrapper interfaces don't have to. for CmdStanR and CmdStanPy, we have tried very hard to coordinate the interfaces, and the goal is to allow any interface to run from a single CmdStan installation by adopting the convention of a default CmdStan install location in the user's home directory, which can be overridden as needed. the default is `~/.cmdstan` (`$HOME/.cmdstan`). the CmdStanR folks have yet to adopt this, but they have agreed in principle that this is what it should be.
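The convention above (a default location in the home directory that can be overridden) might look like this in a wrapper script. The use of a `CMDSTAN` environment variable as the override and the name `CMDSTAN_DIR` are assumptions made for this sketch.

```shell
#!/bin/sh
# Sketch: resolve the CmdStan install directory.
# Using CMDSTAN as the override variable is an assumption for this example.
CMDSTAN_DIR="${CMDSTAN:-$HOME/.cmdstan}"   # override if set, else the default location
echo "Using CmdStan at: $CMDSTAN_DIR"
```

Any interface that resolves the directory this way would pick up the same installation.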
Hi @Wimmerer ,
So right now we're redoing the install process for all CmdStan interfaces, not just the StanJulia interface. We're creating a process where the CmdStan interfaces don't deal with installation. The main point of doing this is to make the installation process for any CmdStan interface much less painful. These new processes will also impact how we deliver other packages, like CmdStanARM packages, to the user. This should in turn let us provide added Stan experiences to other programming languages like Python and Julia. If you would like to be part of that, let me know and we can collaborate further. Let me know what you think.
Thanks,
-Brian
prebuilt binaries for all the supported platforms
I agree this is something that we need!!!! but this doesn't require cross-compilation - it's just that we only build on the platforms we support.
the Makefiles are very difficult to maintain - we've refactored them a bunch of times and there have been several attempts to use cmake. we've made a bunch of choices that have improved the compilation and runtime performance but have made the install/build process increasingly complex. therefore, my inclination is to push back against adding complexity to the makefiles so that they can do more when there are workable alternatives.
but this doesn't require cross-compilation
Sure, but with BinaryBuilder we have a single cross-compilation environment which can target 16 different platforms, including the recent aarch64-apple-darwin (aka Apple Silicon), instead of having 16 different build machines, some of them with non-common hardware (e.g. PowerPC)
I love the functionality that BinaryBuilder presents in terms of a single cross-compilation platform. In fact, when I talked to Stefan Karpinski about it a few years ago, he recommended it to me for the StanJulia interface installation. That said, there are other things to think about in terms of what this would add to other CmdStan interfaces, in terms of dependencies and other CmdStan-adjacent packages. I don't know if they would feel comfortable having anything Julia-related in their package ecosystem. Not that it would be anything bad, but I don't know how this would play out with regard to the environments people develop in and package management on their end. I realize that there are Julia packages that interface with other languages, like RCall and PyCall, but I'm not sure that I would want someone to deal with 3 languages in a single CmdStan package.
To be clear, I'm not going to advocate for getting cross-compilation working: I just wanted to point out its advantages (a single compilation environment for multiple platforms). It'd be great if it were possible to cross-compile cmdstan, but I understand your objectives.
Summary:
Allow setting a new variable, `TARGET_OS`, which is used instead of `OS` when it determines a property of the compiled files.
Description:
The existing codebase assumes that the current OS is the same as the OS that will be running the compiled program. This is not always the case. In my case I am running macOS but running my Stan programs on Linux.
I'm including the patch I've come up with here, in case it's helpful to others.
Note that this patch assumes that you have also set a number of other environment variables to reasonable values. In my case that has meant setting:
CXX
CXXFLAGS_PROGRAM
CXXFLAGS_LANG
CXXFLAGS
CXXFLAGS_OPTIM_TBB
LDFLAGS_TBB
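As a rough illustration of that setup, a cross-build from macOS to Linux might start from something like the following. Only `TARGET_OS` and `CXX` are filled in; the `Linux` value, the target triple, and the use of `zig c++` as the compiler are assumptions based on the discussion, and the other listed variables are left out because the values used were not given.

```shell
#!/bin/sh
# Sketch only: these values are assumptions, not the settings used with the patch.
export TARGET_OS=Linux                           # OS that will run the compiled program
export CXX="zig c++ -target x86_64-linux-gnu"    # zig's clang-compatible C++ driver
# CXXFLAGS_PROGRAM, CXXFLAGS_LANG, CXXFLAGS, CXXFLAGS_OPTIM_TBB and LDFLAGS_TBB
# also need toolchain-specific values, which the issue does not spell out.
echo "Cross-compiling for $TARGET_OS with: $CXX"
```

The same assignments could instead be passed on the make command line or placed in a local make include file.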
Current Version:
v2.26.1