jordwalke / esy-issues


Environment for cross compiling packages #88

Open wokalski opened 7 years ago

wokalski commented 7 years ago

These are somewhat messy notes about cross compiling. Please note that I'm not fully confident about anything that's written below. Cross compiling in OCaml is a complex issue for several reasons.

  1. opam packages have arbitrary, platform dependent build scripts, usually written to work only on the host.
  2. You need to use different compiler versions for different platforms and architectures.
  3. You need to manage build artefacts for different platforms.

What's more, the details also differ depending on the host platform. The most obvious example that comes to mind is fat binaries. On Mac we have wonderful support for them, but they don't exist on other platforms. So in the Mac case, handling build artefacts could be a bit easier. However, we cannot make any assumptions about platforms, and the intricacies of targets make this task even harder.

Please note that I'm going to use the word target (instead of architecture) because the product of compilation might be a fat binary containing slices for multiple architectures.

Workflow

For cross compilation we are going to need an additional argument to esy build. What it does depends on the implementation, but the task sits at the intersection of package management, sandboxing, and building. Both build and runtime dependencies differ depending on the target.

Building opam packages

Existing packages use arbitrary commands as the build phase. I've skimmed through some packages and they use a variety of tools. I don't want to claim which ones are (not) popular, but I have seen the following:

Those tools in turn use ocaml*, C compiler, and probably other tools I'm not aware of.

What I've written below is hard for me to believe, since OCaml is highly portable. Therefore take it with a grain of salt.

Populating env vars

If we assume that most packages use only certain tools, let's say the OCaml toolchain and C compilers, we might try to create the right environment for a target. I have very low confidence in it though. I skimmed over opam-cross-ios and there are packages which need certain environment vars, but most of them also require changes to build commands.

To sum up, it's going to be hard to make it work. I would be very interested to hear an opinion from someone with a good overview. If the premise that all those tools use a similar setup behind the scenes is true, building a shell around ocaml* and the C compiler might be an interesting solution. The scripts would forward commands from the build to concrete implementations, with some adjustments to parameters and with the right environment.
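A minimal sketch of that forwarding idea, assuming a per-target table of tool paths, extra flags, and environment variables. Everything in `TARGETS` (the paths, the `-arch` flag, the `ESY_TARGET` variable) is hypothetical and only there to illustrate the shape; none of it comes from esy or opam-cross-ios.

```python
import os

# Hypothetical per-target settings: paths, flags and ESY_TARGET are
# illustrative assumptions, not real esy or opam-cross-ios configuration.
TARGETS = {
    "ios-arm64": {
        "tools": {
            "ocamlopt": "/opt/cross/ios-arm64/bin/ocamlopt",
            "cc": "/opt/cross/ios-arm64/bin/clang",
        },
        "extra_args": {"cc": ["-arch", "arm64"]},
        "env": {"ESY_TARGET": "ios-arm64"},
    },
}


def forward(tool, args, target, base_env):
    """Return (argv, env) for the concrete tool that should run instead of `tool`."""
    cfg = TARGETS[target]
    # Tools without a target-specific implementation pass through unchanged.
    real_tool = cfg["tools"].get(tool, tool)
    argv = [real_tool] + cfg["extra_args"].get(tool, []) + list(args)
    # Copy the caller's environment and layer the target's variables on top.
    env = dict(base_env)
    env.update(cfg["env"])
    return argv, env


if __name__ == "__main__":
    argv, env = forward("cc", ["-c", "stubs.c"], "ios-arm64", os.environ)
    print(argv)
    # A real wrapper would now run the command, e.g. subprocess.run(argv, env=env).
```

This only covers the part where a build script calls a tool by name. As noted above, packages that need their build commands changed would not be fixed by a wrapper like this.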

Package overrides

Hand written package overrides are a logical solution to those problems. opam-cross-ios is a working example. The changes to the packages usually consist of tweaks to build commands and sometimes patches to enable/disable target specific configs (in the iOS case these are usually shared libraries). The pain here is that there's no easy way: patching a package for a target is time consuming and not very beginner friendly.
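A rough sketch of how per-target overrides could be resolved, assuming each package is described by a build command list and a patch list. The package names, field names, commands, and patch file below are all made up for illustration and do not reflect the actual opam-cross-ios or esy formats.

```python
# Hypothetical base package descriptions (field names are assumptions).
BASE = {
    "pkg-a": {"build": ["./configure", "make"], "patches": []},
    "pkg-b": {"build": ["make"], "patches": []},
}

# Hypothetical hand-written overrides, keyed by target and then by package.
# An override replaces only the fields it mentions.
OVERRIDES = {
    "ios": {
        "pkg-a": {
            "build": ["./configure --host=TARGET_TRIPLE", "make"],
            "patches": ["disable-shared-libs.patch"],
        },
    },
}


def resolve(name, target):
    """Return the description to build `name` for `target`.

    Packages without an override for the target are used unchanged.
    """
    merged = dict(BASE[name])
    merged.update(OVERRIDES.get(target, {}).get(name, {}))
    return merged


if __name__ == "__main__":
    print(resolve("pkg-a", "ios"))
    print(resolve("pkg-b", "ios"))
```

The lookup itself is trivial. The cost is in writing and maintaining each entry of `OVERRIDES` by hand, which is the pain described above.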

Emulation

@jordwalke suggested emulation as a possible solution. Technically it could be possible. However, the cost of an emulator, which is (afaik) very slow, eliminates the merits of native. We might as well compile it down to js and run it in V8.

Structured build tool

Provided that what I've written above is true, another solution might be making esy work with one particular build tool. The Reason toolchain is great because it's opinionated. What I feel is needed for cross compilation support is something more predictable than potentially as many ways to build something as there are packages. Of course, reinventing the wheel is the easiest thing to do, so treat this as a frivolous thought. @bsansouci started work on bsb native, which might be a fit.


There's a whole new category of problems and necessary consideration if we go beyond OCaml, but it's out of scope for this issue.

Again, it's hard for me to believe that things are actually the way I described them, so I will be delighted if I'm proven wrong.

All questions and thoughts are more than welcome.

wokalski commented 7 years ago

I realise that this write up is missing detail but the high level problems made it hard to make it super concrete.

jordwalke commented 7 years ago

Thank you for the research. I will comment with some thoughts:

@jordwalke suggested emulation as a possible solution. Technically it could be possible. However, the cost of emulator, which is (afaik) very slow, eliminates the merits of native. We might as well compile it down to js and run in V8.

I would not suggest emulation for the actual resulting artifacts, but rather just for running the build which generates the final artifacts. We would then run those artifacts natively. This would be like running esy build on your iPhone to produce an ARM/iOS executable. Except we don't actually want to do that, so perhaps instead we can run it inside of an emulator on our powerful desktop computers, just to get that resulting ARM binary which can run on device at full speed. What it effectively does is simplify all the problems that occur when runtime architecture/platform !== build time architecture/platform. I don't know if this is viable, do you?

jordwalke commented 7 years ago

Populating env vars

One approach I had suggested in the past is to create a wrapper of the ocaml package, with a dummy ocamlopt which looks for an env var $arch and uses it to switch between multiple implementations. For this we need a different kind of environment variable propagation (downward). @wokalski pointed out that this might be handy, but not enough, since packages sometimes need to be forked for platforms, and we need to say "when building for archX, use this package instead". So one solution for overriding packages per architecture is the proposed "sandbox package overrides". My original idea of env var based switching is too simplistic.
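For illustration, a sketch of what the dummy ocamlopt's selection step might look like. The `arch` variable name follows the description above, while the implementation paths and the `"host"` default are assumptions made for this example.

```python
import os

# Hypothetical install locations of the real per-architecture compilers.
IMPLEMENTATIONS = {
    "host": "/usr/local/bin/ocamlopt",
    "arm64": "/opt/cross/arm64/bin/ocamlopt",
}


def pick_ocamlopt(env):
    """Choose the real ocamlopt from the `arch` variable, defaulting to the host."""
    arch = env.get("arch", "host")
    try:
        return IMPLEMENTATIONS[arch]
    except KeyError:
        raise SystemExit(f"no ocamlopt configured for arch={arch!r}")


if __name__ == "__main__":
    # A real dummy ocamlopt would hand over to the chosen binary here,
    # for example with os.execv(path, [path] + sys.argv[1:]).
    print(pick_ocamlopt(os.environ))
```

This picks a compiler but cannot swap a forked package in for a given architecture, which is why it is not enough on its own.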

wokalski commented 7 years ago

@jordwalke re emulation. Yes, it does simplify a bit. I can see three potential issues:

  1. The emulator needs to be able to emulate platform X.
  2. All dependencies of the whole build system need to be compatible with emulated host
  3. Forking/patching might still be needed when a package makes assumptions about the host, like shared libs on iOS in the example.

This is a very interesting solution in general, but it's not interesting in my context. By the time I make Apple ARM + iOS emulate on qemu (or something else) I would've forked all opam packages there are 😛.

wokalski commented 7 years ago

https://github.com/cgreenhalgh/ocaml-crosscompiling
https://github.com/ocaml/opam/issues/1536
https://github.com/ocaml/opam/issues/2476

jordwalke commented 7 years ago

I think #2 is the most challenging:

All dependencies of the whole build system need to be compatible with emulated host

Some dependencies are really just build time dependencies (like Make for example), and might not work on the emulated architecture - yet with the full emulation approach, you instantly make it so you cannot use them.

I think we're on the same page, and we both see that having the build process allow platform/target specific package overrides is the only way to go if we really want this to be seamless - esy build and you're done.

wokalski commented 7 years ago

Yes, I think so. If a package works without changes (from what I've read above, that's possible for some packages), great; if not, override it with a platform specific package.