ocaml / opam

opam is a source-based package manager. It supports multiple simultaneous compiler installations, flexible package constraints, and a Git-friendly development workflow.

https://opam.ocaml.org


cross-compilation between switches #1536

Open samoht opened 10 years ago

samoht commented 10 years ago

I can't find the issue where we were discussing this before (so maybe the issue never actually existed and it was only by email), but at one point we wanted to be able to use the switches to do some kind of cross-compilation.

A possible use-case for that is https://github.com/ocaml/opam-repository/issues/2393

AltGr commented 9 years ago

there have been various ideas related to this, and they are starting to take shape, so I'll start writing some of them down.

a switch master may be configured with a "partner switch", with a label, say child. It should probably only exist within the master switch.
when loading the "master" switch, all packages of the partner switch are also added to the universe, properly qualified e.g. child#pkg.
the packages can then refer to dependencies from the other switch, and the installation is resolved as a whole.
variables from the other switch should also be available with proper qualification (master#lib)
installations are done with the environment defined as for the package's original switch, respecting the overall dependency order.
packages with other-switch dependencies could either be unavailable when that partner switch name isn't defined in the current setup, or better, default to the same switch.
that could result in normalised (build, host, target...) names for possible partner switches on the normal repositories, and, in time, cross-compilation aware packages properly qualifying all their dependencies, and referred to directories...
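To illustrate the qualified-package idea in the list above, here is a minimal sketch. It assumes a switch can be modelled as a plain name-to-version mapping; the function names and the fallback rule are hypothetical and do not reflect opam's actual solver.

```python
# Hypothetical model: a switch is a dict mapping package name -> version.

def qualified_universe(master_pkgs, partner_label, partner_pkgs):
    """Merge the partner switch into the master universe as 'label#pkg'."""
    universe = dict(master_pkgs)
    for name, version in partner_pkgs.items():
        universe[f"{partner_label}#{name}"] = version
    return universe

def resolve_dep(dep, universe, partner_defined):
    """Look up a dependency; when no partner switch is defined,
    a qualified name defaults to the same (current) switch."""
    if "#" in dep and not partner_defined:
        dep = dep.split("#", 1)[1]
    return dep if dep in universe else None

universe = qualified_universe({"ocamlfind": "1.5.5"}, "child", {"lwt": "2.4.6"})
print(sorted(universe))
```

With a partner switch configured, `child#lwt` resolves to the partner's package; without one, `child#ocamlfind` falls back to `ocamlfind` in the current switch.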

cc @lefessan, @dbuenzli, @UnixJunkie

How well would that fit the purpose? There would be somewhat large code changes needed in opam, but it seems doable; the tricky bit may be getting the initial switch setup.

dbuenzli commented 9 years ago

I think that as a first step, we could actually do something much simpler.

For me the goal with the support I'd like to have in assemblage is to have a system where the end-user should not really be concerned about cross compilation if they correctly tag the parts of their project as being outcomes for building the project (e.g. subtools needed to build the project) or for "running" the project (resulting libraries, binaries and tests). I would also like the resulting opam files (whose build: fields will be automatically derived) not to need tweaking to support cross-compilation, so that the same opam files (and hence repository) are used in cross-compilation and non cross-compilation settings.

For this I basically just need to be able to make a distinction for each build tool between a version for the build-os and one for the host-os. So in some sense I only need more configuration variables that should be correctly set and provided by the build environment, i.e. by opam.

For example a build-os path and a host-os path. But I don't think I need to be able to refer to packages in the other switch at the opam level, for example package flag resolution would occur by using the ocamlfind that is in my build-os path for build system tools and host-os path for the final compilation outcomes.

Now @whitequark mentioned that in some cases you need build-os and host-os packages to match versions (e.g. pre-processors which have support libraries) so it may be nice to be able to somehow link the switches at a certain point to avoid the burden of having to sync both switches manually.

I also think that in general it would be nice to eventually be able to precisely refer to configuration variables from packages from one or the other switch or specify different package version for one or the other switch but I suspect that the actual need for this may be marginal.

So the only thing I'm saying here is that already a lot can be simply done by exposing more variables distinguishing build and host os in $opam/$SWITCH/config/global-config.config, and this wouldn't need large code changes in opam (heck it can even be done manually) while providing build system developers with the right information in order to support cross-compilation scenarios in a better way. We should just find the right names so that we don't have to break compatibility in the future when more evolved mechanisms as you suggest above are introduced in the system.
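As a rough illustration of what "more variables" could give a build system, the sketch below picks a tool from either the build or the host side. The variable names (`build-bin`, `host-bin`) and the paths are assumptions made up for this example; the thread deliberately leaves the final names open.

```python
import posixpath

# Hypothetical variables, as they might be exposed by the switch's global config.
GLOBAL_CONFIG = {
    "build-os": "linux",
    "host-os": "android",
    "build-bin": "/home/user/.opam/4.02.1/bin",
    "host-bin": "/home/user/.opam/4.02.1/android/bin",
}

def tool_path(config, tool, role):
    """Return the path of `tool` for the given role ("build" or "host")."""
    if role not in ("build", "host"):
        raise ValueError(f"unknown role: {role}")
    return posixpath.join(config[f"{role}-bin"], tool)

# Build-system helpers run on the build OS; final outcomes use the host side.
print(tool_path(GLOBAL_CONFIG, "ocamlfind", "build"))
print(tool_path(GLOBAL_CONFIG, "ocamlfind", "host"))
```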

dbuenzli commented 9 years ago

@mato

dbuenzli commented 9 years ago

As I'm looking a little bit more into cross compilation right now I think that using multiple switches for solving cross compilation looks like a bad idea.

Rather I think that an opam switch should have a list of supported arch-os tuples in the switch. One of these tuples should be distinguished as being the build-os tuple. Installing a package in such a switch would compile it and install it for all the tuples (if a tuple is unsupported by a given package the package is deemed to be installed but its installation for the tuple is empty, we may need a notion of "is installed for that tuple" for solving constraints though).

Every arch-os tuple would define its own ".install" prefix for the packages to make it easy for build systems to target a tuple by simply defining an appropriate prefix. This would make the whole system much easier to understand and use. In particular, it would scale to an arbitrary number of cross-compiled platforms you may be interested in, and there's no need to keep switches synchronized or juggle them (which is quite heavy in practice).
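A minimal sketch of the per-tuple prefix idea, assuming a directory layout of `<switch>/<arch>-<os>` (the layout and function name are hypothetical). An unsupported tuple yields `None`, standing in for "installed, but with an empty installation for that tuple".

```python
import posixpath

def install_plan(switch_root, tuples, supported):
    """Map each arch-os tuple of the switch to its install prefix.

    `tuples` lists the (arch, os) pairs of the switch; `supported` is the
    set of pairs the package can actually be built for.
    """
    plan = {}
    for arch, os_name in tuples:
        key = f"{arch}-{os_name}"
        if (arch, os_name) in supported:
            plan[key] = posixpath.join(switch_root, key)
        else:
            plan[key] = None  # deemed installed, nothing to install
    return plan

switch_tuples = [("x86_64", "linux"), ("armv7", "android")]
print(install_plan("/home/user/.opam/4.02.1", switch_tuples, {("x86_64", "linux")}))
```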

Maybe even MirageOS backends could then also be supported at the arch-os level rather than polluting ocamlfind package name hierarchy with .{xen,unix} suffixes.

whitequark commented 9 years ago

Yes, I agree wholeheartedly. This is essentially how opam-android already works.

samoht commented 9 years ago

dbuenzli commented 9 years ago

Tried to make the ideas of my last comment a little more precise in this (incomplete) proposal which seems better to discuss on the opam-devel list.

whitequark commented 9 years ago

I hate to be on lots of mailing lists so, unfortunate as it may be, I'll post my comments here:

Debian multilib hierarchies: don't put too much thought into this. Linux multilib is not well designed and there is no particular good reason for that hierarchy. It's more or less gcc's leaked implementation detail.
Having os implicitly be host-os: strongly disagree. The three variables are already confusing enough and being able to explicitly spell out the distinction is important.
Dependency semantics: I think it is important to avoid building packages for every known host architecture. First, what happens when you add one? Do you instantly build all the packages you already had installed, and if that fails, atomically revert back? That sounds like a debugging nightmare. Second, it would just be confusing to have something like ethernet-mirage installed on Linux, and worse, it will break depopts that look for such an architecture-specific package. Third, it is not even hard to implement or understand: when you have a host arch ARCH enabled, a package pkg:ARCH that can be installed appears for every available package. There would be no interaction with the rest of opam after the initial seeding, based exclusively on the host-archs and arch-specific fields.
xlibexec: there is no good solution for that without substantial ocamlfind modification, and I don't even know offhand what would be a reasonable change to ocamlfind's semantics. In any way, do we even need xlibexec and xbin? Your proposal implies that the version of the package will have to match on all the architectures, so one could simply use build arch's bin or lib. This also means we get to reuse much more of existing buildsystems.
prefixed PKG:var: yes, definitely; at least because it is unambiguous and necessary for completeness. Though I cannot immediately come up with a use case, I'm sure some exist.
depext interaction: there is no good solution for this. Sometimes you can install a package for foreign architecture directly, e.g. an x86_64 system accommodates i386 packages. Sometimes a package would have a name prefixed, e.g. armel-linux-gnu-binutils. Sometimes a completely custom package name would be required, e.g. (hypothetically) if you deploy to certain microcontrollers, you would want to always install sdcc for whatever the build architecture it is.
notion of active architecture; pkg-config interaction: none of these are useful besides i386/x86_64 multilib, which is a very exceptional special case and is probably not worth introducing support for. Data point: it causes a great deal of pain to the clang team; I doubt implementing it properly in opam will be easy.
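The "initial seeding" from the dependency-semantics point above could look roughly like the following sketch. It assumes each package declares the set of architectures it supports (`None` meaning "any"); that representation and the function name are assumptions, not part of the proposal.

```python
def seed_arch_packages(packages, host_archs):
    """Return the package names visible to the solver.

    Unprefixed names are the build-architecture packages; for every enabled
    host architecture ARCH, an installable `pkg:ARCH` appears as well.
    """
    seeded = set(packages)
    for arch in host_archs:
        for name, archs in packages.items():
            if archs is None or arch in archs:
                seeded.add(f"{name}:{arch}")
    return seeded

packages = {"lwt": None, "ethernet-mirage": {"xen"}}
print(sorted(seed_arch_packages(packages, ["android"])))
```

Here `ethernet-mirage:android` never appears, so nothing is built for an architecture the package does not support, and each `pkg:ARCH` can be installed separately.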

Overall I like this proposal, and even in its most conservative form (mandatory installing of a package on all known architectures) opam-android would benefit from it. Though I strongly prefer being able to install whatever packages I want for every architecture separately.

One thing I would like to have is procedurally defined architectures. E.g. there are hundreds of Android architecture names, defined by the product of (Android API level × Android subarchitecture), e.g. (android-17 × armv5t3) or (android-21 × mipsel). The Android API level is a system ABI name proper, even though the C toolchain does not include it in the triple (which is a mistake IMO) and it belongs in the architecture tuple; binaries are not in practice backwards-compatible across Android API levels. And it is in fact common to build a project for several CPU architectures (arm, x86 and mips are the usual three) as well as several API levels, perhaps to take advantage of a newly introduced API, especially if you have native code (in Java you can just use reflection). Without being able to define these procedurally, I would have to resort to environment variables during build, which is both fragile and an extremely bad user experience for building for a large amount of architectures.
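As an illustration of what "procedurally defined" could mean here, the sketch below generates architecture names from the product of API levels and CPU subarchitectures instead of listing them by hand. The naming scheme `android-<level>-<subarch>` is an assumption for this example only.

```python
from itertools import product

def android_architectures(api_levels, subarchs):
    """Generate one architecture name per (API level, subarchitecture) pair."""
    return [f"android-{level}-{sub}" for level, sub in product(api_levels, subarchs)]

names = android_architectures([17, 21], ["arm", "x86", "mips"])
print(len(names), names[0], names[-1])
```

Two API levels and three CPU families already yield six architectures, which is why enumerating them manually, or passing them through environment variables, scales poorly.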

AltGr commented 9 years ago

Overall I like this proposal, and even in its most conservative form (mandatory installing of a package on all known architectures) opam-android would benefit from it. Though I strongly prefer being able to install whatever packages I want for every architecture separately.

See my mail answer... both would be handled in almost the same way for solving, and having unprefixed packages for build and prefixed for other arches could be a possibility: it's mostly a UI choice. Note that in your case, this is, from a different angle, partly the same as allowing multi-switch packages.

Prefixing packages also solves the problem of package variables: ARCHPKG:var makes more sense to me than PKG:ARCH-var.

One more thought: might be a bit fragile... but following my solution for solving installations, we could skip the arch-specific: field and ignore packages unavailable for a given architecture: if any package depends on them on the same architecture, it will still fail, but it would otherwise succeed.

dbuenzli commented 9 years ago

Having os implicitly be host-os: strongly disagree. The three variables are already confusing enough and being able to explicitly spell out the distinction is important.

The reason here is mainly for compatibility with the existing metadata and if you want to keep that, then the correct semantics to give to these variables is to give them host semantics. I don't mind forbidding them but it means that everybody has to think about cross.

Dependency semantics: I think it is important to avoid building packages for every known host architecture. First, what happens when you add one? Do you instantly build all the packages you already had installed, and if that fails, atomically revert back?

Debugging nightmare could be alleviated by having something akin to --keep-build-dir flag and some kind of --continue. That said it seems that you are not the only one who is concerned about this (@samoht, see the list).

Maybe the idea that all packages should be installed on each architecture is a bad one, but as I said on the list I think it's good that the notion of "this package is installed" remains the same whether my switch is multiarch or not. But if everybody's against it I won't fight against it --- it was also motivated by trying to make that simpler to implement for @AltGr, it means that a lot of things don't have to change e.g. the opam info display etc., but of course ease of implementation should not take over usability.

In any way, do we even need xlibexec and xbin? Your proposal implies that the version of the package will have to match on all the architectures, so one could simply use build arch's bin or lib. This also means we get to reuse much more of existing buildsystems.

xbin at least, the idea here is that the ocaml cross compilers would be put there and as mentioned in the proposal they get in your PATH first. This avoids the "prefix binaries by architecture" trick and thus allows reusing more existing build systems.

prefixed PKG:var: yes, definitely; at least because it is unambiguous and necessary for completeness. Though I cannot immediately come up with a use case, I'm sure some exist.

Well at least we need them for being able to specify package configuration variables which you definitely want to be able to specify per architecture.

One thing I would like to have is procedurally defined architectures.

Not sure exactly what you want here. What do you mean by define?

adrien-n commented 9 years ago

Getting here fairly late: I was told about this discussion only earlier today.

For me the goal with the support I'd like to have in assemblage is to have a system where the end-user should not really be concerned about cross compilation if they correctly tag the parts of their project as being outcomes for building the project (e.g. subtools needed to build the project) or for "running" the project (resulting libraries, binaries and tests).

The subject is slightly more complex however.

A common need is to build applications that will be used during the current build but also later on by other projects. In that case, the saner way is to keep things simple and ask for a native build to be available when you need to do a cross build. The only requirement for the build system is to be able to run the given tool from the system rather than locally (but still build it as it can be useful for the cross-compilation target too).

You might think about an approach with smarter tooling that will automatically build what is needed for the right architecture. As a heavy packager, I hate that because it is way more complex and doesn't even save CPU cycles.

Debian multilib hierarchies: don't put too much thought into this. Linux multilib is not well designed and there is no particular good reason for that hierarchy. It's more or less gcc's leaked implementation detail.

Are you talking about lib, lib32 and lib64 or the current multiarch from Debian, which sorts binaries under /usr/lib/$(target-triplet)? Debian's multiarch seems to be working well enough in practice and doesn't seem to be limited to a specific compiler or linker.

Having os implicitly be host-os: strongly disagree. The three variables are already confusing enough and being able to explicitly spell out the distinction is important.

The reason here is mainly for compatibility with the existing metadata and if you want to keep that, then the correct semantics to give to these variables is to give them host semantics. I don't mind forbidding them but it means that everybody has to think about cross.

Having that compatibility is basically sweeping issues under the rug. It is going to be difficult to have a feature work if you let its dev-users ignore that it even exists. Not letting them write broken packaging because the syntax and grammar don't allow it is even better.

notion of active architecture; pkg-config interaction: none of these are useful besides i386/x86_64 multilib, which is a very exceptional special case and is probably not worth introducing support for. Data point: it causes a great deal of pain to the clang team; I doubt implementing it properly in opam will be easy.

I don't understand how pkg-config interaction can be not useful; I rely on it a lot for Windows stuff. I also rely on it for cross-compilation across architectures with Linux.

Moreover it means handling two environment variables: PKG_CONFIG_LIBDIR and PKG_CONFIG_PATH. The first one needs to point to the .pc files for your current build target and the second one needs to be emptied. It's difficult to make something simpler.
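The two-variable handling described above can be sketched as follows: build an environment in which `PKG_CONFIG_LIBDIR` points at the target's `.pc` files and `PKG_CONFIG_PATH` is emptied. The helper name and the example directory are hypothetical; only the two environment variables come from pkg-config itself.

```python
def cross_pkg_config_env(base_env, target_pc_dir):
    """Return a copy of `base_env` set up for querying the target's .pc files."""
    env = dict(base_env)
    env["PKG_CONFIG_LIBDIR"] = target_pc_dir  # replaces the default search path
    env["PKG_CONFIG_PATH"] = ""               # drop any build-machine additions
    return env

env = cross_pkg_config_env(
    {"PATH": "/usr/bin", "PKG_CONFIG_PATH": "/usr/local/lib/pkgconfig"},
    "/opt/cross/x86_64-w64-mingw32/lib/pkgconfig",
)
print(env["PKG_CONFIG_LIBDIR"], repr(env["PKG_CONFIG_PATH"]))
```

The resulting dictionary can be passed as `env=` to `subprocess.run` when invoking `pkg-config` for the cross target.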

As for active architecture, I'm not sure: it's not something I would mind about a lot currently.

In any way, do we even need xlibexec and xbin? Your proposal implies that the version of the package will have to match on all the architectures, so one could simply use build arch's bin or lib. This also means we get to reuse much more of existing buildsystems.

xbin at least, the idea here is that the ocaml cross compilers would be put there and as mentioned in the proposal they get in your PATH first. This avoids the "prefix binaries by architecture" trick and thus allows reusing more existing build systems.

I would instead provide prefixed-binaries as the default. It's also flexible and doesn't put any requirement about ordering of components in $PATH which is quite brittle. And if needed it is always possible to add un-prefixed binaries in a given directory. It would be really detrimental to promote assumptions about $PATH in software.

As far as I'm concerned, I would use prefixed binaries and push the matter of choosing the right one down to ocamlfind.

For ocamlfind itself, I have been using environment variables to specify its configuration file quite happily. If I had a strong need to have a binary I could pass as CC, HOST_CC, TARGET_CC, or anything else, a shell script would do (simply re-invoke the regular executable but with $OCAMLFIND_CONF set).

whitequark commented 9 years ago

Debugging nightmare could be alleviated by having something akin to --keep-build-dir flag and some kind of --continue.

Handling that robustly would be just as complex as the solution I suggest, but the user experience is more contrived. And users will of course encounter errors, especially if there are any external to opam components involved at all (but even if not, system differences will take their toll).

One thing I would like to have is procedurally defined architectures.

Not sure exactly what you want here. What do you mean by define?

Essentially what I want is for setting up an architecture to be lightweight, both to the user and the packager, since targets such as Android would be most elegantly supported with very finely grained architecture lists.

Are you talking about lib, lib32 and lib64 or the current multiarch from Debian, which sorts binaries under /usr/lib/$(target-triplet)?

Current multiarch. It might seem to work well enough but it has caused untold amounts of pain to the clang team, which makes me wary of copying it.

Having that compatibility is basically sweeping issues under the rug.

Agree. Furthermore, I've realized, with the migration capabilities present in opam, we can deprecate unprefixed os, add a warning, and then some time after that forbid it entirely, all without breaking compatibility with either current or previous versions of opam. So I am now even more in favor of not ultimately having the alias.
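The three-step migration (alias, then warning, then rejection) could be modelled as in the sketch below. The stage names and the replacement variable `host-os` are assumptions for illustration; this is not opam's actual migration code.

```python
import warnings

STAGES = ("alias", "warn", "forbid")

def resolve_os_variable(name, stage):
    """Resolve a variable name according to the current migration stage."""
    if stage not in STAGES:
        raise ValueError(f"unknown stage: {stage}")
    if name != "os":
        return name  # prefixed variables are never affected
    if stage == "forbid":
        raise ValueError("unprefixed 'os' is no longer accepted; use 'host-os' or 'build-os'")
    if stage == "warn":
        warnings.warn("unprefixed 'os' is deprecated; use 'host-os'", DeprecationWarning)
    return "host-os"  # the alias keeps host semantics while it is still allowed

print(resolve_os_variable("os", "alias"))
print(resolve_os_variable("build-os", "forbid"))
```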

I don't understand how pkg-config interaction can be not useful

On reflection, scratch my earlier statement that it is not useful. What would even be the nature of such compatibility? I am not sure I can imagine what can opam do here.

I would instead provide prefixed-binaries as the default. It's also flexible and doesn't put any requirement about ordering of components in $PATH which is quite brittle. And if needed it is always possible to add un-prefixed binaries in a given directory.

I think all of this is beside the point. We already have a really great way of handling cross-compilation, that is, using ocamlfind -toolchain. It is completely agnostic of PATH and indeed any implicit lookup mechanism. The most I see opam doing is providing the same information as contained in ocamlfind toolchain files in some format that would be digestible by other buildsystems. Personally, I think using ocamlfind as a library is enough, but there may be valid objections to that.

If we go that path, it doesn't matter if the binaries are prefixed or not, and so the simplest solution could be used, which is to not change anything in the existing OCaml buildsystem and just pass -bindir.

dbuenzli commented 9 years ago

The subject is slightly more complex however.

A common need is to build applications that will be used during the current build but also later on by other projects. In that case, the saner way is to keep things simple and ask for a native build to be available when you need to do a cross build. The only requirement for the build system is to be able to run the given tool from the system rather than locally (but still build it as it can be useful for the cross-compilation target too).

It does seem more complex: can't make sense at all of what you say here. If you are afraid that some binary build artefacts will only be built for the build architecture while they could also be used for the host, what I described doesn't preclude to build both for the build and for the host architecture --- and that is the actual reason why a distinction is made between bin (host architecture binaries) and xbin (build architecture binaries) directories in the proposal.

Having that compatibility is basically sweeping issues under the rug. It is going to be difficult to have a feature work if you let its dev-users ignore that it even exists. Not letting them write broken packaging because the syntax and grammar don't allow it is even better.

Well you need to think with migration in mind. Unless you have the time to fix the 1000 opam packages in a night. I don't have a strong opinion about that and I'll leave the final judgement to opam's maintainers.

I would instead provide prefixed-binaries as the default. It's also flexible and doesn't put any requirement about ordering of components in $PATH which is quite brittle. And if needed it is always possible to add un-prefixed binaries in a given directory. It would be really detrimental to promote assumptions about $PATH in software.

I think that a good build system should take the values directly from build-bin and host-xbin as needed. What I would avoid though is that architectures "subswitches" start to write in the build architecture prefix, that's the reason for xbin, now if executables there should be prefixed by the architecture or not I don't really mind, except that if you do so it will for sure take much longer to migrate every package out there to cross.

As far as using ocamlfind is concerned, I think it's essential for now but we should also avoid having it too much in mind and see a little bit further (e.g. experimenting with this), we obviously have a little bit too many names and notions of packages in the whole eco-system which is both cumbersome and embarrassing.

@whitequark

Handling that robustly would be just as complex as the solution I suggest, but the user experience is more contrived. And users will of course encounter errors, especially if there are any external to opam components involved at all (but even if not, system differences will take their toll).

Yeah my assumption was maybe a little bit too simplistic, we should design so that build failures are not too annoying.

adrien-n commented 9 years ago

Sorry for the really long delay in replying, I got sucked in a work-induced time-vortex. At least that got me to dabble with Debian's multiarch.

Are you talking about lib, lib32 and lib64 or the current multiarch from Debian, which sorts binaries under /usr/lib/$(target-triplet)?

Current multiarch. It might seem to work well enough but it has caused untold amounts of pain to the clang team, which makes me wary of copying it.

I haven't experienced these troubles but I've heard about the transition and early results. Clearly, multiarch has caused a lot of issues. However it was also something ambitious and it seems to be working well and easily now. In any case, I consider it something to evaluate; at the very least to understand its issues in order to not replicate them.

I don't understand how pkg-config interaction can be not useful

On reflection, scratch my earlier statement that it is not useful. What would even be the nature of such compatibility? I am not sure I can imagine what can opam do here.

Querying the "right" pkg-config for C bindings is the only interaction I can think of. That's quite naturally linked to being able to use the right C toolchain and is probably something to handle in the same place.

I think all of this is beside the point. We already have a really great way of handling cross-compilation, that is, using ocamlfind -toolchain. It is completely agnostic of PATH and indeed any implicit lookup mechanism. The most I see opam doing is providing the same information as contained in ocamlfind toolchain files in some format that would be digestible by other buildsystems. Personally, I think using ocamlfind as a library is enough, but there may be valid objections to that.

Agreed.

If we go that path, it doesn't matter if the binaries are prefixed or not, and so the simplest solution could be used, which is to not change anything in the existing OCaml buildsystem and just pass -bindir. [...] Well you need to think with migration in mind. Unless you have the time to fix the 1000 opam packages in a night. I don't have a strong opinion about that and I'll leave the final judgement to opam's maintainers.

As far as I know, the compiler will create prefixed binaries. Distributions (at least Linux ones) also have policies in place to require prefixing so we will see it in any case.

An issue that I was recently reminded of is "universal binaries": how can we make them without having two toolchains visible at the same time? Moreover, if we take compatibility into account, some build systems (autotools for instance, but not only) expect prefixed binaries.

Note that I am assuming it will not be more work to add the prefix to calls to binaries.

I would prefer that prefixed binaries are provided by default and unprefixed symlinks are also put in place where you would have put the binaries themselves (i.e. visible or not). It doesn't force one use or another but should be simpler in the short and long term. Short-term it gives compatibility. Medium-term it allows migration. Long-term it makes it possible to remove the symlinks (you could also make the unprefixed binaries wrappers that warn and log their use).
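A minimal sketch of the "prefixed binaries plus unprefixed symlinks" layout, assuming binaries are named `<triplet>-<tool>` and live in a single directory (both assumptions for this example). Removing the symlinks later is then a one-step cleanup.

```python
import os

def add_unprefixed_symlinks(bindir, triplet):
    """For every `<triplet>-<tool>` in `bindir`, add a `<tool>` symlink next to it."""
    created = []
    marker = triplet + "-"
    for entry in sorted(os.listdir(bindir)):
        if not entry.startswith(marker):
            continue
        link = os.path.join(bindir, entry[len(marker):])
        if not os.path.lexists(link):
            os.symlink(entry, link)  # relative target within the same directory
            created.append(link)
    return created
```

Calling `add_unprefixed_symlinks(bindir, "x86_64-w64-mingw32")` on a directory containing `x86_64-w64-mingw32-ocamlfind` would add an `ocamlfind` link pointing to it, leaving the prefixed binary as the canonical name.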

To be clear: I don't see a migration in less than two years (and that's being generous) and I therefore definitely believe we also need something simple for quick and simple build fixes even if temporary (until build systems are updated).

It does seem more complex: can't make sense at all of what you say here. If you are afraid that some binary build artefacts will only be built for the build architecture while they could also be used for the host, what I described doesn't preclude to build both for the build and for the host architecture --- and that is the actual reason why a distinction is made between bin (host architecture binaries) and xbin (build architecture binaries) directories in the proposal.

This is a bit tangential to the current discussion and is really something more for Assemblage rather than opam.

Let me quote more finely (note the emphasis):

if they correctly tag the parts of their project as being outcomes for building the project (e.g. subtools needed to build the project) OR for "running" the project (resulting libraries, binaries and tests).

I am arguing about the "OR" which I understand as a "XOR" and therefore only two possible choices.

I believe that describing executables as "to run on $build" xor "to run on $host" plus "to install" xor "to not install" is insufficient and that you need something that expresses whether the tool will also be used as a development tool later on.

Let me illustrate my point. Take libgtk+2 as an example (all GUI toolkits do the same). It generates resources during its build for use at runtime. This resource generator (let's call it "resource-generator") is built as part of the build process of libgtk+2. Generating resources is also something that applications using GTK+2 libraries do and the tool has to be installed on the system.

Some developers prefer to always build "resource-generator" for $build and invoke the freshly-built binary when the build process needs that generator. I instead argue for always building "resource-generator" for $host and be able to either call the freshly-built one if $host = $build or one already installed on the system if not.
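The second approach (always build for $host, run the fresh binary only when $host = $build, otherwise use an installed one) can be sketched like this. The tool name "resource-generator" comes from the example above; the function name and the injectable `lookup` parameter are hypothetical.

```python
import shutil

def resource_generator_command(build, host, fresh_path, lookup=shutil.which):
    """Pick which resource-generator binary the build should invoke."""
    if build == host:
        return fresh_path  # the freshly built tool can run on this machine
    installed = lookup("resource-generator")
    if installed is None:
        raise RuntimeError(
            "cross build: a native resource-generator must already be installed"
        )
    return installed

print(resource_generator_command("x86_64-linux", "x86_64-linux", "./_build/resource-generator"))
```

For a cross build, the freshly built binary is still produced and installed for $host; only the one invoked during the build comes from the system.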

There are also people who make build systems build two versions at the same time: one for $build and one for $host. Obviously this makes things really complex (I would actually say "horrible"), especially when your tool depends on the library being built.

For one-time builders, the first approach is supposed to save some CPU, disk and bandwidth but these are ridiculously cheap.

I find the second approach really simpler to implement and to package. Moreover the "resource-generator" it provides for $host is particularly useful when cross-building development environment (which I do for Windows but that can apply to other systems as well like ARM boards running under Linux).

And I think Assemblage needs an easy (or easy enough) way to express that.

lefessan commented 9 years ago

I don't see the point of discussing this new proposal. It is less general, less powerful, and as complex to implement as the proposal for multi-switch constraints ( http://ocaml.org/meetings/ocaml/2014/ocaml2014_12.pdf ). Let me remind you what it is about:

With the multi-switch constraints, you can have two switches, one for the "build", one for the "host". Now, suppose you want to cross-compile some package A: A will say that it depends on host:B and build:C, meaning that OPAM will also need to cross-compile B in the host switch, and to just compile C in the build switch. A package does not need to be compiled twice if it is just needed in one of the two switches.

Pietro has started to implement multi-switch constraints. There are other changes needed: the "host" switch needs to be configured to use the "build" switch, and to provide any OPAM package in the "host" switch with the path to the "build" switch, so that they can find the commands they need. Any dependency that has no "host:" or "build:" prefix is supposed to be "build:" (i.e. no cross-compilation). In a switch without cross-compilation, build=switch, so dependencies towards "build" are just dependencies to the current switch.
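A small sketch of the prefix rule described above: dependencies are written as `host:B` or `build:C`, and an unprefixed dependency is treated as `build:`. The function names are hypothetical and the real constraint syntax may differ.

```python
def split_dependency(dep):
    """Split 'switch:pkg' into (switch, pkg); unprefixed means 'build'."""
    switch, sep, name = dep.partition(":")
    if not sep:
        return ("build", dep)
    if switch not in ("build", "host"):
        raise ValueError(f"unknown switch prefix: {switch}")
    return (switch, name)

def compilation_plan(deps):
    """Group dependencies by the switch in which they must be compiled."""
    plan = {"build": [], "host": []}
    for dep in deps:
        switch, name = split_dependency(dep)
        plan[switch].append(name)
    return plan

# Package A depends on host:B and build:C, plus an unprefixed D.
print(compilation_plan(["host:B", "build:C", "D"]))
```

Only B is cross-compiled in the host switch; C and D are compiled once, in the build switch.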

This is quite simple to implement, a few days of work. I would do it myself if I was not involved in too many projects. Moreover, it solves also other problems, that are not related to cross-compilation (like having multiple "coq" switches with Coq packages for different versions of Coq depending on a single "ocaml" switch where OCaml packages are installed).

dra27 commented 9 years ago

Briefly, as I keep being sucked into time-swallowing vortices too. Daniel's proposal has benefits for native-Windows OPAM in that I think there's a definite case for being able to operate all 4 (or possibly even all 6) Windows ports of OCaml in a single switch. The case for being able to have both x86_64 and i686 available in a single switch, regardless of OS, also strikes me as potentially more useful than having to maintain two switches to achieve that. Mindful of the dangers, seen in both Perl and Git, of having myriad methods for doing one thing, I also wonder if the two proposals have to be mutually exclusive...

One possibly useless thought on the complexity (because I haven't fully digested the proposal yet): would some of the worries about what happens when you add/remove an arch from a switch (failed packages, some packages on some archs, etc.) be eliminated by not permitting that? I.e. the available architectures are specified only when the switch is created, and changing them involves blowing away the switch. opam already has facilities to allow copying over the installed packages and so forth, and for a given switch, packages would only be available if they can be built for all archs.

dbuenzli commented 9 years ago

> I don't see the point of discussing this new proposal.

Right, it's always better not to discuss things.

> It is less general, less powerful, and as complex to implement as the proposal for multi-switch constraints

I would also add that it feels much less usable from an end user point of view. You are falling into the old design trap "more powerful is better".

I don't think that your proposal can scale to manage many cross-compiled platforms in a pleasant way for the end user. Besides, it is still rooted in the old view that compilers are associated to switches and thus fails to see the future of compilers as packages. It also needlessly changes the notion of opam switches, which are now treated as independent package universes and should remain so in my opinion, as this exposes a good and well understood isolation property and operational model for the end user. Finally, it glosses over many details on how you actually provide the build environment for enabling cross-compilation.

Also note that what it seems to propose is actually not far from what this proposes w.r.t. the solver, if we lift the notion from this proposal that every package should be installed at the same time for each architecture in the switch --- which, as we already discussed, seems a bad idea. It's just that package prefixes are not switches but architectures, and happen in a single switch.

AltGr commented 9 years ago

Well, at least everybody agrees on the core of the idea -- despite the disputes above, there are more similarities than differences between the two proposals. The rest are mostly layout and user interface "details" -- I tend to think that a more general engine is better, while the interface can be more convenient and easy to use if reasonably fenced.

Fabrice's proposal solves more of the package dependencies and solver query issues, while Daniel's one gives a good story on handling variables and packages configuration and build. After that, it's a matter of how one sets the cross-compilation switches, how cross-arch dependencies are handled, and how we handle package selection across them. These parts are still worth discussing IMHO.

lefessan commented 9 years ago

So, here is a more detailed specification of multi-switch constraints for cross-compilation. Comments welcome.

OPAM Cross-Compilation Support Specification

The basic idea behind cross-compilation support in OPAM is that cross-compiled packages should be able to depend on packages that are not cross-compiled, and, for that, on another switch than the current switch.

This is done by running:

opam switch 4.03.0-mingw64 --alias 4.03.0 --msc build=4.03.0

Here, OPAM will create the switch 4.03.0-mingw64, as an alias of 4.03.0, and configure it to use a Multi-Switch Constraint (MSC) that tells it that the build switch variable is the switch 4.03.0 (OPAM should fail if it does not currently exist). Note that if you don't provide a value for build, then it defaults to the current switch, as any other such undefined switch variable.

All packages in the new switch will be compiled in the following environment:

- All dependencies towards packages starting with build: will be resolved as belonging to the corresponding switch. For example, if some package depends on "build:lwt", then OPAM will add a dependency towards package lwt in the 4.03.0 switch. A package can define a dependency that will only exist if a multi-switch constraint exists:

  depends: [
     "build:lwt" { msc-build && >= "2.3" }
     "lwt" { ! msc-build && >= "2.4" }
     "build:async" { msc-build && msc-build-ocaml-version >= "4.04.0" }
  ]

As shown in this example, constraints can also be put on the OCaml version in the other switch.
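As a rough illustration of how the conditional dependencies above might be interpreted, here is a minimal Python sketch. It is not opam code: the `Dep` record, the `active_deps` function, and the modelling of filters as a simple flag are assumptions made for this example. Only the `build:` prefix, the `msc-build` condition, and the rule that an undefined switch variable defaults to the current switch come from the specification. The OCaml-version constraint on the third dependency is left out for brevity.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Dep:
    """Hypothetical model of one `depends:` entry (not an opam data structure)."""
    name: str                        # "build:lwt" or plain "lwt"
    needs_msc: Optional[bool]        # True: msc-build, False: !msc-build, None: unconditional
    min_version: Optional[str] = None


def active_deps(deps: List[Dep], current_switch: str,
                msc: Dict[str, str]) -> List[Tuple[str, str, Optional[str]]]:
    """Return (switch, package, minimum version) for every dependency that applies."""
    msc_build = "build" in msc
    resolved = []
    for dep in deps:
        if dep.needs_msc is not None and dep.needs_msc != msc_build:
            continue  # the msc-build filter rules this dependency out
        prefix, sep, pkg = dep.name.partition(":")
        if sep:
            # Qualified name: an undefined switch variable defaults to the current switch.
            switch = msc.get(prefix, current_switch)
        else:
            switch, pkg = current_switch, dep.name
        resolved.append((switch, pkg, dep.min_version))
    return resolved


# The first two dependencies from the example above.
DEPS = [Dep("build:lwt", True, "2.3"), Dep("lwt", False, "2.4")]

# Cross-compiling switch configured with --msc build=4.03.0.
cross = active_deps(DEPS, "4.03.0-mingw64", {"build": "4.03.0"})
# Ordinary switch without any multi-switch constraint.
native = active_deps(DEPS, "4.03.0", {})
```

With a `build` constraint defined, only the `build:lwt` entry survives and it points at the 4.03.0 switch; without one, only the plain `lwt` entry applies, in the current switch.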

- An OPAM_MSC_BUILD_NAME environment variable is set to 4.03.0
- An OPAM_MSC_BUILD_DIR environment variable is set to the directory for the "build" switch ($HOME/.opam/4.03.0 here by default)
- The following substitutions are available in package commands:
  - "%{msc-build:enable}%" becomes "enable" (would be "disable" otherwise)
  - "%{msc-build:dir}%" becomes "$HOME/.opam/4.03.0" (would be "$HOME/.opam/4.03.0-mingw64" otherwise)
  - "%{msc-build:name}%" becomes "4.03.0" (would be "4.03.0-mingw64" otherwise)

It is the work of package maintainers/developers to use this information to correctly build their packages in that environment.

Packages should never try to modify the contents of other switches.
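To make the proposed substitutions concrete, here is a small Python sketch of how a package command could be expanded. It is only an assumption about the behaviour described in the specification: the `expand` function, its parameters, and the `./configure` flags in the example are hypothetical and not part of opam.

```python
import re
from typing import Optional


def expand(command: str, current_switch: str, root: str,
           build_switch: Optional[str] = None) -> str:
    """Expand the three proposed %{msc-build:...}% substitutions in a command."""
    msc_enabled = build_switch is not None
    # Without a multi-switch constraint, "build" falls back to the current switch.
    name = build_switch if msc_enabled else current_switch
    values = {
        "msc-build:enable": "enable" if msc_enabled else "disable",
        "msc-build:name": name,
        "msc-build:dir": f"{root}/{name}",
    }
    # Unknown variables are left untouched rather than guessed at.
    return re.sub(r"%\{([^}]+)\}%",
                  lambda m: values.get(m.group(1), m.group(0)),
                  command)


# Hypothetical configure invocation of a package in the mingw64 switch.
CMD = "./configure --%{msc-build:enable}%-cross --ocaml=%{msc-build:dir}%/bin/ocaml"
cross_cmd = expand(CMD, "4.03.0-mingw64", "/home/user/.opam", build_switch="4.03.0")
plain_cmd = expand(CMD, "4.03.0-mingw64", "/home/user/.opam")
```

The same opam command line would then yield `--enable-cross` with the 4.03.0 directory when a build switch is configured, and `--disable-cross` with the switch's own directory otherwise.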

whitequark commented 9 years ago

I have a weak preference towards this proposal because I already know it will work for me, with one exception. I would like to upstream cross-compilation instructions, and there should be some way to make the same definition work both when cross-compiling or not. Perhaps an extension of the opam language.

dbuenzli commented 9 years ago

@lefessan > Here, OPAM will create the switch 4.03.0-mingw64, as an alias of 4.03.0, and configure it to use a Multi-Switch Constraint (MSC) that

Could you clarify what happens on opam install and opam remove in such a switch and in the other?

@whitequark > I have a weak preference towards this proposal because I already know it will work for me, with one exception.

I think that maintaining one switch per architecture you want to compile to is going to be a huge pain in the ass and we'll end up adding commands to opam itself to be able to manage these switches in a convenient way which is suspicious.

Having switches where you have multiple architectures and with the constraint that in each architecture, if a package is installed, it is installed with the same version as in the other architectures may be less powerful but will solve 99% of the cases with a better user experience (the different version problem could be solved once opam allows installing multiple versions of a package). The only caveat of this however is that indeed you will need to install any package you want on host on build as well. However for me it is the only way to solve this:

> and there should be some way to make the same definition work both when cross-compiling or not.

without having to bother everyone out there with cross.

Also, @lefessan's proposal does not cleanly separate host binaries and build binaries, which I consider important for deployment reasons. Though this could certainly be added to his proposal as well.

whitequark commented 9 years ago

@dbuenzli All of this is true. Note how I did not say "I think it is a good idea", only that I prefer it because I'm lazy.

lefessan commented 9 years ago

@dbuenzli > Could you clarify what happens on opam install and opam remove

opam install pkg will install pkg in the current switch, and its dependencies in the current switch and in the other switches. opam remove lwt will remove lwt from the current switch, along with all the packages that depend on it, whether they are in the current switch or in other switches that depend on that switch. For that, the solver receives a universe containing not only the current switch's packages, but all the packages for all switches. This is already implemented in Pietro Abate's version.
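The removal behaviour described here amounts to a reverse-dependency closure over packages qualified by their switch. The following Python sketch illustrates that idea only; it is not how opam or Pietro Abate's branch implements it (those hand the whole universe to a solver), and the `removal_set` function and the package graph are hypothetical.

```python
from collections import deque
from typing import Dict, Set, Tuple

Node = Tuple[str, str]  # (switch, package)


def removal_set(depends: Dict[Node, Set[Node]], target: Node) -> Set[Node]:
    """Return `target` plus everything that transitively depends on it, in any switch."""
    # Invert the dependency edges once so dependents can be looked up directly.
    dependents: Dict[Node, Set[Node]] = {}
    for node, requirements in depends.items():
        for requirement in requirements:
            dependents.setdefault(requirement, set()).add(node)

    to_remove = {target}
    queue = deque([target])
    while queue:
        node = queue.popleft()
        for dependent in dependents.get(node, ()):
            if dependent not in to_remove:
                to_remove.add(dependent)
                queue.append(dependent)
    return to_remove


# Hypothetical installed universe spanning a build switch and a cross switch.
INSTALLED: Dict[Node, Set[Node]] = {
    ("4.03.0", "lwt"): set(),
    ("4.03.0", "ocamlfind"): set(),
    ("4.03.0-mingw64", "lwt"): set(),
    ("4.03.0-mingw64", "cohttp"): {("4.03.0-mingw64", "lwt"), ("4.03.0", "ocamlfind")},
    ("4.03.0-mingw64", "app"): {("4.03.0", "lwt")},
}

removed = removal_set(INSTALLED, ("4.03.0", "lwt"))
```

In this made-up universe, removing lwt from the 4.03.0 switch also removes `app` from the mingw64 switch, because `app` declared a cross-switch dependency on it, while `cohttp` stays since it only uses the mingw64 copy of lwt.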

> opam itself to be able to manage these switches in a convenient way which is suspicious.

I don't see why. Here, it looks like what you want is to have some synchronisation for multiple switches (for cross-compilation, but maybe also for other reasons). We can imagine defining a "meta switch", containing multiple real switches:

opam switch 4.03.0-all --pack 4.03.0 4.03.0-mingw64 4.03.0-android
opam install --switch 4.03.0-all lwt.2.5

Note that Multi-Switch Constraints allow you to do that, since you can send the content of multiple switches to the solver in one command, in which case the solver will either find a solution, or abort the command.

> Also @lefessan's proposal, does not cleanly separate host binaries and build binaries

My proposal only gives the packages the information needed to do what they need for cross-compilation. It is the package maintainers' problem to use that information. For example, an OCaml cross-compiler might decide to store its own binary (on make install) in a different directory in the switch (i.e. xbin instead of bin), and packages would then, when cross-compiling, look for the cross-compiler in that exact location.

The catch is that someone could want to use such a setting to cross-compile a compiler, in which case the 4.03.0-mingw64 switch would have to contain two "ocaml compiler" packages, which is currently impossible. A solution would be for the cross-compiled compiler to have a different name from the cross-compiler.

dra27 commented 9 years ago

Sorry - my iPhone completely garbled that last reply!

> opam itself to be able to manage these switches in a convenient way which is suspicious.
>
> I don't see why. Here, it looks like what you want is to have some synchronisation for multiple switches (for cross-compilation, but maybe also for other reasons). We can imagine to define a "meta switch", containing multiple real switches:
>
> opam switch 4.03.0-all --pack 4.03.0 4.03.0-mingw64 4.03.0-android
> opam install --switch 4.03.0-all lwt.2.5

If I understand correctly, in the "meta switch", opam install would mean install lwt 2.5 to all the 3 'packed' switches. But what would effectively be installed in 4.03.0-all? Nothing?

dbuenzli commented 9 years ago

> opam install pkg will install pkg in the current switch, and any dependency in the switch and in the other switches. opam remove lwt will remove lwt from the current switch, and all the packages, in the current switch and other switches that depend on that switch, that depend on that package.

My concern with this is that for package developers it is much easier to cross-compile their package if they can simply assume the same package is present on both host and build with the same version. This also handles the case @adrien-n mentions, where you may want to compile a build binary both for the host and build platform.

If you are working on the assumption that both host and build packages are present, you can let the package's build system simply dispatch the use of one or the other according to metadata proper to the build system whenever host <> build. I highly suspect that requiring a different version of a package for build and host is never going to be useful. At least it can't be for now, since it means that your package can't install when build = host, as multiple versions of the same package are currently not supported.

So the raw effect on opam metadata with your proposal is that most packages (since most of them use OCaml in their build system) will have to declare duplicated build:* dependencies because they also need it for their build system. So most of them will need e.g:

build:ocaml {>= "4.01.0" }
ocaml {>= "4.01.0"}

> We can imagine to define a "meta switch", containing multiple real switches:

This would certainly make it more usable. However you also need to define what happens when you do an eval $(opam config env) and for this the meta switch should be able to distinguish a build switch. I would actually prefer if the meta switch would simply define a build for all the companion switches you mention on the line, which would eventually reduce to the non-msc proposal.

I think that with this the msc and "meta switch" has the potential of being less disruptive and easier to use for the end user command line wise. But we should also evaluate its impact on the usability of opam metadata as mentioned above. One thing you'd also miss w.r.t. the non-msc proposal is being able to refer to arbitrary other architectures rather than just the build one, but I personally have no use for that.

dbuenzli commented 9 years ago

@dra27> If I understand correctly, in the "meta switch", opam install would mean install lwt 2.5 to all the 3 'packed' switches. But what would effectively be installed in 4.03.0-all? Nothing?

Maybe "meta switch" is a bad name and we should rather see that as a "dispatching" switch that has no existence in itself, it just makes the obvious shell script loop for you.

lefessan commented 9 years ago

4.03.0-all would not be an existing switch, it is an alias to say "all the switches specified after --pack", so nothing would be installed there.

I imagine you could even specify the kind of synchronization between the switches: --sync strong|weak would provide two modes: in strong synchronization mode, OPAM would add constraints on the packages, so that every package version would conflict with any other version of the same package in all the switches of the meta switch, whereas in weak mode, the conflicts would be local, i.e. as they are currently. As a consequence, in strong mode, all the switches would have exactly the same versions of all packages, whereas in weak mode, only the packages installed by the user would have the same versions, but not their dependencies.
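The difference between the two modes can be sketched as a consistency check over the switches of a meta switch. This Python sketch is only an interpretation of the paragraph above: `sync_violations`, its arguments, and the package versions are hypothetical, and a real implementation would encode these rules as solver conflicts rather than check them after the fact.

```python
from typing import Dict, Set


def sync_violations(switches: Dict[str, Dict[str, str]],
                    user_installed: Set[str],
                    mode: str) -> Dict[str, Dict[str, str]]:
    """Return, per offending package, the differing versions found in each switch.

    strong: every package must have the same version wherever it is installed.
    weak:   only packages explicitly installed by the user must agree.
    """
    if mode not in ("strong", "weak"):
        raise ValueError(f"unknown sync mode: {mode}")
    all_packages = set().union(*(pkgs.keys() for pkgs in switches.values()))
    violations = {}
    for pkg in sorted(all_packages):
        if mode == "weak" and pkg not in user_installed:
            continue  # dependencies are allowed to differ in weak mode
        versions = {sw: pkgs[pkg] for sw, pkgs in switches.items() if pkg in pkgs}
        if len(set(versions.values())) > 1:
            violations[pkg] = versions
    return violations


# Hypothetical state of two switches packed into one meta switch.
STATE = {
    "4.03.0": {"lwt": "2.5", "ppx_tools": "5.0"},
    "4.03.0-mingw64": {"lwt": "2.5", "ppx_tools": "4.0"},
}

weak = sync_violations(STATE, {"lwt"}, "weak")
strong = sync_violations(STATE, {"lwt"}, "strong")
```

Here the user asked for lwt, which agrees everywhere, so weak mode accepts the state; strong mode rejects it because the dependency `ppx_tools` differs between the two switches.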

lefessan commented 9 years ago

> So the raw effect on opam metadata with your proposal is that most packages (since most of them use OCaml in their build system) will have to declare duplicated build:* dependencies because they also need it for their build system.

You probably want to have the same OCaml version in both the switches, but otherwise, most packages only depend on libraries from the "host" switch, and binaries from the "build" switch. For libraries, I don't see any need to depend on libraries from the "build" switch. You might, however, need to synchronize package versions between the "host" and "build" switch when they contain both binaries and libraries (i.e. the code generated by the binaries depends on the libraries at runtime, so they should have the same version). I suggest adding a self-depend: true flag for such packages in their opam file, to tell OPAM that it should introduce a constraint that, if the package is installed in the build switch, it should be installed in the host switch with exactly the same version.
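The suggested self-depend flag could be read as the following constraint, sketched here in Python. The function name, its arguments, and the package versions are hypothetical; the only thing taken from the suggestion is the rule that a flagged package installed in the build switch must be present in the host switch at exactly the same version.

```python
from typing import Dict, List, Optional, Set, Tuple


def self_depend_conflicts(build_pkgs: Dict[str, str],
                          host_pkgs: Dict[str, str],
                          self_depend: Set[str]) -> List[Tuple[str, str, Optional[str]]]:
    """List (package, build version, host version or None) for each broken constraint."""
    conflicts = []
    for pkg in sorted(self_depend):
        if pkg not in build_pkgs:
            continue  # the constraint only triggers when the build switch has the package
        host_version = host_pkgs.get(pkg)
        if host_version != build_pkgs[pkg]:
            conflicts.append((pkg, build_pkgs[pkg], host_version))
    return conflicts


# Hypothetical switches: menhir ships a binary and a runtime library, so it is flagged.
BUILD = {"menhir": "20151112", "lwt": "2.5"}
HOST = {"menhir": "20151005"}

conflicts = self_depend_conflicts(BUILD, HOST, {"menhir"})
```

Packages without the flag (lwt in this example) are free to be absent from the host switch or to differ in version.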

It would also be nice to introduce support for cross-compilation in ocamlfind and other popular tools. For example:

$OPAM_MSC_BUILD_DIR/bin/ocamlfind ocamlc -cross $OPAM_MSC_HOST_DIR/lib ...

could be used to access the packages available in the host switch, instead of the ones in the build switch, where it is probably located.

dbuenzli commented 9 years ago

> It would also be nice to introduce support for cross-compilation in ocamlfind and other popular tools.

This already exists through the ocamlfind -toolchain option, see e.g. how this is handled in the opam-android project.

AltGr commented 8 years ago

If anyone on this thread wants to reply to #2476, I'd be glad to have other opinions on it!