Closed dbuenzli closed 4 years ago
One of the good aspects of the new configuration system is that by definition it mostly forces the executable of all build actions to be redefinable through the command line at configuration time. Builtin configuration keys for specifying build/host OS and architecture are also available for use.
This doesn't seem to be good enough since we have a single key for these executables. In practice we may need two: part of the build system may need the build platform OCaml toolchain because, for example, the build system has OCaml programs that generate code and need to be run on the build platform. See for example @whitequark's port of tgls here.
The hack there is that instead of the single-shot ocamlbuild invocation used in a regular build, he first builds the program that generates the bindings, which will be invoked by the build system itself. This first build uses the build-os toolchain. It then proceeds by invoking the regular package build procedure (in which ocamlbuild will not rebuild the program that generates the bindings but will have it at hand to generate them) using the host-os toolchain through an environment variable and ocamlfind.
I would rather avoid having to use ocamlfind (which assemblage mainly sees as a source of information to build command line fragments, see #146) and environment variables for this. We should try to devise a scheme that allows specifying build-os and host-os values for utilities (both utilities run on the build-os but may have different outputs). By enriching or interpreting the usage type for a part we could then automatically select the right executable to invoke for the task at hand in the part's actions.
Then there is the issue of having build-os packages and host-os packages. Since opam doesn't allow installing more than one version of a package, I think build-os packages could live in another switch and the build-os ocamlfind would simply direct you to the paths there. Finally there's the issue of actually getting this information from the build environment, most of the time represented by opam. It should be noted that this would e.g. double all the variables like native, native-dynlink etc. So this should definitely be designed hand in hand with a good cross compilation story for opam (@Altgr, @avsm, @samoht).
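To make the "doubling" concrete, here is a small sketch under the assumption that each platform gets its own copy of variables such as native and native-dynlink. The record and field names are illustrative only; they are not taken from opam or assemblage.

```ocaml
(* Hypothetical: one copy of the variables per platform. *)
type platform_conf = { native : bool; native_dynlink : bool }

(* The whole configuration carries the variables twice. *)
type conf = { build_os : platform_conf; host_os : platform_conf }

(* When not cross compiling, both platforms share the same values,
   so users who don't care only ever provide one set. *)
let same_platform p = { build_os = p; host_os = p }

(* Native code for the host OS is possible only if the host-os
   toolchain supports it, whatever the build OS can do. *)
let can_build_native_for_host c = c.host_os.native
```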
It seems that when you start with the build/host os distinction, it creeps everywhere, so we should be sure to make it easy for users not to really have to care about that and that the correct things® are being done without them needing to be too aware about the details except correct part usage tagging.
Brain dumping a bit here.
Also do I have my terminology right? It seems that autoconf uses host for what I call target.
UPDATE changed terminology in the above posts.
Then there is the issue of having host package and target packages since opam doesn't allow to install more than one version of a package, I think host packages could live in another switch and the host ocamlfind would simply direct you to the paths there. Finally there's the issue of actually getting this information from the build environment, most of the time represented by opam. It should be noted that this would e.g. double all the variables like native, native-dynlink etc. So this should definitively be designed in hand with a good cross compilation story for opam (@Altgr, @avsm, @samoht).
I think it's a good idea to rely/extend on opam switches to deal with cross-compilation in general.
@samoht opam switches don't really work for cross-compilation. For example, you will often want to run an identical version of the package on the build and host system.
@dbuenzli autoconf uses "build" for what you use "host", and "host" for what you use "target".
@samoht opam switches don't really work for cross-compilation. For example, you will often want to run an identical version of the package on the build and host system.
Didn't get that. What does running a package mean ? Do you have an example ?
@dbuenzli autoconf uses "build" for what you use "host", and "host" for what you use "target".
Yes. Do you think the proposed terminology is problematic? I'm not trying to be special. While their "build" is clear, I find their "host" less obvious than "target" (maybe because of the confusion with hosts in VMs). Should we maybe switch the terminology to build-os and target-os rather than host-os and target-os? OTOH build is a keyword that already appears everywhere in a build system, so it makes discussions less clear in my opinion. For example, we can talk about the host toolchain without this being ambiguous; build toolchain wouldn't be as obvious, you'd need to say the build-os toolchain.
No, "host toolchain" is still ambiguous because of https://en.wikipedia.org/wiki/Cross_compiler#Canadian_Cross. I suggest sticking to the autoconf terminology.
I mean, let's say, ppx_deriving or even sexplib. You have a build component (the ppx, or the camlp4 plugin) and a host component, which must have matching versions.
No, "host toolchain" is still ambiguous
Ok so I'll move to build-os and target-os. I'd rather avoid using host-os for what is now target-os; I think it will confuse users in general, especially say in a mirageos setting.
You have a build component (the ppx, or the camlp4 plugin) and a host component, which must have matching versions.
Damned. See why I hate pre-processors.
(target-os) No, this is actually worse than the previous variant. target means the system for which a toolchain emits code... which you really should get right when you're talking about cross-compiling.
(preprocessors) You do realize that some of your packages do effectively the same thing, right? For example, tgls...
(target-os) No, this is actually worse than the previous variant. target means the system for which a toolchain emits code... which you really should get right when you're talking about cross-compiling.
Ok so correct terminology shall be used and propagated (even though it hasn't clicked in my head yet).
You do realize that some of your packages do effectively the same thing, right? For example, tgls...
To be precise no, for now tgls generates code at distribution time, your version of tgls does that...
Except for the ones that use js_of_ocaml, none of my packages use pre-processors. A bunch of these do generate code at distribution time, but this is very different from pre-processing.
Besides, it's not that I will never use a pre-processor, but I still hate them and think that most of the time they are wrong solutions to real problems that should be solved at the language level by having meta-programming facilities as an integral part of the programming language itself.
Well, it's not like I want to (https://github.com/dbuenzli/tgls/issues/12), but point taken.
I don't disagree, but getting rid of all preprocessors is an unrealistic goal. Even if we fix OCaml completely, there are also e.g. packages which invoke protoc or similar tools.
I don't disagree, but getting rid of all preprocessors is an unrealistic goal.
Sure.
(target-os) No, this is actually worse than the previous variant. target means the system for which a toolchain emits code... which you really should get right when you're talking about cross-compiling.
So it seems that the whole OCaml toolchain is using the wrong terminology e.g. here and the host and target fields of ocamlc -config. Should we really use a different terminology from the one of the OCaml compilers?
Let's reiterate:
--build: the architecture of the build machine
--host: the architecture that you want the compiler output to run on
--target: the architecture for which the compiler will generate code
It seems like OCaml is using the terms correctly here; it assumes though that build and host are always the same. There actually should be no changes related to semantics of target as it already does what it should; the cross-compiling-related changes will only decouple build from host.
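As a small illustration of the three terms, the sketch below models them as a record and derives two independent properties from it. The type and function names, and the platform strings, are made up for the example; nothing here is read from `ocamlc -config`.

```ocaml
(* Illustrative only: a description of a compiler in autoconf terms. *)
type triple = {
  build : string;   (* machine on which the compiler is built *)
  host : string;    (* machine on which the compiler runs *)
  target : string;  (* machine for which the compiler emits code *)
}

(* The compiler emits code for a machine other than the one it runs on. *)
let is_cross_compiler t = t.host <> t.target

(* The compiler itself was built on a machine other than the one it
   runs on. This is the case that requires decoupling build from host. *)
let is_cross_built t = t.build <> t.host

(* Today's usual situation: all three coincide. *)
let usual = { build = "x86_64-linux"; host = "x86_64-linux"; target = "x86_64-linux" }

(* A cross compiler running on the build machine. *)
let cross = { usual with target = "arm-linux" }
```

The two predicates are independent, which is why naming a compiler by a single platform ("host compiler") does not pin it down.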
Ok then so most of the time host is going to be equal to target I guess. But then isn't the terminology wrong in that PR (or maybe I'm just confused)?
Yes, right now almost all OCaml builds have host equal to target equal to build.
Yes, except it's worse than that: "host compiler" is not something that has a precise meaning. You have to list both host and target to meaningfully describe a compiler (whereas build is just an environment detail).
Ok thanks.
For example, you will often want to run an identical version of the package on the build and host system. [...] You have a build component (the ppx, or the camlp4 plugin) and a host component, which must have matching versions.
@whitequark This seems like an artefact of broken build systems. Formally, in the package the build components should be compiled with both the build-os and host-os toolchains (the latter if you want to be able to do binary distributions of packages) and the host components should only be compiled with the host-os toolchain. So in a better build world this is not a real argument, though of course it's easy to make the point that this world doesn't exist. Would you see another argument for the need of build-os and host-os package version sync?
(I'm still convinced that one switch per architecture is a bad idea if you want to scale, but I'm trying to write a proposal for opam multiarch support in a single switch and found out that the "need same version" of the package in the build-os and host-os doesn't seem to hold water).
The fundamental point here is that the toolchain targeting build-os must also be available to the packages being built in the switch targeting host-os. It is not particularly important how that happens.
Additionally, this is as much a problem with build systems as installation; you need to build (some parts of) a package using a compiler targeting build-os and then install them into a place where the host-os toolchain would expect to find them. By far the easiest way to achieve this is the same-version requirement; anything else would require a huge amount of work for benefit that is unclear to me.
Given the current state of the OCaml toolchain I think we have that mostly right.
One of the good aspects of the new configuration system is that by definition it mostly forces the executable of all build actions to be redefinable through the command line at configuration time. Builtin configuration keys for specifying build/host OS and architecture are also available for use.
Assemblage should also be careful about its use of the OCaml toolchain for its own purposes. Since we are using compiler libs this shouldn't be much of a problem. There was however an ocamlfind use for auto-loading the assemblage library, which can now be overridden by the ASSEMBLAGE_OCAMLFIND environment variable (not to be confused with the ocamlfind configuration key used to look up the project's package dependencies). That variable could be removed if we move to a dynlink setting for assemblage rather than toploop.
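For illustration, an override of this kind can be sketched as below. This is not assemblage's actual code: the function name is hypothetical, and it assumes that the variable holds the name or path of the executable and that the fallback is plain `ocamlfind`.

```ocaml
(* Sketch under stated assumptions; not taken from assemblage's sources.
   [lookup] defaults to the process environment and can be replaced,
   which keeps the function easy to exercise without touching it. *)
let ocamlfind_for_loading ?(lookup = Sys.getenv_opt) () =
  match lookup "ASSEMBLAGE_OCAMLFIND" with
  | Some exe when exe <> "" -> exe   (* explicit override *)
  | Some _ | None -> "ocamlfind"     (* unset or empty: default *)
```

Calling `ocamlfind_for_loading ()` consults the real environment; passing `~lookup` substitutes another source for the variable.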