Open kwshi opened 5 years ago
FWIW, because of this prohibition, I end up using `cat` on several files to produce the file to give to dune.
My understanding is that the restriction on `#use` is on purpose, precisely to discourage building complex build systems based on OCaml syntax. OCaml syntax has several downsides, e.g., it doesn't work well with incremental builds (it needs to be re-evaluated in every build).
On the other hand, you can often use the `(include ...)` stanza instead of OCaml syntax to include generated dune files, and it is perfectly allowable to share code among the generators of such dune files. This works quite well in my experience.
In fact, you can see a good example of this technique in the dune codebase itself: the main test dune file https://github.com/ocaml/dune/blob/master/test/blackbox-tests/dune includes a generated dune file https://github.com/ocaml/dune/blob/master/test/blackbox-tests/dune.inc, and the generator https://github.com/ocaml/dune/blob/master/test/blackbox-tests/gen_tests.ml uses a library to generate sexp.
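As a rough illustration of that technique (not the actual `gen_tests.ml`), a generator might look like the sketch below. The `sexp` type, the `rule_for_case` helper, the `./main.exe` executable and the `.xml`/`.out` file names are all assumptions made for the example; a real generator would more likely use an S-expression library than this hand-rolled type.

```ocaml
(* gen.ml: hypothetical generator that prints dune rule stanzas to stdout. *)

(* A tiny S-expression type; a real generator might use a library instead. *)
type sexp =
  | Atom of string
  | List of sexp list

(* Render an S-expression on a single line. *)
let rec to_string = function
  | Atom s -> s
  | List items -> "(" ^ String.concat " " (List.map to_string items) ^ ")"

(* Build one rule that runs ./main.exe (assumed name) on <name>.xml and
   captures its stdout in <name>.out. *)
let rule_for_case name =
  let input = name ^ ".xml" and output = name ^ ".out" in
  List
    [ Atom "rule"
    ; List
        [ Atom "action"
        ; List
            [ Atom "with-stdout-to"
            ; Atom output
            ; List [ Atom "run"; Atom "./main.exe"; Atom ("%{dep:" ^ input ^ "}") ]
            ]
        ]
    ]

(* Print one rule per (hypothetical) test case. *)
let () =
  List.iter
    (fun name -> print_endline (to_string (rule_for_case name)))
    [ "a"; "b" ]
```

Because the helpers live in ordinary OCaml modules, several generators can share them through a normal dune library, with no need for `#use`.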
Hope that helps!
@nojb Thanks for the tips!
> On the other hand, you can often use the `(include ...)` stanza instead of OCaml syntax to include generated dune files, and it is perfectly allowable to share code among the generators of such dune files. This works quite well in my experience.
This is nice, but doesn't this require the generated file (the argument to `include`) to already exist before the dune file is read? My setup requires something like this:
dune build @test-xml
-> reads dune file
-> dune sees rule that calls `gen.ml` to generate `dune-gen`
-> dune calls `gen.ml` to generate `dune-gen`
-> `dune-gen` is imported as part of the dune file
You can see why that might not work: by the time `dune-gen` is produced, the dune file has already been read, and it's not going to be re-read to include the file! That is, the actual mechanics of `include` are
dune build @test-xml
-> dune sees `include dune-gen` stanza
-> dune imports `dune-gen`, which must already exist
The official docs also state "Currently, the included file cannot be generated and must be present in the source tree."
Of course, I can just manually call my generating script each time before running `dune build`, but that almost defeats the purpose of generating things to begin with!
The example you provided from `blackbox-tests` fits a different use case, in which it's testing a generated file against a pre-existing file; my use case is to use the generated dune file in the same file during the build process.
> Of course, I can just manually call my generating script each time before running `dune build`, but that almost defeats the purpose of generating things to begin with!
If one commits the generated file, then one can combine the `(include ...)` stanza with `(mode promote)` in the rule that generates the dune file in question to automatically keep it up-to-date. Does it make sense?
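A minimal sketch of that setup, assuming a generator `gen.ml` in the same directory and a committed include file named `dune.inc` (both names are made up for the example):

```dune
; Build the generator (assumed to live in gen.ml next to this dune file).
(executable
 (name gen)
 (modules gen))

; Regenerate dune.inc and copy the result back into the source tree.
(rule
 (mode promote)
 (action
  (with-stdout-to
   dune.inc
   (run ./gen.exe))))

; Read the committed copy of dune.inc from the source tree.
(include dune.inc)
```

Since the included file has to be present in the source tree, `dune.inc` would need to exist (possibly empty) before the first build.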
That seems to make sense. Though it seems that, if I do that, the generated `include` target will always be one "iteration" behind the call? I.e., if I do something like

1. create `dune-gen` (v1)
2. `dune build @test` with `include` and `promote`, etc.
3. `dune build @test` again

What happens (so it seems to me) is this:

1. `dune-gen` v1 is created
2. with `dune-gen` at v1, dune includes `dune-gen` v1, and then runs rule+promote to replace `dune-gen` with v2 but does not include v2
3. with `dune-gen` at v2, dune includes `dune-gen` v2, runs script to replace `dune-gen` with v3 but does not include v3

Consequently, I'd have to always run dune twice after each change to get the latest version: once to generate the latest copy, and once again to actually load it.
Furthermore, I'd argue that committing the generated file might not be the best idea: a rule of thumb that I follow is that handwritten files are committed, and auto-generated files are ignored (exceptions: version lockfiles, etc.). Namely, if both generated and generator files are committed, then there is room for inconsistency: what if a project newbie modifies the generated file accidentally, expecting things to change, only to get the file overwritten? What happens when different versions of the generator script and generated file are committed (namely, what does that mean for version control)? What if one forgets to `git add` one of the files when a change is made, so that there are two commits for a single change? These aren't critical drawbacks, of course, but they make this solution seem not quite as elegant as I would've hoped...
In general, adding too much staging to obtain the project description, and in particular global names such as library names, is not great, and that's one of the reasons why we provide a static DSL rather than an OCaml API. However, the case of tests comes back quite often, and it's not so important for Dune to know all the testsuite structure beforehand. I have a feeling that something better can be done here. Additionally, following recent work we have done on the core of Dune, it is now much easier to support running arbitrary programs to collect such rules.
In particular, it shouldn't be too hard to add support for `(include <file>)` where `<file>` is a generated file that is not committed, as long as `<file>` doesn't declare global things such as libraries.
BTW, the OCaml syntax was introduced as an escape hatch to help porting projects to Dune. Several times we have added core features to Dune to formalise things that were done in ad hoc ways via dune files in OCaml syntax. From our point of view, the less it is used the better. That's why the power given in these files is deliberately limited.
Just to follow up on this issue, we haven't forgotten about it :) We are designing/preparing two features that will empower users to integrate such workflows nicely in Dune.
Summary

It would be nice to be able to reuse code across multiple OCaml-syntax dune files. `#use` inside OCaml-syntax dune files seems like a straightforward way to do so, and it'd be nice to be not disallowed from doing so.

Background
The OCaml standard library is lacking in many basic functions, stuff like `concatMap` and `drop` and `take` and `compose`, etc. Third-party libraries that fill in the gap cannot be used in the dune file generation (otherwise one would have to maintain library dependencies of the dune files themselves, a ridiculous concept). Consequently, the need arises to write helper functions (to avoid doing stuff like `Printf.printf (%s %s)` everywhere).

Example context
Suppose a project for some data-serialization tool contains two testing programs: one for XML files, another for JSON files. The directory layout is as follows:
Inside each dune file, a `rule` stanza is generated to run the `main.ml` executable on each one of the JSON/XML cases to produce some output, something like:

Many parts of the generator logic would be the same for both dune files; the functions producing the list `test_cases` and S-expression functions such as `field` in the example above would all be the same logic. In the spirit of DRY, we want to avoid copying-and-pasting the same logic across the dune files but would like to share some common dune module across them.

To do so, we avoid importing common functions via OCaml's module system (e.g. something like `open Dune_common_fns`), because that would require some sort of meta-management for the module dependencies of the dune file. Instead, something dumb and simple, like the `#use` directive, suffices to reuse some scripting code without requiring sophisticated dependency management systems.

Thus we move the common logic to a folder such as `src/dune_common/dune_common.ml` and call `#use "../dune_common/dune_common.ml"` in both generating dune files. This would be nice, except now dune complains that `#use is not allowed inside a dune file in OCaml syntax`.

The question is: why? What is the rationale behind this disallowing? There are clear reasons why being able to `#use` is convenient in generated dune files; does whatever rationale there is outweigh these reasons? If there really is a good reason to disallow this, then what's a suggested/canonical/good way to solve the problem of re-using logic across dune scripts?
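For concreteness, the kind of shared helpers described above might look like the following sketch. The function names follow the ones mentioned in the background (`concatMap`, `take`, `drop`, `compose`, `field`), but the definitions are illustrative assumptions, not the actual contents of `dune_common.ml`.

```ocaml
(* Hypothetical contents of a shared helper file such as dune_common.ml. *)

(* Map each element to a list and flatten the results. *)
let concat_map f xs = List.concat (List.map f xs)

(* First n elements of a list (the whole list if it is shorter than n). *)
let rec take n = function
  | x :: rest when n > 0 -> x :: take (n - 1) rest
  | _ -> []

(* The list without its first n elements. *)
let rec drop n = function
  | _ :: rest when n > 0 -> drop (n - 1) rest
  | xs -> xs

(* Function composition: compose f g x = f (g x). *)
let compose f g x = f (g x)

(* Render a one-field S-expression, e.g. (name foo). *)
let field name value = Printf.sprintf "(%s %s)" name value
```

Each generating dune file could then reuse these definitions instead of repeating the `Printf` calls inline.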