Closed seanhinde closed 3 years ago
The dependencies are used to build the DAG, which is in turn used to propagate changes and timestamps across files. It is not used to define what needs to be compiled or when at this point in time, just to update all the non-artifact vertices in the graph and later allow proper file analysis. So if file A depends on B and B changes, we will update the last modified timestamp of A to be the same as B.
The actual scheduling and ordering of files to be built is handled in the needed_files/4
callback (as documented in http://rebar3.org/docs/extending/custom_compiler_modules/), which is where it is expected you do the analysis of the files:
%% do your own analysis aided by the graph to specify what needs re-compiling.
%% You can use this to add more or fewer files (i.e. compiler options changed),
%% and specify how to schedule their compilation. One thing we do here for
%% erlang files is look at the digraph to only rebuild files with newer
%% timestamps than their build artifacts (which are also in the DAG after the
%% first build) or those with compiler options that changed (the
%% compile_and_track callback lets you annotate artifacts)
needed_files(G, ["/path/to/all/files.erl", ...], [{".beam", "/path/to/ebin"}], AppInfo) ->
%% the returned files to build essentially specify a schedule and priority with special
%% option sets
%% Files that _must_ be built first like those in parse transforms, with
%% different build options
{{["/top/priority/files.erl"], CompilerOpts},
%% {Sequential, Parallel} build order for regular files, with shared
%% compiler options
{{["/path/to/file.erl", ...], ["other/files/mod.erl", ...]}, CompilerOpts}}.
We do the same thing in rebar3_compiler_erl
, and consider a broader set of changes. For example, we compare the timestamps of the build artifact to the source file that created it, but also the compiler options that were used for the build artifact with the current one (a file that hasn't been changed but whose compiler options have been modified needs to be recompiled as well).
I've been toying with a simpler sample app where I rewrite my blog engine to have the template files' changes tracked through the DAG, I'm hoping to have it ready some time this week.
@ferd does this mean that if A depends on B, then the needed_files/4
call for B has to figure out that A depends on it and should be rebuilt after B? 🤔
or is it in the needed_files
for A that we state that B is a dependency that needs to be rebuilt before A?
I had in mind that the needed_files/4 call was only to do with static configured "Compile these types of files first" files. I hadn't figured that dependencies/3 creates a DAG that then needs to be manually unravelled as a final step in needed_files/4 to create the compilation order.
Maybe easiest for us to just place the output of caramelc sort-deps directly in the return from needed_files/4
Since you have a graph, and that the erlang digraph module supports these functions we only do something like:
needed_files(G, FoundFiles, OutMappings, AppInfo) ->
NeededErlFiles = find_modules_worth_rebuilding(FoundFiles, ...), % using stamps, artifacts, compiler opts; ignore headers
SubGraph = digraph_utils:subgraph(G, NeededErlFiles), % get all the files and deps of those needing a rebuild
DepErlsOrdered = digraph_utils:topsort(SubGraph), % give us a topological sort of it all
By then DepsErlsOrdered
contains all files that need rebuilding because they have changed, in the order they need to be compiled. This gives us a safe sequential compiling order. You can stop there if you want.
To parallelize it, if safe (we can for erl files, but I can't for my blog because it relies on some stateful things):
{Sequential, Parallel} = lists:partition(
fun(Erl) -> lists:any(
fun(Edge) ->
{_E, _V1, _V2, Kind} = digraph:edge(Graph, Edge),
Kind =/= artifact
end, digraph:in_edges(Graph, Erl)) end,
lists:reverse(DepsErlsOrdered)
),
This splits whether files are depended on or not. Those that are depended on are sequentially built, those that aren't are kept parallel. In some cases you have other factors that may be at play, so we can't schedule the parallelism ourselves.
Super, thanks. I will have a go along those lines
Yeah it sounds like a pain in the butt, until you just decide to use the digraph module's algorithms. It pretty much contains all we need aside from the annoying partitioning stuff.
Haha, right. I always put digraph in the same mental bucket as sofs. Too much brain power needed unless really needed :)
and BTW updated docs are much clearer. I hadn't refreshed my browser tab for a few days so was still on the old doc
Now looking good using the code to pull out the ordered module list directly from the digraph as suggested. Thank you!
To build the correct digraph my dependencies are upside down from where I first had them. So now if B uses code in A I return [B] in the deps for A. The compile order I need is A then B. Do I have this correct?
I think I also need to look at what updates the stored digraph. Adding a new file to my project didn't seem to be enough. I will dig some more on that.
(BTW, off topic - my 15 year old son is learning erlang from learnyousomeerlang with the goal of building a minecraft server. He loves it. Thank you!!)
if B uses A, dependencies(B, ...) -> [A].
Basically, when analyzing the code of B
, you should be able to find the imports and things that A
provides. So the callback returns "B has dependencies on A". Within the DAG, the edge in the graph is also of the form "has a dependency on" (B -> A).
Glad to hear LYSE is still seeing usage :). I'm in talks to see if we're gonna do a second edition at some point.
Coming back to this. If I implement dependencies(B, ...) -> [A].
and dependencies(A, ...) -> [].
topsort returns the order [B, A]
and rebar3 tries to compile B first.
My dependencies function is:
dependencies(File, _Dir, SrcDirs, State) ->
SrcFiles = lists:append([src_files(SrcDir) || SrcDir <- SrcDirs]),
Cmd = "caramelc sort-deps " ++ lists:join(" ", SrcFiles),
rebar_api:console("Cmd: ~s", [Cmd]),
{ok, Res} = rebar_utils:sh(Cmd, [abort_on_error]),
SortedDeps = string:tokens(Res, " \r\n"),
rebar_api:console("File: ~p", [File]),
%% Deps = tl(lists:dropwhile(fun(D) -> D =/= File end, SortedDeps)),
Deps = lists:takewhile(fun(D) -> D =/= File end, SortedDeps),
rebar_api:console("Deps: ~p", [Deps]),
Deps.
and needed_files:
needed_files(G, FoundFiles, Mappings, AppInfo) ->
FirstFiles = [],
%% Remove first files from found files
NeededErlFiles = [Source || Source <- FoundFiles,
not lists:member(Source, FirstFiles),
rebar_compiler:needs_compile(Source, ".erl", Mappings)],
Opts = rebar_opts:get(rebar_app_info:opts(AppInfo), caramel_opts, []),
Opts1 = update_opts(Opts, AppInfo),
SubGraph = digraph_utils:subgraph(G, NeededErlFiles), % get all the files and deps of those needing a rebuild
DepErlsOrdered = digraph_utils:topsort(SubGraph),
rebar_api:console("Needed: ~p", [NeededErlFiles]),
rebar_api:console("DepErlsOrdered: ~p", [DepErlsOrdered]),
{{FirstFiles, Opts1}, {DepErlsOrdered, Opts1}}.
Producing output:
~/Build/rebar3/rebar3 compile
===> Analyzing applications...
===> Compiling rebar3_caramel
===> Analyzing applications...
Needed: []
DepErlsOrdered: []
===> Compiling rebar3_caramel
===> Verifying dependencies...
===> Analyzing applications...
Cmd: caramelc sort-deps /home/sean/Projects/caradeps/src/caradeps_a.ml /home/sean/Projects/caradeps/src/caradeps_b.ml
Cmd: caramelc sort-deps /home/sean/Projects/caradeps/src/caradeps_a.ml /home/sean/Projects/caradeps/src/caradeps_b.ml
File: "/home/sean/Projects/caradeps/src/caradeps_b.ml"
Deps: ["/home/sean/Projects/caradeps/src/caradeps_a.ml"]
File: "/home/sean/Projects/caradeps/src/caradeps_a.ml"
Deps: []
Needed: ["/home/sean/Projects/caradeps/src/caradeps_b.ml",
"/home/sean/Projects/caradeps/src/caradeps_a.ml"]
DepErlsOrdered: ["/home/sean/Projects/caradeps/src/caradeps_b.ml",
"/home/sean/Projects/caradeps/src/caradeps_a.ml"]
Compiling: "/home/sean/Projects/caradeps/src/caradeps_b.ml"
===> sh(/home/sean/.opam/default/bin/caramelc compile /home/sean/Projects/caradeps/src/caradeps_b.ml)
failed with return code 1 and the following output:
Unbound module Caradeps_a
For the time being I have it working by reversing the logic (with the commented out line in dependencies/4) but I'd like to get to the bottom of this. Thoughts? Suggestions?
I'm guessing this line in how we partition things: lists:reverse(DepsErlsOrdered)
is what makes it work. The topological sort has to be reversed I guess but ought to be safe regardless.
Thanks for the confirmation. Will run with lists:reverse/1
Another topic, and then I think I have it done - I find I have a use case for access to some of the global state in my compiler module.
The purpose is to find the path to where the plugin is installed. That will allow me to locate the caramelc
compiler executable.
One solution could be to hand the rebar_state:t()
to the context callback which could put anything needed into some local state accessible from the other callbacks - like your dependencies_opts, but passed to all callbacks.
Or maybe pass the AppInfo of the plugin itself.
I have a workaround for now, but it feels like some more sacrifice of purity in the callback API might help in a few ways.
If your plugin has the data in priv_dir, you can call code:priv_dir(PluginAppName)
and you'll get it right away. This would usually be the way I'd track this. That being said yeah, the plugin callbacks are a tad restrictive. I'm not too sure what would be the safe way of extending them without breaking anything.
Project
I'm building a plugin for caramel, which relies on the new module dependencies feature of custom compiler modules
Environment
Current behaviour
What I see at the moment is that the dependencies don't change the compilation order. I haven't dug too deeply, but here is my dependencies callback:
And the debug output:
So for file caradeps_b.ml I'm returning [caradeps_a.ml] as dependencies (because caradeps_b.ml tries to call a function in caradeps_a.ml), and caradeps_a.ml dependencies are [].
What I see is that rebar3 first tries to compile caradeps_b.ml, which correctly fails on the missing dependency
I also tried swapping the deps return values around to check my understanding - same result, caradeps_b.ml is tried first.
Expected behaviour
Dependencies are compiled first
Would be great to have another pair of eyes on this, thanks.