erlang / rebar3

Erlang build tool that makes it easy to compile and test Erlang applications and releases.
http://www.rebar3.org
Apache License 2.0
1.69k stars 515 forks source link

Dependencies are always recompiled when using `_checkouts` folder #2152

Open peterzeller opened 5 years ago

peterzeller commented 5 years ago

Environment

$ rebar3 report eunit
Rebar3 report
 version 3.12.0
 generated at 2019-08-29T11:38:25+00:00
=================
Please submit this along with your issue at https://github.com/erlang/rebar3/issues (and feel free to edit out private information, if any)
-----------------
Task: eunit
Entered as:
  eunit
-----------------
Operating System: x86_64-unknown-linux-gnu
ERTS: Erlang/OTP 22 [erts-10.4] [source] [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:1] [hipe]
Root Directory: /home/peter/apps/erlang/22.0
Library directory: /home/peter/apps/erlang/22.0/lib
-----------------
Loaded Applications:
bbmustache: 1.6.1
certifi: 2.5.1
cf: 0.2.2
common_test: 1.17.3
compiler: 7.4
crypto: 4.5
cth_readable: 1.4.5
dialyzer: 4.0
edoc: 0.11
erlware_commons: 1.3.1
eunit: 2.3.7
eunit_formatters: 0.5.0
getopt: 1.0.1
hipe: 3.19
inets: 7.0.8
kernel: 6.4
providers: 1.8.1
public_key: 1.6.7
relx: 3.33.0
sasl: 3.4
snmp: 5.3
ssl_verify_fun: 1.1.5
stdlib: 3.9
syntax_tools: 2.2
tools: 3.2

-----------------
Escript path: /home/peter/bin/rebar3
Providers:
  app_discovery as clean compile compile cover ct deps dialyzer do edoc escriptize eunit get-deps help install install_deps list lock new path pkgs release relup report repos shell state tar tree unlock update upgrade upgrade upgrade version xref

Current behavior

When using a _checkouts folder, all dependencies are compiled every time I run rebar3 eunit (or other commands).

Here is a small example project, where this is very visible since the eleveldb dependency is compiled for every run of rebar3 eunit and the use of hooks makes the C build non-incremental: mylib.zip

Expected behavior

Only changes in the main application and in _checkouts should be recompiled.

This is the case when not using a _checkouts folder:

$ rebar3 eunit
===> Verifying dependencies...
===> Compiling mylib
===> Performing EUnit tests...
.
Finished in 0.059 seconds
ferd commented 5 years ago

Do other applications in your system rely on the library in _checkouts? If leveldb itself is changing, then there is at least a reasonable expectation that apps that depend on it may also be recompiled (until analysis of changes across apps becomes fancier)

peterzeller commented 5 years ago

Only the main application (mylib) depends on the library in _checkouts (blub). Here is the output of rebar3 tree:

└─ mylib─0.1.0 (project app)
   └─ riak_core─3.1.1 (hex package)
      ├─ basho_stats─1.0.3 (hex package)
      ├─ blume─0.1.1 (hex package)
      ├─ chash─0.1.2 (hex package)
      ├─ clique─0.3.12 (hex package)
      ├─ cuttlefish─2.1.4 (hex package)
      │  └─ getopt─1.0.1 (hex package)
      ├─ eleveldb─2.2.20 (hex package)
      ├─ exometer_core─1.0.5 (hex package)
      │  ├─ parse_trans─3.3.0 (hex package)
      │  └─ setup─2.0.2 (hex package)
      ├─ folsom─0.8.8 (hex package)
      │  └─ bear─0.8.7 (hex package)
      ├─ gen_fsm_compat─0.3.0 (hex package)
      ├─ goldrush─0.1.9 (hex package)
      ├─ lager─3.6.10 (hex package)
      ├─ pbkdf2─2.0.0 (hex package)
      ├─ poolboy─0.8.4 (hex package)
      ├─ riak_ensemble─2.4.4 (hex package)
      └─ riak_sysmon─2.1.7 (hex package)
   └─ blub─0.1.0 (checkout app)

Even without changing any files, everything is recompiled:

With the dependency on blub in _checkouts:

$ rebar3 compile
===> Verifying dependencies...
===> Compiling blub
===> Compiling eleveldb
make: Nothing to be done for 'all'.
make: Nothing to be done for 'tools'.
===> Compiling c_src/eleveldb.cc
===> Compiling c_src/refobjects.cc
===> Compiling c_src/workitems.cc
===> Linking priv/eleveldb.so
===> Compiling getopt
===> Compiling cuttlefish
===> Building escript...
===> Compiling gen_fsm_compat
===> The erlang version 22.0 is newer then the latest version known to rebar_erl_vsn (21). Features introduced between after 21 will not have flags.
===> Compiling riak_ensemble
===> The erlang version 22.0 is newer then the latest version known to rebar_erl_vsn (21). Features introduced between after 21 will not have flags.
===> Compiling chash
===> Compiling basho_stats
===> Compiling poolboy
===> Compiling pbkdf2
===> Compiling setup
===> Building escript...
===> Compiling parse_trans
===> Compiling exometer_core
===> Compiling clique
===> The erlang version 22.0 is newer then the latest version known to rebar_erl_vsn (21). Features introduced between after 21 will not have flags.
===> Compiling blume
===> Compiling riak_core
===> The erlang version 22.0 is newer then the latest version known to rebar_erl_vsn (21). Features introduced between after 21 will not have flags.
===> Compiling mylib

Without the dependency on blub in _checkouts:

$ rebar3 compile
===> Verifying dependencies...
===> Compiling mylib
tsloughter commented 5 years ago

Very weird, usually this happens when an application isn't setting its vsn in the .app correctly. Any chance you can provide a reproducible repo I can play with?

peterzeller commented 5 years ago

@tsloughter I have attached a project in mylib.zip above. The .app.src files are the ones generated by rebar3 new.

tsloughter commented 5 years ago

Ah thanks, missed that :)

tsloughter commented 5 years ago

I have good news and I have bad news :(

The good news is, we know why this is happening. The bad news is, it is by design.

The issue is rebar3 builds a digraph of applications and then does a topo sort of that graph to get a single list in an order that ensures dependencies are compiled before the apps that depend on them.

Since blub is a checkout app it is always marked to compile (simply means it looks for changes, it doesn't do a recompile) and then everything after it in the list is also compiled, since as far as we know in the list they might depend on on blub.

In your case there are likely things that can be done to improve the time it takes for rebar3 to figure out that eleveldb has nothing to do. It looks like it always compiles some code because of a pre-compile hook.

In general I think this can be improved but it isn't simple. I'm thinking either some culling of the dep graph before topsort or building separate lists of deps in order that can be compiled in parallel, then blub wouldn't be in the list with your other deps.

peterzeller commented 5 years ago

Thanks for looking into this.

With your hint I was able to come up with a small workaround that I am using now:

diff --git a/src/rebar_digraph.erl b/src/rebar_digraph.erl
index 776d7b8b..98f5e021 100644
--- a/src/rebar_digraph.erl
+++ b/src/rebar_digraph.erl
@@ -19,6 +19,20 @@ compile_order(Apps) ->
                           Deps = all_apps_deps(App),
                           add(Graph, {Name, Deps})
                   end, Apps),
+    % If A is a checkout app B does not depend on A, then compile B before A
+    {CheckoutApps, NonCheckoutApps} = lists:partition(fun rebar_app_info:is_checkout/1, Apps),
+    lists:foreach(fun(CheckoutApp) ->
+            CheckoutAppName = rebar_app_info:name(CheckoutApp),
+            lists:foreach(fun(App) ->
+                    AppName = rebar_app_info:name(App),
+                    case digraph:get_path(Graph, AppName, CheckoutAppName) of
+                        false ->
+                            digraph:add_edge(Graph, CheckoutAppName, AppName);
+                        _Path ->
+                            ok
+                    end
+                end, NonCheckoutApps)
+        end, CheckoutApps),
     Order =
         case digraph_utils:topsort(Graph) of
             false ->
ferd commented 5 years ago

I haven't read that code thoroughly enough to see if that can actually happen, but you might be getting some oddities when considering transitive dependencies there such as a chain like A -> B -> C, where C is a checkout app that now can technically end up compiled before B but after A. If there's a chain of parse transforms being applied, then that can cause errors.

At least it's one of the edge cases we have to be careful about when handling this, along with other ones such as two higher-level apps both depending on a checkout that's transitive.

Your fix might be fine, I'm mostly advocating against including any of these without thorough testing.

peterzeller commented 5 years ago

My workaround only adds additional edges to the graph, so a topological sort of the extended graph should also be a topological sort of the original graph.

I also would not include my workaround in master. I think a better design would be to only recompile an app if one of its (transitive) dependencies has been changed.

filmor commented 4 years ago

This was my hack to get around this: https://github.com/erlang/rebar3/pull/2047 :)