Slow application boot times due to Boundary checks during noop application compilations

remotecom commented 1 year ago

Hi Saša,

First off, thanks for the great package!

It appears boundary checks are being evaluated when the application compilation is a noop (code).

Since we have a fairly large codebase, this results in much slower application boot time when boundary is configured as a compiler (~53s vs ~6s). Of course, this isn't an issue in production as boundary isn't configured for the production environment. However, it is very noticeable when running the application or tests as part of the (local) development process.

# Without boundary configured

time mix run -e "IO.puts(1)”
// application output redacted
1

real    0m5.853s
user    0m16.410s
sys     0m2.190s

# With boundadry configured

time  mix run -e "IO.puts(1)"
//application output redacted
1

real    0m53.396s
user    1m49.204s
sys     0m6.360s

Just wondering why the boundary checks are being evaluated during a compilation noop? Is this something which can be avoided?

Boundary version used:

mix deps | grep boundary
* boundary 0.10.0 (Hex package) (mix)
  locked at 0.10.0 (boundary) c78eee15

Elixir/erlang versions used:

elixir -v
Erlang/OTP 25 [erts-13.2.2.2] [source] [64-bit] [smp:7:7] [ds:7:7:10] [async-threads:1] [jit:ns]

Elixir 1.15.4 (compiled with Erlang/OTP 25)

sasa1977 commented 1 year ago

Is the same happening with mix compile? Also, how large is the codebase, and how many boundaries do you have?

I've been using boundary on a large-ish project (about 140k non-empty LOC in lib), and I've just double checked, and noticed about 4 seconds penalty with boundary. So I can see this being a problem on a larger codebase, or perhaps with fine-grained boundaries.

Boundary always checks everything because it was a simple approach. The thing is that when a single module changes, it requires checking not only calls in that module, but also calls to that module and to any of its "submodules". The story becomes even more complicated with nested boundaries, since a change in one module might require full check of multiple boundaries.

So the problem amounts to organizing a proper cache structure and doing correct cache-invalidation, to minimize the amount of cross-module calls we have to check.

We should first try to asses where most of the time is lost, and then discuss possible approaches. I made some pointers on measuring here.

Improving this might take awhile. In the meantime, a quick workaround could be to run the boundary compiler only on CI. Typically, CIs have the OS env var set (often it's called CI, example), so you could do something like:

# mix.exs
def project do
  [
      compilers: compilers()
      # ...
  ]
end

defp compilers do
  boundary_compiler = if System.get_env("CI") == "true", do: :boundary, else: nil
  [boundary_compiler, ...] |> Enum.concat(Mix.compilers) |> Enum.reject(&is_nil/1)
end

Running boundary checks locally would boil down to CI=true mix compile --force. The --force param is needed because boundary needs to fully build the database of the calls. We could improve the UX for this, e.g. by adding the :check? option, so in mix.exs you can say boundary: [check?: false, ...] and then boundary always builds its database (which is done incrementally, only for the recompiled modules), but doesn't load it and doesn't perform any checks. Then we also supply the mix task for checking, and you can do mix boundary.check or something like that.

remotecom commented 1 year ago

Thank you for quick response. To answer your questions (and ask a few myself :smiley: ):

Is the same happening with mix compile?

Compilation does take longer with boundary enabled, but this isn't really a concern as the time added is minimal compared the time it takes to compile the project.

Also, how large is the codebase, and how many boundaries do you have?

How large is is the codebase?: The project compiles roughly 7k+ files when running mix compile --force.
How many boundaries do you have?: Around 50.

Boundary always checks everything because it was a simple approach. The thing is that when a single module changes, it requires checking not only calls in that module, but also calls to that module and to any of its "submodules". The story becomes even more complicated with nested boundaries, since a change in one module might require full check of multiple boundaries.

Thanks for the explanation. There is still something which isn’t clear to me:

Why are the checks being run at runtime (ie: the application starts)? Would it be possible to only run checks when the application is compiled?

Improving this might take awhile. In the meantime, a quick workaround could be to run the boundary compiler only on CI. Typically, CIs have the OS env var set (often it's called CI, [example](https://docs.github.com/en/actions/learn-github-actions/variables#default-environment-variables)), so you could do something like: ...

We currently already have something similar in place. This works really well for not penalizing teams which aren't working with boundaries. However, when working with boundaries locally, one would need to disable boundary when running tests or starting the application (to avoid the ~53s penalty), which is is quite cumbersome.

Running boundary checks locally would boil down to CI=true mix compile --force. The --force param is needed because boundary needs to fully build the database of the calls. We could improve the UX for this, e.g. by adding the :check? option, so in mix.exs you can say boundary: [check?: false, ...] and then boundary always builds its database (which is done incrementally, only for the recompiled modules), but doesn't load it and doesn't perform any checks. Then we also supply the mix task for checking, and you can do mix boundary.check or something like that.

Perhaps there's a misunderstanding here, we don’t currently have any issues with how boundary behaves when running mix compile. The issue is, once the application has been compiled, starting the application (eg: starting a iex terminal using iex -S mix) takes roughly 53s since boundary appears running the checks.

Thanks for your help in advance :pray:

Edited for typos.

sasa1977 commented 1 year ago

Why are the checks being run at runtime (ie: the application starts)?

Boundary doesn't do any checks (or any other kind of work) during app startup, everything happens at compile time.

But to clarify, even when you run iex -S mix or mix run, the regular compilation takes place. Even if that compilation doesn't require any file to be recompiled, the boundary compiler will perform the full check (but still at compile time).

To prove that the issue is compilation you can do the following:

Run mix compile once to make sure everything is fully compiled.
Run time mix compile.

You should see the same 50 sec overhead in step 2. Assuming that's the case, the proposed workaround should help.

remotecom commented 1 year ago

Ah I see, in that case I didn't properly understand/consider the mechanics of the application start, thanks for clarifying this.

One last (I think?) question:

Boundary always checks everything because it was a simple approach. The thing is that when a single module changes, it requires checking not only calls in that module, but also calls to that module and to any of its "submodules". The story becomes even more complicated with nested boundaries, since a change in one module might require full check of multiple boundaries.

So the problem amounts to organizing a proper cache structure and doing correct cache-invalidation, to minimize the amount of cross-module calls we have to check.

I understand that when a single module changes, it isn't trivial to determine which checks to perform. However, I don't understand why checks must be performed when nothing changes? My thinking was: if the compilation was a noop, then it meant no files were changed, which meant there would be no need for re-checking the boundaries?

We should first try to asses whttps://github.com/sasa1977/boundary/issues/49#issuecomment-1062092826 most of the time is lost, and then discuss possible approaches. I made some pointers on measuring [here](g https://github.com/sasa1977/boundary/issues/49#issuecomment-1062092826.).

In the message linked, the LOC which you suggest to drop (link) is currently an empty line. I'd be happy to comment out the correct line, just let me know which one(s) exactly :+1:.

Cheers

Edited: typos.

sasa1977 commented 1 year ago

However, I don't understand why checks must be performed when nothing changes? My thinking was: if the compilation was a noop, then it meant no files were changed, which meant there would be no need for re-checking the boundaries?

Boundary doesn't know that elixir compilation was a noop. We could add some code to identify this situation (e.g. track how many modules have been recompiled since the last compilation), but it's a non-trivial extra complexity which would solve only one instance of the problem. In fact, in my own flow noop recompilation is a rare situation. I typically change something and then rerun. And as soon as you change just one file, you still need to pay the price of 50 sec. So I don't think this would buy us much.

50 sec is definitely too long, even for a long project, so we have to figure this out. As I said, this might take some time, but in the meantime the workaround I propose is to enable boundary only on CI (as sketched in the first reply).

That way your local flow is not affected, but you still check the boundaries rules on CI.

sasa1977 commented 1 year ago

n the message linked, the LOC which you suggest to drop (link) is currently an empty line. I'd be happy to comment out the correct line, just let me know which one(s) exactly

I updated the links in the original comment to point to the correct version.

remotecom commented 1 year ago

Thanks (again) for taking the time to explain things, I really appreciate it.

I updated the links in the original comment to point to the correct version.

Perfect! So, the next step is commenting out Mix.Task.Compiler.after_compiler(:app, &after_compiler(&1, argv)) as you proposed in your original thread. Doing this and running mix compile without changing any files results in the following :

real    0m6.225s
user    0m14.445s
sys     0m1.948s

remotecom commented 1 year ago

I'm signing off for today, but I'll be checking-in again tomorrow :pray:.

sasa1977 commented 1 year ago

How large is is the codebase?: The project compiles roughly 7k+ files when running mix compile --force.

Could you provide the LOC count? E.g. the following command should give you the amount of non-empty uncommented loc. Run it in the lib folder:

git ls-files | egrep '\.exs?$' | xargs cat | sed '/^\s*$/d' | sed '/^\s*#/d' | wc -l

Also, can you share the output of ls -alh _build/dev/lib/<YOUR_APP>/.mix/? I'm interested in the sizes of two files: compile.boundary_v2 and compile.boundary_view.

remotecom commented 1 year ago

Could you provide the LOC count?

Sure, no problem.

lib (master)$ git ls-files | egrep '\.exs?$' | xargs cat | sed '/^\s*$/d' | sed '/^\s*#/d' | wc -l

480863

Also, can you share the output of ls -alh _build/dev/lib//.mix/?

Here it is:

total 41M
drwxr-xr-x 2 user user 4.0K Aug 31 09:43 .
drwxr-xr-x 5 user user 4.0K Aug 31 09:42 ..
-rw-r--r-- 1 user user  33K Aug 31 09:42 compile.app_cache
-rw-r--r-- 1 user user  40M Aug 31 09:42 compile.boundary_v2
-rw-r--r-- 1 user user 160K Aug 31 09:42 compile.boundary_view
-rw-r--r-- 1 user user 1.3M Aug 31 09:30 compile.elixir
-rw-r--r-- 1 user user   45 Aug 31 09:42 compile.elixir_scm
-rw-r--r-- 1 user user    0 Aug 31 09:30 compile.lock
-rw-r--r-- 1 user user 5.0K Aug 31 09:43 compile.protocols

sasa1977 commented 1 year ago

Could you try the following?

Compile the project with mix compile

mix run --no-start --no-compile -e ':timer.tc(fn -> view = Boundary.Mix.read_manifest("boundary_view") |> Boundary.View.refresh([Boundary.Mix.app_name()]); Boundary.Mix.Xref.start_link(); Boundary.errors(view, Boundary.Mix.Xref.entries()) end) |> IO.inspect'

mix profile.fprof --no-start --no-compile -e ':timer.tc(fn -> view = Boundary.Mix.read_manifest("boundary_view") |> Boundary.View.refresh([Boundary.Mix.app_name()]); Boundary.Mix.Xref.start_link(); Boundary.errors(view, Boundary.Mix.Xref.entries()) end)'

Paste the outputs of 2 and 3.

remotecom commented 1 year ago

Here is the output for 1:

mix run --no-start --no-compile -e ':timer.tc(fn -> view = Boundary.Mix.read_manifest("boundary_view") |> Boundary.View.refresh([Boundary.Mix.app_name()]); Boundary.Mix.Xref.start_link(); Boundary.errors(view, Boundary.Mix.Xref.entries()) end) |> IO.inspect'
{48214889, []}

The output for the 2nd command was very long, so I pasted it into a doc (2nd_command_top_traces.txt). I didn't include the modules after around the first 350 lines: this is because including these would essentially list all modules the codebase which is something I'd like to avoid. The time recorded for these modules was negligible (0.016ms), so I'm assuming this isn't of interest to you.

If you do indeed need these additional traces, I can also scramble the module names and share it this way.

sasa1977 commented 1 year ago

Thanks! Could you run the 2nd command with the --callers flag and share the result again? I ran the same on a real codebase here, and I see a bottleneck which does not appear in your output, so I can't be sure that it's the same problem. You'll need to scramble your own module names, but they don't have to be unique scrambles. I.e. if you have Module1, Module2, etc, you can replace them all with e.g. AppMod or something.

sasa1977 commented 1 year ago

And one other thing you could do before that is run the following command and paste the result.

mix run --no-start --no-compile -e 'Boundary.Mix.load_app(); Boundary.View.app_modules(Boundary.Mix.app_name()) |> Enum.each(&Code.ensure_loaded/1); :timer.tc(fn -> view = Boundary.Mix.read_manifest("boundary_view") |> Boundary.View.refresh([Boundary.Mix.app_name()]); Boundary.Mix.Xref.start_link(); Boundary.errors(view, Boundary.Mix.Xref.entries()) end) |> IO.inspect'

I'm basically trying to see how much time is lost loading the app modules. This is the part where boundary spends most of the time on a local test.

remotecom commented 1 year ago

And one other thing you could do before that is run the following command and paste the result.

Here you go:

mix run --no-start --no-compile -e 'Boundary.Mix.load_app(); Boundary.View.app_modules(Boundary.Mix.app_name()) |> Enum.each(&Code.ensure_loaded/1); :timer.tc(fn -> view = Boundary.Mix.read_manifest("boundary_view") |> Boundary.View.refresh([Boundary.Mix.app_name()]); Boundary.Mix.Xref.start_link(); Boundary.errors(view, Boundary.Mix.Xref.entries()) end) |> IO.inspect'
{1968733, []}

Could you run the 2nd command with the --callers flag and share the result again?

No problem, here it is. I replaced App.SubModule1.SubModule2etc with AppMod, I hope that's fine.

sasa1977 commented 1 year ago

Thanks. The 1st output already confirms my suspicions: most of the time is spent loading modules. If we can avoid that, we can reduce the running time to about 2 seconds in your case. That's still longish, but it's much better than 50 seconds :-) I'll let you know when I have something you can test.

remotecom commented 1 year ago

I can definitely live with 2ish seconds :)

sasa1977 commented 1 year ago

I've pushed a prototype fix to the avoid-module-loading branch. On a local project it drops me from 2 seconds to about 200 ms.

Could you please give it a try?

Steps:

Change boundary dep to {:boundary, github: "sasa1977/avoid-module-loading"}
mix compile --force
time mix compile

mix run --no-start --no-compile -e ':timer.tc(fn -> Boundary.Mix.CompilerState.start_link(); view = Boundary.Mix.View.refresh([Boundary.Mix.app_name()], []); Boundary.errors(view, Boundary.Mix.CompilerState.references()) end) |> IO.inspect'

And print the outputs of 3 and 4

remotecom commented 1 year ago

Change boundary dep to {:boundary, github: "sasa1977/avoid-module-loading"}

The mix config provided didn't seem to work, I'm assuming you meant {:boundary, github: "sasa1977/boundary", branch: "avoid-module-loading"}, this is what I used.

3:

(master)$ time mix compile

real    0m6.241s
user    0m18.604s
sys     0m1.329s

It appears the changes you made make a huge difference!

4:

mix run --no-start --no-compile -e ':timer.tc(fn -> Boundary.Mix.CompilerState.start_link(); view = Boundary.Mix.View.refresh([Boundary.Mix.app_name()], []); Boundary.errors(view, Boundary.Mix.CompilerState.references()) end) |> IO.inspect'
{1071380, []}

edit: typos

sasa1977 commented 1 year ago

The mix config provided didn't seem to work, I'm amusing you meant {:boundary, github: "sasa1977/boundary", branch: "avoid-module-loading"}, this is what I used.

duh, serves me right for not testing it :-)

Great to see this. The overhead is now about 1 sec on such a large project. Obviously there's more room for improvements, but I think this is acceptable for the moment. I'll need to clean up the code a bit and then I'll release, hopefully in the next few days. In the meantime you can use the branch version, but be aware that there's currently one bug. If you compiled the code with the previous version, and the try to compile it with the new one, there's going to be a crash. You can fix this by running mix clean or mix compile --force.

remotecom commented 1 year ago

Thanks so much for your quick response on this, the improved performance is great!

hopefully in the next few days

No rush, since boundary is only used by a few teams, and most developers don't run it locally, we can we wait until the improvements are published to hex.

However I'll be sure to share with others that if they need a blazing fast boundary, there's a branch they can use :rocket:.

sasa1977 commented 1 year ago

This is now deployed to hex (version 0.10.1). Thanks for assisting with debugging!

sasa1977 / boundary

Slow application boot times due to Boundary checks during noop application compilations #57