bazelbuild / rules_scala

Scala rules for Bazel

Should we re-consider zinc's incremental compilation? #328

Closed ittaiz closed 6 years ago

ittaiz commented 6 years ago

Given zinc 1.0 (http://scala-lang.org/blog/2017/11/03/zinc-blog-1.0.html) I'm wondering if we should take another look at if/how to integrate zinc's incremental compiler into rules_scala. @johnynek @gkossakowski wdyt?

johnynek commented 6 years ago

I think this is an interesting question. I think the problem is we would need to basically implement a build system in memory to do it.

If we hashed all the jar inputs to a rule, you could use that as a key to a mutable in memory filesystem that keeps all the state for zinc. Then zinc could run statefully with that Key.

That circumvents the intent of bazel: stateless compilers with all state managed by bazel.

Alternatively we could push to add a way for bazel to track compiler state. I have contemplated this in the past where rules have optional state with invalidation policies. When the rule runs bazel passes it the state. The rule can say things like a dependency change invalidates state, etc...
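
A rough, self-contained sketch of that keying idea (hypothetical names, not an actual rules_scala or zinc API; the ZincState payload is a stand-in for whatever analysis a stateful worker would keep):

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest
import scala.collection.concurrent.TrieMap

object ZincStateCache {
  // Stand-in for whatever zinc state (analysis, class hashes) a stateful worker would keep.
  final case class ZincState(analysis: Array[Byte])

  private val cache = TrieMap.empty[String, ZincState]

  // Digest every input jar; the key only changes when some compile input changes.
  def keyFor(inputJars: Seq[Path]): String = {
    val md = MessageDigest.getInstance("SHA-256")
    inputJars.sortBy(_.toString).foreach(jar => md.update(Files.readAllBytes(jar)))
    md.digest().map(b => f"$b%02x").mkString
  }

  def lookup(inputJars: Seq[Path]): Option[ZincState] = cache.get(keyFor(inputJars))
  def store(inputJars: Seq[Path], state: ZincState): Unit = cache.update(keyFor(inputJars), state)
}
```

The sketch also shows the tension described above: the map lives in worker memory, outside anything bazel tracks or invalidates.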

ittaiz commented 6 years ago

Let's try and discuss this at the conference with Damien/Ulf/Dmitry

gkossakowski commented 6 years ago

I agree that rules (or actions to be precise) not being able to carry an internal state is problematic.

However, one could use zinc purely within one target and in-memory (in worker mode). See this discussion about bazel+zinc https://gitter.im/sbt/zinc-contrib?at=59f725fdb20c6424296673a1

pauldraper commented 6 years ago

@lihaoyi you had been looking into this...did anything come of it?

I plan on getting Zinc incremental working in the next month or so, and don't want to duplicate any work.

The plan (which has been expressed elsewhere): use zinc in a worker, which is not sandboxed and can save state. Use zinc incremental strategy for local dev, and non-incremental for CI.

pauldraper commented 6 years ago

Current progress: Zinc is undocumented, besides example usage in zinc + pants.

I have a 100-line example of zinc compilation + analysis file persistence (outside Bazel): https://github.com/pauldraper/zinc-example/blob/master/src/main/scala/Main.scala

I plan to add this to Bazel as an experimental worker strategy. I'll keep the current interface to scalac as is, since the API to zinc is different and AFAIK there is no facade for that.

johnynek commented 6 years ago

we are pretty happy with how well caching is working for us currently, so I still stand by the feeling that getting hermetic builds + caching and remote execution is the scalable way forward.

If zinc can be a part of that, great, but I personally don't want to give up hermetic builds.

andyscott commented 6 years ago

FWIW I have a prototype of Zinc based rules with the intention of applying learnings back to rules_scala. I'm not currently concerned with incremental compilation and the primary goal was to get access to the analysis output of Zinc for downstream analysis.

pauldraper commented 6 years ago

we are pretty happy with how well caching is working for us currently,

Good to know. My belief remains that we cannot get near-sbt build times without putting significant burden on developers to create fine-grained DAGs.

If zinc can be a part of that, great, but I personally don't want to give up hermetic builds.

I assume that adding an opt-in worker strategy that persists and re-uses analysis is acceptable.

Worker strategies in general are trust-me with respect to hermeticity. This is a riskier (but acceptable, IMO) trust-me approach.

FWIW I have a prototype of Zinc based rules

@andyscott exciting. What's the status of that?

andyscott commented 6 years ago

It works, but it's early stage. And full disclosure: It's a separate repository written mostly from scratch. The intention is to use the latest and greatest Bazel patterns, experiment with solving some of the key issues faced in this repo, and then incrementally apply the learnings back here. I also needed a way to really dive into Bazel without being encumbered with lots of existing code. If you're curious to dig through my mess... https://github.com/andyscott/rules_scala_annex. https://github.com/andyscott/rules_scala_annex/blob/master/tests/compat/0/BUILD tests some code ported over from this repo.

sdtwigg commented 6 years ago

Is there a way (or is it the default) for Zinc to guarantee that the incremental compile result is identical to a non-incremental compile result? I've speculated about pushing for Bazel to allow 'principled use' of an incremental compiler if Bazel had flags to run non-incremental variants of the job as a 'check' on the incremental compile.

Scala (with Zinc) was going to be my test case; however, I've never really had a chance to look deeply at the reliability of the incremental compile.

andyscott commented 6 years ago

@gkossakowski might be able to weigh in here

pauldraper commented 6 years ago

Is there a way (or is it the default) for Zinc to guarantee that the incremental compile result is identical to a non-incremental compile result?

Not generally, because there are many possible inputs to an "incremental compile". So, you could check a select number of state transitions but not all. And the zinc project has tests for that purpose.

In any worker strategy, it's possible to be nonhermetic. In fact, I believe scalac 2.12 had precisely this problem; statics were polluting subsequent compilation runs. The difference here is that we are purposefully "polluting" runs with past state rather than accidentally doing it.

Experience will tell whether zinc 1.0 has correct incremental compiles or not.
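
A rough sketch of the kind of spot check being discussed: digest the class files from an incremental build and from a clean build of the same sources and compare (a self-contained helper, not part of zinc or these rules):

```scala
import java.nio.file.{Files, Path}
import java.security.MessageDigest
import scala.collection.JavaConverters._

object OutputDigest {
  // Digest every .class file under an output directory, keyed by its relative path.
  def digest(outDir: Path): Map[String, String] =
    Files.walk(outDir).iterator().asScala
      .filter(p => Files.isRegularFile(p) && p.toString.endsWith(".class"))
      .map { p =>
        val md = MessageDigest.getInstance("SHA-256")
        md.update(Files.readAllBytes(p))
        outDir.relativize(p).toString -> md.digest().map(b => f"$b%02x").mkString
      }
      .toMap

  // True when the incremental output is byte-for-byte the same set of class files as the clean one.
  def sameOutput(incremental: Path, clean: Path): Boolean =
    digest(incremental) == digest(clean)
}
```

As noted above, a check like this only covers the state transitions you actually exercise, not all of them.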

ittaiz commented 6 years ago

Actually I think Gregg turned it off for 2.12 because he found that it’s problematic.

bpatters commented 6 years ago

I've been following this for a while, and have a question. Is Zinc a possibility to use, at least for developer-only iterative development builds, in the near term? We have several units of compilation that, if changed, trigger 100+ second rebuilds for our developers and are non-trivial to break into smaller DAGs. Momentum is building to switch our build back to sbt, because allowing developers to iterate several orders of magnitude faster is more important than the faster centralized builds that occur as part of our release process. We get a lot of benefit from bazel and I'd like to keep it, but developer productivity is really suffering because of these long build times.

johnynek commented 6 years ago

You might want to take a look at this:

https://github.com/andyscott/rules_scala_annex

they are using zinc there.

I would like to see us write a tool to break large targets up using https://github.com/lightbend/scala-sculpt so that you could automatically split large targets. This might not be very hard, but we haven't started the project yet. Would someone on your end be interested in contributing that?

At stripe, we use caching (and are testing caches readable and writable from laptops). I don't have faith (I have seen evidence against and no strong evidence for the case) that zinc is reproducible. So, using it with caching is not a prudent plan.

I hear you about the long compilation times. We have a few bad targets in our repo as well. I think the main difference between these rules and annex are that we have been focused on meeting bazel's model of speed through reproducibility and caching. Giant targets hinder that.

cc @pauldraper @jvican who I think have visions closer to what you want.

pauldraper commented 6 years ago

@bpatters there is a functioning version of what you are describing at https://github.com/andyscott/rules_scala_annex . You can opt in to incremental Zinc compilation by providing a local file path for workers to keep analysis files. I no longer work at Lucid, but I know that their developers are currently using this mode. Our justification in creating it was:

(1) Practically it is hard to get good incremental builds in Scala. (a) Scala compilation is inherently slow. (b) There are limitations on ijar-like improvements due to complex types and macro introspection. (c) Even with small targets, the transitive graphs are still very large.

(2) rules_scala allow you to disable transitive dependencies, but IMO that is very cumbersome to work with, especially relative to the savings I've seen. FWIW, Google's choice for Java:

Bazel always passes the entire transitive classpath to javac, not only the direct classpath. This frees the user from having to aggregate their transitive dependencies manually. In other words, javac never fails because of a missing symbol, as long as every rule specifies its direct dependencies. https://blog.bazel.build/2017/06/28/sjd-unused_deps.html

(3) Creating enormous numbers of targets is unappealing because I'd rather have developers work with logical dependencies graphs. Oscar's wording of this perspective in 2016.

I know the canonical answer to this is to make finer grained targets, but to me this is not a complete solution. First, it is a bit painful to create one target for each item in the logical dependency graph just to speed up compilation. Second it conflates the real logical blocks of your system, which people can understand, with the nodes in the strict dependency graph, which might be enormous. https://groups.google.com/forum/#!topic/bazel-discuss/3iUy5jxS3S0

(4) Worker strategies already share state across subsequent invocations; in fact, sharing state is the entire point of having a worker. It's essentially impossible to prove them correct -- whether JVM JIT or Zinc caching, and in fact, both of these have had bugs. The question is: is the state sharing correct enough? Based on considerable recent experience, I say yes to both, but again, it's not a proof.


A little off-topic, but based on learning from rules_scala_annex, even outside incremental compilation I do think that rules_scala has good reason to adopt zinc: (a) the dependency detection (for strict deps and unused deps) is far more precise than the current state of rules_scala, and (b) it makes test discovery for every Scala framework easy.

If that were done, opt-in Zinc caching for workers could be a simple (if controversial) addition.

ittaiz commented 6 years ago

Re your last point of dependency detection- I’d really wish this was a module part of zinc we could depend on...

johnynek commented 6 years ago

I would welcome bazel adding the ability to make dynamic internal targets that could automatically split up big target mudballs, but I don't have the time to try to keep pushing bazel devs on this and they have not been super excited. Even then, I'm not sure it would be faster since just building that internal graph and keeping it accurate is the part about zinc (I'm told) that causes the reproducibility problems.

We have about 60-70 developers editing maybe 500k lines of scala at Stripe, and the non-transitive classpaths along with the recently enabled unused-deps errors seem to work pretty well. Our CIs are very fast (several minutes for each pull), and the remote caching hit rate is pretty good: generally above 80% and very often above 90%.

If scala compiled as fast as java, I would probably prefer to pass the transitive classpath for compilation, but since scala is so much slower, I think the occasional pain of having to think about where a dependency comes from (which can be solved by using strict-deps=warn) is well worth the win in isolation from rebuilds. We should probably document this approach more: when the deps are set up, iterate without strict-deps warn; when they are not, use that mode until you don't have warnings and it will tell you what targets to add.

I disagree that sharing state is the point of the worker, the point of the worker is to share the jitting and classloading, but still to have a stateless pure function for build. We have had very few bugs around that, and we have a test (which I think zinc does not) that our builds are reproducible.

To me, the most compelling reason to use zinc is to get the test framework support. I am not opposed to using zinc if it can pass our reproducibility tests and doesn't cause a migration pain for existing production users. I am skeptical our users would be better off investing in zinc vs building a tool to split large targets automatically, which could then leverage a fully distributed cache and remote execution, something we preclude as soon as non-reproducibility enters the picture. Trading away remote execution, or the ability to share artifacts built by other laptops in a large org, seems like it would be a net negative.

In any case, annex is there today for those who may have different needs. I think we should stay focused on support for big code bases, large teams, and caching/remote execution efficiency. For folks not in that boat, I'm not sure I can say that bazel + scala is there today.

We welcome contributors, and would love contributions to improve any of these aspects. But frankly, if you are not here for reproducibility, sbt and/or bloop will probably be your better bet.

jvican commented 6 years ago

(I'm not going to go into the meat of the discussion, just provide a few comments.)

@ittaiz @johnynek I'm happy to make a PR to rules_scala to add support for incremental compilation via bloop, which has better APIs than Zinc, better performance than zinc, and implements a lot of optimizations at the build level (build pipelining is one, but I have a rough prototype of parallel compilation in the works). Let me know if you would accept this PR. I would be happy to take ownership of that code too.

The Zinc APIs are rough, and I'm currently working to deprecate the use of them in as many build tools as possible in favor of Bloop, as it's difficult to ensure that all our users use Zinc correctly and that makes it difficult to maintain compiler internals. I recommend rules_scala_annex does the same.

After I'm done with performance and correctness, getting the incremental compiler to be reproducible will be one of my priorities if a user like rules_scala would accept a Bloop implementation.

Re your last point of dependency detection- I’d really wish this was a module part of zinc we could depend on...

I discourage the use of Zinc for dependency detection. All of the internals are implementation details, we provide no guarantee about them and the best solution (and more maintainable one) is to depend on stable compiler APIs as you are currently doing with your compiler plugin. Depending on zinc to get marginally better dependency tracking is a shortcut, and this problem merits a good solution. So if you have issues with the current implementation, I propose you file tickets about the limitations and a compiler engineer could have a look at them or provide pointers to fix them.

johnynek commented 6 years ago

@jvican would you be interested in making a .md or google doc on a design proposal? It is hard for me to see what the design space looks like when both bazel and some other tool want to own a cache. How do you invalidate that cache? Once we have a design, we can see if it seems like a PR to consider.

I think we should be pushing towards buildfarm and remote caching, and I wonder how bloop fits into that. Since bazel is parallel and incremental at the target level, it seems the main thing we get is the ability to make giant targets, but that is an anti-pattern in bazel.

I still lean towards the highest leverage thing being a tool to split BUILD files up automatically (preserving existing public targets but minimizing the internal DAG).

Seems like a fun problem. Does no one really want to solve it!? Sculpt + some dag minimization seems like fun.
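
As a hypothetical sketch of that "sculpt + dag minimization" idea (it assumes you already have file-level dependencies such as scala-sculpt can report), collapsing cycles into strongly connected components gives the candidate fine-grained targets:

```scala
object TargetSplitter {
  type Graph = Map[String, Set[String]] // source file -> files it depends on

  // Kosaraju's algorithm: each resulting component is a candidate fine-grained target,
  // since files in a dependency cycle must stay in the same target.
  def components(deps: Graph): List[Set[String]] = {
    val nodes = deps.keySet ++ deps.values.flatten
    val order = scala.collection.mutable.ListBuffer.empty[String]
    val seen = scala.collection.mutable.Set.empty[String]

    // First pass: record nodes in post-order (finish time).
    def visit(n: String): Unit =
      if (seen.add(n)) { deps.getOrElse(n, Set.empty).foreach(visit); order += n }
    nodes.foreach(visit)

    // Second pass: walk the reversed graph in decreasing finish time.
    val reversed = nodes.map(n => n -> nodes.filter(m => deps.getOrElse(m, Set.empty)(n))).toMap
    val assigned = scala.collection.mutable.Set.empty[String]
    def collect(n: String): Set[String] =
      if (assigned.add(n)) reversed(n).filterNot(assigned).flatMap(collect) + n else Set.empty
    order.reverse.toList.map(collect).filter(_.nonEmpty)
  }
}
```

Mapping those components back onto BUILD targets (and naming them sensibly) is the actual hard part, of course.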

jvican commented 6 years ago

@jvican would you be interested in making a .md or google doc on a design proposal? It is hard for me to see what the design space looks like when both bazel and some other tool want to own a cache. How do you invalidate that cache? Once we have a design, we can see if it seems like a PR to consider.

Sure thing, would be happy to do that. But as that would take me some time, I would prefer to wait a bit to see more responses in the thread. It's also important that we set expectations and that we discuss tradeoffs, as that will shape much of the design.

I think we should be pushing towards buildfarm and remote caching, and I wonder how bloop fits into that. Since bazel is parallel and incremental at the target level, it seems the main thing we get is the ability to make giant targets, but that is an anti-pattern in bazel.

Remote compilation is not at odds with bloop. You can disable incremental compilation in the CI, while still getting all the performance benefits that bloop provides over scalac even for small targets. Build pipelining (the ability to run compilation of projects before the compilations of their dependencies are finished) will give you a significant compilation speedup if you have a parallel build graph.

In fact, given how important remote caching is for bazel, I would strongly suggest that you should not enable incremental compilation in remote servers, you should only use it for local development and local caches.

Beyond the better compile times you would get over scalac by default, Bloop/zinc shines at speeding up incremental workflows. For example, I'm currently tweaking bloop to provide sub-second incremental compiler executions -- with the goal of providing users with instant output from their test suite or running program for each change they make in a target(s). This is an example of why a bloop integration would mean providing a more productive setting for your users.

As I see it, these would be my three goals with the Bloop integration I propose:

  1. Making sure that bazel is still reliable and reproducible by not contaminating the remote caches.
  2. Increasing the compilation throughput in your bazel remote servers.
  3. Providing the fastest edit-compile-test workflow for local development.
johnynek commented 6 years ago

We currently want to use the remote execution and caching for local development. How does that interact with bloop?

We don't want to have to locally build the whole repo, especially for users that pop into a big repo rarely. We want them to fetch the cached artifacts, and not spend 20 minutes rebuilding the first go.

jjudd commented 6 years ago

My concern with Bloop is that it seems to be solving a very similar problem as Bazel workers. It also introduces a dependency on running the Bloop server, which seems incompatible with remote execution.

From: https://scalacenter.github.io/bloop/docs/bloop-basics/

Bloop is a build-tool-agnostic build server that makes hot compilers accessible to all your toolchain — IDEs, build tools and terminals get peak performance at all times.

This is largely what Bazel workers do.

If Bloop provides both a build server and a better API for Zinc, it would be nice to see those concerns separated. That way you could write a worker using the stable and improved API for Zinc.


I would like to see us write a tool to break large targets up using https://github.com/lightbend/scala-sculpt so that you could automatically split large targets. This might not be very hard, but we haven't started the project yet. Would someone on your end be interested in contributing that?

We have bounced around ideas about doing something like this at Lucid. We have almost 1m lines of Scala to break down into smaller dependencies. A tool to take Scala code and break it into smaller projects would be great. However, I'm somewhat concerned about the number of circular dependencies we would find in our existing codebase inside of our projects.

We would like to do something like this in the long term, but our Scala build team is a bit short staffed at the moment. We're currently hiring more people to work on this, but hiring + getting familiar with the domain takes time.

Our CIs are very fast (several minutes for each pull), remote caching hit rate is pretty good generally above 80% and very often above 90%.

Remote caching is fantastic. It is one of my favorite parts of Bazel so far. Internally, using Bazel + annex, CI populates an http cache in S3. We have proxies that proxy that s3 cache in a closet in the office and on developers' laptops. We see pretty good cache rates in dev. Here's an example of a local build using the closet cache on a commit that was previously built in CI. It takes several minutes without caching.

Target //user-service:user-service up-to-date:
  bazel-bin/user-service/user-service.jar
  bazel-bin/user-service/user-service-bin
INFO: Elapsed time: 6.145s, Critical Path: 3.17s
INFO: 625 processes: 625 remote cache hit.
INFO: Build completed successfully, 956 total actions

We run in dev and CI with strict_deps and unused_deps set to cause build errors. We added support to ibazel for running buildozer commands based on bazel output. This has been merged back upstream: https://github.com/bazelbuild/bazel-watcher/pull/134 This + Annex's buildozer messages makes maintaining a minimal classpath quite ergonomic. ibazel automatically presents a [y/N] prompt to execute a buildozer command when you stop using a direct dependency or start directly using a transitive dependency.

This does rely on a fork of bazel-deps, which adds "transitivity: deps" support to bazel-deps. https://github.com/lucidsoftware/bazel-deps/tree/java-import-external Sadly, we haven't had a chance to update our work and start the process of getting things merged back upstream. It's on our backlog, though.

I am not opposed to using zinc if it can pass our reproducibility tests and doesn't cause a migration pain for existing production users. I am skeptical our users would be better off investing in zinc vs building a tool to split large targets automatically, which could then leverage a fully distributed cache and remote execution, something we preclude as soon as non-reproducibility enters the picture.

In annex there is a goal of making incremental compilation configurable. We have several engineers internally who value correctness over speed, and would like to disable incremental compilation. We have a bug internally in our tools/bazel, which always enables incremental compilation in dev, so we sadly cannot comment on how deterministic annex is with incremental compilation disabled. We should know more soon, though.

I do believe supporting incremental compilation in some fashion will be necessary if we want Bazel + Scala to be widely adopted. I imagine a typical migration path will be sbt -> Bazel with large projects + optional incremental compilation -> incrementally decrease the size of build targets with the sculpt based tool and some elbow grease -> Bazel with small projects and no incremental compilation (or at least opt in incremental compilation).

It's really hard to migrate a large codebase to Bazel + break down build targets at the same time. Until build targets are small enough to be fast, you either support sbt + Bazel or developer productivity suffers. Incremental compilation provides a nice intermediate step. Even if it doesn't play nicely with Bazel.

I know that the Swift folks are at least considering incremental compilation as well: https://github.com/bazelbuild/rules_swift/issues/48


So if you have issues with the current implementation, I propose you file tickets about the limitations and a compiler engineer could have a look at them or provide pointers to fix them.

We started with https://github.com/scalacenter/classpath-shrinker for strict/unused dependency analysis. However, we found Zinc's analysis to be better. Considering annex is also compiling with Zinc, this works well. A long term compiler based solution to this issue would be great. However, it's hard to give up the better dependency analysis in the short term.

johnynek commented 6 years ago

It's a real shame we can't agree on vision given everyone has people working on related things...

Maybe it is good, but we are certainly doing the competition thing in scala build tooling and very little cooperation: sbt/zinc, bloop, mill, fury, pants, bazel (rules_scala, annex)... all trying to build some next generation tooling.

Maybe in a few years a clear winner will shake out, but in the mean time we will have a ton of losers and a huge amount of effort spent on the losing forks.

johnynek commented 6 years ago

Thinking on this more, I can see a few things we lack agreement:

  1. simplified build definitions (bazel, fury?) vs full turing complete programming environments (sbt, pants, mill?)
  2. does correctness + caching ultimately yield better performance than approximate, but incorrect caches (e.g. scalac caching jars by name, not content, or the cache bugs with zinc/sbt).
  3. implicitness vs explicitness (bazel requires explicit dependency DAG, vs sbt finding implicit dependencies inside a module).

So maybe it is fine that we are exploring the space...

But here, for this repo, I think the most relevant question is: do we buy bazel's argument that correctness leads to better performance via caching and remote execution. We keep having this debate about using less correct (e.g. known non-reproducible builds) yet arguing this can lead to better performance. I think this can be true for a small enough build, but not for a build big enough to be relevant industrially.

If you do have such a small build, I think using sbt is not a bad option (or maybe try mill, fury etc...)

If you don't care about reproducibility in a large repo, I'd suggest pants. It's actually really usable and uses zinc caching. I didn't like all the weird bugs we hit with refactoring code due to buggy caching, but others seem to not mind.

My interest in bazel is to explore the space of reproducible (in other words, correct) builds, and I believe the bazel hype: if the builds are reproducible, caching makes them fast.

Of course, this requires small enough targets so each build can be O(seconds). We generally do that at Stripe, although some folks do get lazy and make mega targets with globs which take 1 minute to compile. I think this is a problem to tackle, but the way I'd like to tackle it is with tooling to split the targets.

@ittaiz can weigh in, but I hope the vision of these rules (and I hope Stripe's approach to build) is to keep the focus on reproducibility. annex, mill, pants, etc... exist for people who don't want to work towards the reproducibility-enables-speed vision (or the correctness-is-its-own-reward vision if you don't care about the speed part of the argument).

We keep hearing this "use non-reproducible builds for local dev" but I don't see how to integrate that with bazel and bazel's remote execution and cache easily. But, in any case, I think annex is very open to this idea, so maybe you can show the way and Lucid would be interested in sharing their results and we can all learn from it.

jvican commented 6 years ago

First array of comments answering @jjudd's feedback.

My concern with Bloop is that it seems to be solving a very similar problem as Bazel workers. It also introduces a dependency on running the Bloop server, which seems incompatible with remote execution.

I don't understand your concern. In the early days, I did a bloop integration in rules_scala_annex that was never upstreamed, where Bloop was itself a remote worker. You can use bloop as a remote worker in bazel; there is no overlap.

If Bloop provides a both build server and a better API for Zinc, it would be nice to see those concerns separated. That way you could write a worker using the stable and improved API for Zinc.

You can already do so. However, let me reiterate that as a maintainer and active developer of both I want bloop to become the standard way of doing compilation in build tools because it has a much better design and allows the core team (me and other compiler engineers) to fix compilation correctness and performance issues at once for all the clients.

so we sadly cannot comment on how deterministic annex is with incremental compilation disabled

Both rules_scala and annex should be as deterministic as the underlying compiler is when incremental compilation is disabled. 2.13.0-M5 is the first release that makes some promises with regards to deterministic compilation.

I do believe supporting incremental compilation in some fashion will be necessary if we want Bazel + Scala to be widely adopted. I imagine a typical migration path will be sbt -> Bazel with large projects + optional incremental compilation -> incrementally decrease the size of build targets with the sculpt based tool and some elbow grease -> Bazel with small projects and no incremental compilation (or at least opt in incremental compilation).

If you're interested in the sbt -> bazel migration problem, I've left some thoughts in https://github.com/stripe/sbt-bazel/issues/4. The ticket didn't get much attention by the Stripe team, but I think it's an interesting possibility if you're time constrained and value correctness of the migration over creating your own solution.

We started with scalacenter/classpath-shrinker for strict/unused dependency analysis. However, we found Zinc's analysis to be better.

How is it better? Have you used the last improvements that went into unused dependency analysis to detect macros?

Considering annex is also compiling with Zinc, this works well. A long term compiler based solution to this issue would be great. However, it's hard to give up the better dependency analysis in the short term.

My recommendation still stands there. Though I see the value in using it in the short term, this is probably not the best solution long-term. If something breaks or we decide not to follow some dependencies in Zinc internals, you'll be affected by that and I cannot promise a happy end.

jvican commented 6 years ago

Maybe it is good, but we are certainly doing the competition thing in scala build tooling and very little cooperation: sbt/zinc, bloop, mill, fury, pants, bazel (rules_scala, annex)... all trying to build some next generation tooling.

There's certainly competition in the build tools space, but bloop is not part of it because it is not a build tool. The plan, in fact, is for Bloop to become the de-facto tool that everyone uses to compile their code in the community. All I'm doing in this thread is proposing more cooperation in this front! I don't want everyone to roll their own Scala.js/Scala Native/test/zinc infrastructure. The compiler team is already small enough that we need to invest our efforts in the most effective way, and centralizing everything into a modular build server is the idea behind Bloop.

Maybe in a few years a clear winner will shake out, but in the mean time we will have a ton of losers and a huge amount of effort spent on the losing forks.

I hope I do not offend you Oscar, but I would worry that annex exists in the first place and I would take that as a signal that rules_scala is too conservative.

No matter what we're discussing in this thread, we're not talking about a fundamental disagreement. The incremental compilation integration could be optional, and could be integrated in the main repo.

simplified build definitions (bazel, fury?) vs full turing complete programming environments (sbt, pants, mill?)

Let's not turn this discussion into a general discussion about build tools strategies 😛 Let's stick with discussing the values of incremental compilation for the potential popularity of bazel in the Scala community.

But here, for this repo, I think the most relevant question is: do we buy bazel's argument that correctness leads to better performance via caching and remote execution. We keep having this debate about using less correct (e.g. known non-reproducible builds) yet arguing this can lead to better performance. I think this can be true for a small enough build, but not for a build big enough to be relevant industrially.

I think this is even more relevant for a big build. What would it take for me to convince you? i.e. what experiments could I do to show you that caching and remote execution is not enough to achieve a fast edit-compile-test workflow?

jvican commented 6 years ago

We keep hearing this "use non-reproducible builds for local dev" but I don't see how to integrate that with bazel and bazel's remote execution and cache easily. But, in any case, I think annex is very open to this idea, so maybe you can show the way and Lucid would be interested in sharing their results and we can all learn from it.

If you keep yourself open to the idea of merging a bloop integration (which again would not necessarily add only incremental compilation, but better compilation performance across the board with no risk to correctness), I can draft a design document and we can discuss more about how I see this playing with remote execution and caching. I do have solutions to some of the concerns you've raised.

That being said, I would be happy to work with the Lucid guys too (which have more experience in the bazel-zinc field). But, ultimately, I think we should not be splitting up our efforts. It does not motivate me at all to create a bloop integration for a small part of the Bazel community. If you cannot agree internally about what the best approach is (or allow the integrations to live in the main repo), then I prefer to invest my time in other build tools where my work is more impactful.

johnynek commented 6 years ago

Jorge, I use this tooling and generally have a few second cycle.

I’m not speculating. I actually use this all day. You are making a non-existence claim, but I actually use this system to do my work in a medium size code base (several repos linked as bazel external source dependencies totaling about 500k lines of code).

If you worked in such a repo, that was fully reproducible, and well written targets took 1-5 seconds, do you really think you would want to trade reproducibility and have an unclear impact on your caching to get the same order of performance? Also you have to do a bunch of work to test and deploy that to 60 engineers when the current system seems to be working well. I don’t think you have actually compared to a real system doing as I propose, yet you also don’t trust our experience report that small targets with caching has done very well (our second largest repo completes the CI before you can finish making the PR, the largest has some large docker images that prevent that).

Basically: lots of people are wagering that non-reproducibility is a trade worth making (actually I think virtually everyone is). Why must every tool do this? Why isn’t pants or annex the right tool for people who don’t care about reproducibility?

jvican commented 6 years ago

I don’t think you have actually compared to a real system doing as I propose, yet you also don’t trust our experience report that small targets with caching has done very well

I don't doubt your experience report at all! Please don't get me wrong, I have not worked in such a system and I wouldn't dare to say I understand how working with bazel in big projects is like because I do not have access to these repositories.

However, I'll say that the fact that @jjudd et al have experimented with Zinc proves there's interest in a faster workflow than 5 seconds per compilation per target. I've also talked to people working on big bazel projects and from what I've learned they miss incremental compilation. In their experience, making changes to several targets becomes slower.

Ultimately the numbers depend on the granularity of your targets. Maybe at Stripe you're incredibly disciplined at splitting up the targets and containing the large compilation times. But I don't think every user of Bazel will because (1) it is time consuming, (2) changes to the dependencies of a target can often create merge conflicts with upstream (which slow down development) and (3) correctly organizing a build is a challenging task.

I'll let the contributors of annex address this point since they are more qualified than I am to judge the status quo.

Why must every tool do this? Why isn’t pants or annex the right tool for people who don’t care about reproducibility?

Because I believe a build tool is all about its ecosystem and community, more than about different technical properties. Pants is a great build tool and it has a different community of contributors and rules for languages. Annex competes with rules_scala for the same community/ecosystem, and that will cause confusion in Bazel users.

ittaiz commented 6 years ago
  1. I'll dive into this conversation later this week and try to add my 2c.
  2. I love the very fact this discussion is happening, I think it's important and I hope that through it we can better understand how we can collaborate.
  3. I might be wrong but I'd like to suggest taking a bit more time to respond. It feels both of you are, to some extent, replying very fast and that can get us in a loop. I hope I'm not offending anyone by saying so.
  4. In general, and again without diving into the details, I agree with @johnynek about favoring reproducibility over (local) speed. We have a setup where devs use the exact same cache as CI and given we have >1M LOC of scala over a few tens of repos if people start using "clean" from time to time it can kill their productivity. This is not to mean we can't collaborate (disable incremental compilation?) and in fact I think we need to see how we can collaborate.
johnynek commented 6 years ago

If anyone is interested in helping make a tool to split large targets I started this PR:

https://github.com/lightbend/scala-sculpt/pull/61

pauldraper commented 6 years ago

If scala compiled as fast as java

That is what it comes down to. If Google used Scala, Swift, Haskell, or Rust I think opinions would differ.

I disagree that sharing state is the point of the worker, the point of the worker is to share the jitting and classloading, but still to have a stateless pure function for build.

"Pure" in the sense that same input gives the same output, yes. But have a risk of being impure in that code may improperly keep state (JIT, statics).

If you don't care about reproducibility in a large repo, I'd suggest pants.

Pants is indeed an example of the tradeoffs made when the tool's builders had Scala as a primary language.

We should probably document more this approach: when the deps are set up, iterate without strict-deps warn, when they are not, use that mode until you don't have warnings and it will tell you what targets to add.

FYI, by default rules_scala_annex runs in a mode where indirect and unused checks are part of a non-default output group. Not fundamentally different; just a little experiment.

I don't see how to integrate that with bazel and bazel's remote execution and cache easily.

Last I checked, remote execution doesn't work with workers.

But frankly, if you are not here for reproducibility

We keep hearing this "use non-reproducible builds for local dev"

I'm here for reproducibility. Same input, same output. The place where you and I differ @johnynek is where/how reproducibility is enforced/tested.

Bazel optionally allows users to sandbox actions. This enormously magnifies I/O latency (or at least used to). So it can be turned off or on. Users who turn it off are not asking for unreproducibility. They are opting out of a reproducibility enforcement mechanism due to performance overhead. Workers allow for state sharing. Users who use workers are not asking for unreproducibility. They are opting out of a reproducibility enforcement mechanism due to performance overhead. Users who give Zinc the previous state are not asking for unreproducibility. They are opting out of a reproducibility enforcement mechanism (which is to hide the previous state from Zinc).

I hope the vision of these rules (and I hope Stripe's approach to build) is to keep the focus on reproducibility. annex, mill, pants, etc... exist for people who don't want to work towards the reproducibility-enables-speed vision

How does that stand with https://github.com/bazelbuild/rules_scala/issues/125#issuecomment-279471684 ?

Re your last point of dependency detection- I’d really wish this was a module part of zinc we could depend on...

To me, the most compelling reason to use zinc is to get the test framework support.

I think we should stay focused on support for big code bases, large teams, and caching/remote execution efficiency.

To be clear... Zinc can be used without stateful incrementality. In fact, non-stateful is the default in rules_scala_annex.

The point of the Zinc project is to understand the dependencies of inputs and outputs while compiling Scala. If you have that, then you can do a number of things, including incremental builds but also indirect deps detection, unused deps detection, and test detection.

Using it in stateful or non-stateful mode is as easy as whether you provide it the analysis from the previous run. That's it. Having the former option in rules_scala doesn't hurt the viability of the scala-sculpt project.
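
To make that concrete, a hypothetical wrapper might only differ in whether a previous analysis is handed back (the Analysis type and compile body here are stand-ins, not the real zinc API):

```scala
import java.nio.file.{Files, Path}

// Stand-in for zinc's analysis output; not the real API.
final case class Analysis(bytes: Array[Byte])

object CompileMode {
  // None for hermetic/CI builds; Some(path) when a worker is allowed to keep local state.
  def loadPrevious(previousAnalysisPath: Option[Path]): Option[Analysis] =
    previousAnalysisPath.filter(p => Files.exists(p)).map(p => Analysis(Files.readAllBytes(p)))

  def compile(sources: Seq[Path], previous: Option[Analysis]): Analysis =
    previous match {
      case Some(prior) =>
        // stateful: an incremental compiler would diff sources/classpath against `prior`
        // and recompile only the invalidated subset
        Analysis(prior.bytes)
      case None =>
        // stateless: recompile everything, exactly like plain scalac
        Analysis(Array.emptyByteArray)
    }
}
```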

bpatters commented 6 years ago

Wow great discussion. To clarify the scenarios I'm talking about with Developers, caching is not relevant. The scenario is I have a freshly completed build and I'm not writing code to develop a new feature. A simplistic, but realistic, scenario is I make changes to one file to fix a bug in a previous commit, where I had a typo in a column name mapping for a Slick table. This changes Node A in the DAG. I then write an integration test to verify it in node T. In order to compile T, bazel has to build nodes A, B, C, D, E and then T.

This complete DAG build for this 1 line change can take over 2 minutes in our situation. This makes iteration time for developers adding functionality to the product huge compared to SBT, which takes about 2-3 seconds to compile.

Breaking up our build into smaller nodes is part of our long term plan, though restructuring our build to the nano level so that bazel works well with it is not ideal, but smaller targets than we have now are certainly possible. However, there is no guarantee that smaller targets make the DAG wider rather than just deeper, in which case the build time would not improve at all.

pauldraper commented 6 years ago

It took me a minute to parse that. "To clarify the scenarios I'm talking about with Developers, distributed caching is not relevant".

And that's true. The compile/change one line/compile workflow isn't helped at all by distributed caching. (Distributed execution, on the other hand, can potentially help, depending on the number of targets to be built, the network latency, and size of the cluster.)

johnynek commented 6 years ago

I think it is a bit more subtle: when you change one line, you don't want to invalidate all your other cache. Next, due to non-reproducibility, users get trained to clean to fix issues that happen (this happened when I worked with pants and it happens still to me periodically with sbt). By contrast I recall zero times in the past year or so at Stripe that a bazel clean fixed a build issue (bazel clean --expunge has fixed things since there seem to be some lurking thread bugs with downloads sometimes failing and blocking build, the last scala related bazel cleans were related to macro non-reproducibility and we think we have gotten rid of all the instances of that we know of at Stripe).

Training users to clean hammers performance in a big repo unless, again, your cache story is really good: clean is only touching local state (which it does), and the remote cache is well populated. However, who populates the remote cache if the users are building non-reproducibly? You can imagine some side system you deploy to have workers populating a cache, or you can use build-farm (which is not a side system). Then, again, the question comes which things you submit to build-farm vs not.

If you imagine building some system to switch to local non-reproducible builds, you also have to do that in a way that doesn't change cache-keys which, again, would wipe out the cache you depend on for inputs into the target that you are iterating on.

I want to comment on this because it is not simple or clear, how to either implement this or teach users a model for how to work with this.

My conjecture is that working on a tool that automatically factors large targets, but then just use normal fully reproducible builds using caching/buildfarm, is easier and more ergonomic, and in the end, offers great performance.

We could make it even more ergonomic by making a bazel proposal and working on the code to build an API to express internal dags of large targets. @pauldraper had a nice proposal on that in the past, but I don't think anyone has yet had time to implement his proposal as a sketch. I think waiting for bazel to solve this may be a long wait since the team responsible for bazel has not been highly prioritizing solving this problem.

pauldraper commented 6 years ago

I think it is a bit more subtle: when you change one line, you don't want to invalidate all your other cache.

Yes, that is a good way of putting it. On a practical level, I think it is very hard to take a typical Scala code base and create enough targets so that the Bazel cache invalidation is even close to the Zinc cache invalidation. I.e. for @bpatters to go from a 2-3s incremental compilation with sbt to < 4-6s with Bazel.

This is really a lot of "well, MY code base". It would be interesting to write a Bazel build for a well-structured, large-ish OSS Scala code base and test this out. Like https://github.com/playframework/playframework .


@pauldraper had a nice proposal on that in the past, but I don't think anyone has yet had time to implement his proposal as a sketch.

For reference: https://docs.google.com/document/d/16iogGwUlISoN2WLha2TAaUdpYCjRiVQ2sRQ7--INxkg It has a lot of interesting uses. E.g. testing with an automatically determined number of shards. Or quickly scanning file dependencies of TypeScript and automatically creating a granular action graph. It's really useful if you have a short-ish action that uses file contents to inform the structure or number of long-ish actions. I'm skeptical it could work specifically for Scala compilation, but it'd be great if it could.

I think waiting for bazel to solve this may be a long wait since the team responsible for bazel has not been highly prioritizing solving this problem.

Yeah. Bazel would solve this and a dozen other things tomorrow if C/C++ (which make extensive use of dependency pruning and other advanced features) were in Skylark. Ah well.


By contrast I recall zero times in the past year or so at Stripe that a bazel clean fixed a build issue

I've had several times I needed to bazel clean, related to bugs in workers (i.e. one of the ways to remove reproducibility protections) -- which, admittedly, I was at the time actively developing and so it was "my fault".

However, who populates the remote cache if the users are building non-reproducibly?

CI or build farm. Lucid uploads from CI. Google uploads from build farm. Uploading from dev environments is dangerous business; https://github.com/bazelbuild/bazel/issues/3360 .

Ulf: "As such, using a user-writable shared cache is something that we'll warn people against."

johnynek commented 6 years ago

Maybe I can make my point more quantitatively.

Pclean: the probability you run a clean before each build.
Tclean: the time to run a clean build of a target.
Ttarget: the time to run a scalac/non-incremental build of a given target
Tzinc: the time to run a zinc build for a given target

What we are debating is if the following predicate is true, in which case imperfect caching via zinc is a win:

Pclean * Tclean + (1 - Pclean) * Tzinc < Ttarget

If this is true, a noisy cache that you sometimes clean is still a win on average. The condition for this is:

Pclean < (Ttarget - Tzinc) / (Tclean - Tzinc)

We can readily evaluate these: at Stripe, most Ttarget values are something like 1-10sec, but we have metrics on it. The worst cases are some very fat, ill-maintained targets that people keep throwing junk into, which have gotten up to 200sec. At Stripe in our largest repo, Tclean ~ 20 min (1200sec). Let’s say Tzinc ~ 1s.

So, plugging this in we need:

Pclean < (10 sec - 1sec) / (1200sec - 1sec) = 0.75%

This is very rare: you need to keep your confidence for a period of 133 builds.

That’s assuming a 10sec average Ttarget. If you assume Tzinc = 0, you need Pclean < Ttarget/Tclean. Since in a monorepo Tclean gets very large, Pclean needs to go to zero for this to be a win.
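
Restating that arithmetic as code (just the inequality above with the numbers plugged in):

```scala
object CleanBreakEven {
  // Zinc's noisy cache wins on average only while Pclean stays below this threshold.
  def maxPClean(tTargetSec: Double, tZincSec: Double, tCleanSec: Double): Double =
    (tTargetSec - tZincSec) / (tCleanSec - tZincSec)

  def main(args: Array[String]): Unit = {
    // Numbers from above: 10s targets, 1s zinc rebuilds, 20min (1200s) cleans.
    println(maxPClean(10, 1, 1200)) // ~0.0075, i.e. at most one clean per ~133 builds
    // With Tzinc ~ 0 the bound degenerates to Ttarget / Tclean.
    println(maxPClean(10, 0, 1200)) // ~0.0083
  }
}
```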

If, with smaller targets you can get your build time down to 2 seconds, that expands up to cleaning only once every 1199 builds ((2-1)/(1200-1)). Furthermore, an expert scala user may have an intuition for when to clean, but once clean is common in your workflow, neophytes are running clean every time anything looks remotely fishy, so they get particularly hammered and see even worse performance than experts who can more accurately guess what is likely to be an actual build error vs an error in the user code. I have seen this play out both at Stripe and Twitter. We actively tell people to file a JIRA if bazel clean fixes anything for you since it never should. We try to talk people out of ever thinking they should even try that. At twitter, with zinc-enabled-pants we had a script called wash-pants which nuked all possible state it could find and rebuilt the repo: we had to tool up cache cleaning because it was frequent enough.

Now, some people may be in a world where Ttarget = Tclean, e.g. they are effectively building 1 target. Hopefully it is clear why bazel is bad here: Pclean approaches 1 when the approximate rebuild solution is still better. It sounds like @bpatters is in this regime, so sbt probably is a lot better.

My case is this: we should be optimizing for the case where Tclean is huge (say 20-30 minutes), if you don't expect to be in this regime, you should probably use sbt. Next, the targets need to be small (10 seconds or less), again, if you can't dedicate the time to factor your build, bazel is going to be a bad time now.

If we redirect the energy of this thread to a tool to factor the build, we could get some real advances here (e.g. take https://github.com/lightbend/scala-sculpt/pull/61 and make an optional output of the rules a suggested list of N smaller targets that the target could be factored into).

I want to help make it easier to adopt bazel. I understand that big targets are painful to convert to rules_scala, but I think we need to directly solve that in tooling and not try to paper over the issue by encouraging large targets enabled by a non-reproducible compiler.

Lastly, even if you use a non-reproducible compiler locally, no one on this thread is arguing to make CI non-reproducible. With large targets, even with caching, CI will be slow because you will tend to see invalidations. So you will have a bad time with large targets in CI if you go this route, and a slow CI is also bad.

To me, the only solution here is better tooling to split large targets. After we have that, we can consider pushing again to expose bazel's support for dynamic build graphs to allow bazel to automatically split targets based on discovered dependencies so it can manage the caching (i.e. @pauldraper 's proposal).

Personally, I'm not interested in yet-another-non-reproducible-build-tool. I think we have plenty of those. I want to help advance the state of the art and move us to a world where we expect reproducibility and we leverage it for amazing caching. I want small targets to enable amazing parallelism in my build. I hope people who are excited about this vision help bazel and rules_scala get there.

jjudd commented 6 years ago

In all of this discussion, I still believe that toggleable, defaulted-off incremental compilation is the best path forward for Bazel + Scala. It provides a migration path for large repos that currently always use Zinc, while preserving reproducibility for those for whom it is a requirement.

At Lucid, our sbt build currently has reproducibility issues similar to the ones you mentioned with Pants. We have had a few cutely named commands over the years that are equivalent to wash-pants. New engineers have a hard time telling Zinc reproducibility problems apart from actual compile errors, so they tend to clean too frequently. We have a monorepo, so clean builds are slow without Bazel's caching.

That being said, if a productive and fast dev workflow in Bazel had required us to refactor all of our Scala codebase up front to follow the 1:1:1 rule, then we would likely never have successfully migrated to Bazel. Maintaining two build systems for months at a time is challenging and expensive.

We are currently breaking up our TypeScript targets into smaller build targets, and it is a large amount of work. It is going to take a similar amount of effort to refactor our Scala code; it is just too easy to create circular dependencies within build targets. A tool would help with this in a major way, but I doubt it would decrease the effort enough to make blocking the migration on it worthwhile.

Toggleable incremental compilation enables a codebase with large Zinc-compiled Scala targets to migrate those targets to Bazel while maintaining its current developer speed (and, admittedly, its current reproducibility issues). Then, over time, teams can migrate targets to follow the 1:1:1 rule and stop using incremental compilation. In the meantime, users who value reproducibility over speed can disable incremental compilation using a local, gitignored .bazelrc.

Over time, Bazel may improve the power of Starlark as they migrate native rules, thus decreasing the need for incremental compilation. Perhaps Zinc will continue to improve and the probability of reproducibility issues will decrease, thus making incremental compilation more attractive. Perhaps we write a tool that is so powerful it is capable of migrating the entire Lucid codebase to the 1:1:1 rule with minimal effort.

In the meantime, toggleable incremental compilation seems like a compromise that defaults to reproducibility while enabling migration of large Zinc compiled Scala codebases.

I do not think this effort is in opposition to breaking down builds into smaller targets using Starlark/Bazel or with a tool. I view it as complementary. I see an opportunity to grow the Bazel + Scala community if we can provide an easy migration to Bazel by converting sbt targets to rules_scala targets using optional incremental compilation and https://github.com/stripe/sbt-bazel. This requires little refactoring at the start. Then codebases can use our to-be-built tool and some elbow grease to incrementally migrate to the 1:1:1 rule.

Overall, I see Bazel as a potentially very attractive build tool for Scala codebases of a certain size. I would like to find a path forward for rules_scala + Annex in such a way that meets the needs of everyone and positions Bazel + Scala for success as a next generation Scala build tool. I believe toggleable incremental compilation is key to making that happen.


Aside: in Annex, incremental compilation defaults to off. It is toggled using a worker_extra_flag. As far as I understand, worker flags are not part of the cache key, so even when using incremental compilation, you still get cache hits from CI. The disadvantage of this is that it enables or disables incremental compilation for all builds. It would be really neat to somehow toggle incremental compilation at the target level, but I am not sure, off the top of my head, how to do that while preserving cache hits. Perhaps we could use some kind of force_incremental_compilation_off flag per target that modifies the command line; that would be ok because the command line will be the same in CI and dev.

pauldraper commented 6 years ago

If you have distributed caching (correctly populated, reproducible artifacts, no https://github.com/bazelbuild/bazel/issues/3360 or similar), then the actual time to build the repo is Tclean' = Ttarget + (Tclean - Ttarget) * Rdownload, where Rdownload is the ratio between cache download time and compile time.

Say Rdownload = 0.1, i.e. it takes one tenth the time to download a jar as to compile it from source. (I could see it being far lower depending on your cache and network setup.)

Pclean < (10sec - 1sec) / (10sec + (1200sec - 10sec) * 0.1 - 1sec) ≈ 7%

So you need to edit, compile, edit, compile, edit, compile, etc. 14 times before cleaning.
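
For concreteness, here is the same arithmetic with the cache download ratio folded in, as an illustrative Scala sketch (the numbers are the examples from this thread, not measurements):

```scala
// Illustrative sketch only: effective clean time with a populated remote cache,
// Tclean' = Ttarget + (Tclean - Ttarget) * Rdownload, and the resulting
// break-even Pclean < (Ttarget - Tzinc) / (Tclean' - Tzinc).
object CachedBreakEven {
  def main(args: Array[String]): Unit = {
    val tTarget   = 10.0   // seconds to recompile the edited target
    val tClean    = 1200.0 // seconds for a cold clean build
    val tZinc     = 1.0    // seconds for an incremental (zinc) rebuild
    val rDownload = 0.1    // download time as a fraction of compile time
    val tCleanCached = tTarget + (tClean - tTarget) * rDownload
    val p = (tTarget - tZinc) / (tCleanCached - tZinc)
    println(f"Tclean' = $tCleanCached%.0fs, break-even Pclean ≈ ${p * 100}%.1f%% " +
      f"(clean at most 1 in ${1 / p}%.0f builds)")
  }
}
```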

So, you will have a bad time with large targets in CI if you go this route, and a slow CI is also bad.

True, though I'm fine with every one-line change causing an extra 60s of CI time. I'm less fine with every one-line change typically causing an extra 60s of developer time.


Personally, I'm not interested in yet-another-non-reproducible-build-tool.

Personally, I am also not interested in such a tool.

@johnynek you hold as a fundamental truth that "Zinc is non-reproducible*". That is, that given the same inputs, Zinc will give different outputs [because of other, internal state]. That principle underlies every argument, every point, every discussion, and every formula.

*In a 99.95% sense... as you said, even without Zinc, the Scala macros can be non-reproducible. Even Bazel itself can be non-reproducible: https://github.com/bazelbuild/bazel/issues/1162, https://github.com/bazelbuild/bazel/issues/3360, https://github.com/bazelbuild/bazel/issues/4770. Again, in the "except for this one rare bug" sense.

And yet, though it forms the underpinnings of every discussion, not once has a non-reproducible case been volunteered.

I believe Zinc is reproducible. (James, we haven't actually looked into and found the cause of any of those errors... there's something strange about them.) And I use it, with Bazel. And other people want to use it, with Bazel. Even 2016 Oscar wanted to use it, with Bazel.

FWIW, even Google internally uses/used a persistent Java compiler. https://groups.google.com/d/msg/bazel-discuss/3iUy5jxS3S0/smqhBOa0AQAJ

we've done (hundreds of?) thousands of builds with the incremental Java compiler, and we haven't encountered any correctness issues recently.

That's why we added the worker API, which anyone can use. This allows you to make the tradeoff between performance and trust (in the correctness of the tools) yourself.

GOOGLE, the owner of the greatest Tclean in the world, intentionally built Bazel to allow itself and its users to trust the correctness of persistent compilers. (And, at least at one time, made that trusting decision for itself.)

I don't think it's possible for me to convince you that Zinc is trustworthy, i.e. reproducibly correct. Hypothetically, even if it were, no amount of evidence would convince you.

Given that, does it make sense for the "Zinc is non-reproducible" users to use rules_scala or rules_scala_annex, and the "Zinc is reproducible" users to use rules_scala_annex?

jjudd commented 6 years ago

I'll echo @pauldraper here a bit. Incremental compilation provides benefits not only as a migration path; once Zinc reaches a certain threshold of reproducibility, it also becomes an attractive end state. I'm not sure how much effort is required to reach that threshold, and I'm not suggesting it should be the end state for everyone, but it is nice for those who want to use it. The choice/toggle seems essential here.

johnynek commented 6 years ago

I hope everyone has had a chance to record their thoughts here.

I don't know if any minds have been changed, but it is good we have laid out the arguments. I am happy to link to annex or any fork of this repo for users who want to use zinc's caching, but I'm not going to add stateful zinc caching to these rules (while I am involved with the project).

If @ittaiz disagrees and the users want to go in another direction I'm happy to get out of the way and will wish you all the best. In the mean time, I'm more interested in tooling to split large targets, perhaps in sbt-bazel https://github.com/stripe/sbt-bazel/issues/9

I'm open to a PR that uses zinc non-statefully, similarly to how we create the workers here, in order to have a better compiler API and support for the test API, assuming this can be done with zero (or very minimal) breakage to existing users and that we have no evidence or belief that it is less reproducible than allocating a new compiler on each run of the worker as we do now.

stuhood commented 6 years ago

And yet, though it forms the underpinnings of every discussion, not once has a non-reproducible case been volunteered.

@pauldraper : We should take the discussion elsewhere, but: I can point you to a few. Ping me on Twitter.

johnynek commented 6 years ago

I don't think people actually try to keep track of this, but I just saw this:

[info]   IncompatibleClassChangeError was thrown during property evaluation.                
[info]     Message: Found class cats.data.ContT, but interface was expected                           
[info]     Occurred when passed generated values (                                               
[info]       arg0 = FromFn(AndThen$83029018),                                              
[info]       arg1 = org.scalacheck.GenArities$$Lambda$33539/163349956@7ce03c1,
[info]       arg2 = org.scalacheck.GenArities$$Lambda$33539/163349956@315ab84a                              
[info]     )            

when changing a trait to an abstract class with sbt 1.1.6. I see bugs like this periodically. I think no one using sbt assumes that needing clean is a bug rather than a normal part of using the tool.

It would be interesting to instrument sbt so that it logs every time a user gets a compilation error that is fixed by a clean compile. It would be cool if it could suck up all the state and ship the before and after state on an opt-in basis. Maybe then zinc could collect enough data to fix these bugs.

Alternatively, they could build a test harness that applies diffs from GitHub and then checks whether each clean compile matches the incremental compile. If zinc built such a test harness and had evidence that things worked, I'd be thrilled to reevaluate. I don't think this has so far been a priority for the zinc/sbt team.
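
A minimal sketch of what such a harness could look like, purely as an illustration (it walks commits rather than applying diffs; the sbt commands, class directory, and commit list are placeholders, and it assumes the compiler itself produces deterministic output for identical inputs):

```scala
// Illustrative sketch only, not an existing zinc tool: for each commit, do an
// incremental compile, then a clean compile, and compare the resulting class
// files. A mismatch is a candidate incremental-compilation bug (assuming the
// compiler produces deterministic output for identical inputs).
import java.nio.file.{Files, Path, Paths}
import java.security.MessageDigest
import scala.collection.JavaConverters._
import scala.sys.process._

object IncrementalVsClean {
  // Placeholder build commands; swap in whatever drives zinc in your setup.
  def incrementalCompile(repo: Path): Int = Process(Seq("sbt", "compile"), repo.toFile).!
  def cleanCompile(repo: Path): Int       = Process(Seq("sbt", "clean", "compile"), repo.toFile).!

  def classFileHashes(classesDir: Path): Map[String, String] =
    Files.walk(classesDir).iterator().asScala
      .filter(_.toString.endsWith(".class"))
      .map { p =>
        val sha = MessageDigest.getInstance("SHA-256").digest(Files.readAllBytes(p))
        classesDir.relativize(p).toString -> sha.map("%02x".format(_)).mkString
      }
      .toMap

  def main(args: Array[String]): Unit = {
    val repo       = Paths.get(args(0)) // path to a git checkout
    val classesDir = Paths.get(args(1)) // e.g. target/scala-2.12/classes
    val commits    = args.drop(2)       // commits to replay, in order
    for (sha <- commits) {
      Process(Seq("git", "checkout", sha), repo.toFile).!
      incrementalCompile(repo)
      val incremental = classFileHashes(classesDir)
      cleanCompile(repo)
      val clean = classFileHashes(classesDir)
      if (incremental == clean) println(s"OK       $sha")
      else println(s"MISMATCH $sha: incremental output differs from clean output")
    }
  }
}
```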

pauldraper commented 6 years ago

@stuhood I will, thanks.


I think no one using sbt assumes that using clean is a bug, rather than a normal part of using the tool.

That could be true. With sbt, cbt, mill, scala-maven-plugin, Pants, etc. all using zinc, I'd hope that someone would have filed an issue. I've seen an sbt compile bug, but it turned out to be due to bad ivy/maven dependency caching instead.

FWIW (obviously not proof), zinc currently has no open reproducibility issues.

And Bazel itself has several reproducibility issues that have been open for months (particularly on Mac). Yet overall, it still has "very high" (and very useful) reproducibility. And both projects look to fix the very few issues that come up.

johnynek commented 6 years ago

I’m part of the problem I guess because I didn’t report this one.

It’s probably repeatable by converting traits to abstract classes in the cats build.

jvican commented 6 years ago

when changing a trait to an abstract class with sbt 1.1.6. I see bugs like this periodically. I think no one using sbt assumes that using clean is a bug, rather than a normal part of using the tool.

I don't think this is true. clean is only used exceptionally. I myself use sbt more than 8 hours a day, and errors with regard to incremental compilation do not, by any means, happen "periodically". I think this is an exaggeration. Also note that you're not using the latest Zinc and sbt versions, which have critical fixes.

It would be interesting to instrument sbt so that it logs every time a user needs gets a compilation error that is fixed with a clean compile. It would be cool if it could suck up all state and ship the before and after state on an opt in basis. Maybe then zinc could collect enough data to fix these bugs.

https://github.com/sbt/zinc/pull/585

The goal is to make bug reporting in Zinc effortless, and that's already merged in master.

I don't think this has so far been a priority for the zinc/sbt team.

My priority has been to solve as many correctness issues as possible. But, generally, no, getting a fully reproducible zinc has not been the priority, because the users of tools like Bazel and Pants are negligible compared to the majority of users we support.

Alternatively, they could build a test harness that applies diffs from github and then sees if each clean compile matches the incremental compile. If zinc built such a test harness and had evidence that things worked, I'd be thrilled to reevaluate.

I think I already registered this idea in the Zinc issue tracker somewhere. It's not possible to carry it out until we have a fully reproducible compiler, which will happen with 2.12.7 or 2.13.0.


I'm unsubscribing from this thread. If somebody wants to talk to me about Zinc, drop by the sbt/zinc gitter channel or talk to me in private.