proposal: avoid_library_cycles

jakemac53 commented 4 weeks ago

avoid_library_cycles

Description

Library cycles should be avoided, because they can negatively affect hot reload and other tooling.

Details

It is challenging for tools to deal efficiently with large library cycles, and in many cases it means that any change to any library in the cycle will have to recompile (or re-link at least) every single library in the cycle.

Breaking up your libraries into smaller parts, or sometimes separating interfaces from implementations, can enable you to break these library cycles, leading to a better development experience.

Kind

Encourages good organization and loose coupling between libraries.

Bad Examples

a.dart:

import 'b.dart' as b;  // lint, avoid_library_cycles

final variableThatUsesB = '${b.variableInB}'; 

final someOtherVariable = 'hello';

b.dart:

import 'a.dart' as a; // lint, avoid_library_cycles

final String variableInB = '${a.someOtherVariable}';

Good Examples

We can break the cycle, by moving someOtherVariable into a new library, c.dart:

a.dart:

import 'b.dart' as b;

final variableThatUsesB = '${b.variableInB}';

b.dart:

import 'c.dart' as c;

final String variableInB = '${c.someOtherVariable}';

c.dart:

final someOtherVariable = 'hello';

Discussion

I analyzed several apps from itsallwidgets.com, under the suspicion that many apps devolved into a state where they would have a single large library cycle. The result is here https://gist.github.com/jakemac53/6ca8d58723f243926c1494323c91e664 (keys are cycle size, values are the number of library cycles of that size).

You can see that for these example apps, while some are well structured, it was indeed common to have at least one large library cycle, mixed with a bunch of single library cycles. In the worst case, the large cycle was 268 libraries, over 80% of the app.

For the macros feature in particular, we are struggling with a way to deal efficiently with these large library cycles, and currently must re-run all macros across the entire cycle whenever any edit is made.

Discussion checklist

[ ] List any existing rules this proposal modifies, complements, overlaps or conflicts with.
[ ] List any relevant issues (reported here, the SDK Tracker, or elsewhere).
[ ] If there's any prior art (e.g., in other linters), please add references here.
[ ] If this proposal corresponds to Effective Dart or Flutter Style Guide advice, please call it out. (If there isn't any corresponding advice, should there be?)
[ ] If this proposal is motivated by real-world examples, please provide as many details as you can. Demonstrating potential impact is especially valuable.

jakemac53 commented 4 weeks ago

We could also make this something like avoid_large_library_cycles, where it only triggers if the cycle is greater than some number of libraries, but it is hard to know where to draw the line.

Another idea is to have a lint which triggers on library cycles which are not in the same exact directory as the current library, but does allow cycles with sibling libraries, just not across different directories.

jakemac53 commented 4 weeks ago

We may also want a quick fix associated with this lint, or possibly several quick fixes.

The easiest one to write would be a quick fix that adds a more specific import, if that would break the cycle. So if you are importing a "barrel library" (one that just exports a bunch of other libraries), and importing the more specific one would break the cycle, then we should just suggest importing the more specific one.

scheglov commented 4 weeks ago

I wonder if the size of the cycle should be configurable. For example, I'm looking currently at a tiny cycle:

[3] {package:analyzer/dart/analysis/features.dart, package:analyzer/src/dart/analysis/experiments.dart, package:analyzer/src/dart/analysis/experiments_impl.dart}

But I might have to use a cycle of two libraries anyway. We need interface FeatureSet, and its implementation ExperimentStatus. And FeatureSet has convenience factories to create ExperimentStatus instances.

I know that we don't have configuration for lints yet, but if it allows something useful, we can add it.

bwilkerson commented 4 weeks ago

I know that we don't have configuration for lints yet, but if it allows something useful, we can add it.

We do not, and we have actively worked against having configurable lints because of our experience (or more relevantly our user's experience) on a previous project that allowed such configurability.

bwilkerson commented 4 weeks ago

I'm not convinced that a lint is the best way to solve this particular problem, but I'm not opposed to exploring the possibilities.

On which library / libraries would the lint be reported when there's a cycle? The general guideline is that there should be a single diagnostic for a single problem, but in this case it isn't clear where to report it. For recursive references (such as between constants) we report it on all of the references, but those are usually fairly small cycles. If we have 268 diagnostics I think that's unreasonable.

jakemac53 commented 4 weeks ago

On which library / libraries would the lint be reported when there's a cycle?

On every single import which is a part of the cycle.

This means that you can ignore the imports for cycles that are intentional, but you also can't accidentally make the cycle larger without realizing it - you would have to add more ignores.

It also gives you the most immediate/actionable feedback possible when you have introduced a cycle.

jakemac53 commented 4 weeks ago

I'm not convinced that a lint is the best way to solve this particular problem

I think there are advantages to it being a lint:

it is opt in (we likely wouldn't add this to core/recommended, at least not for a while, it would be very noisy).
we have established conventions for ignoring lints, so you can incrementally migrate/fix the issue.
while additional/separate tooling would likely be useful, the lint does directly point you to the problem imports, in a way that is manageable. I think it is a good first step to beginning to address this issue.

If we have 268 diagnostics I think that's unreasonable.

This would only happen when first opting in to the lint, or if adding a particular import caused a new cycle it would scale based on the size of that cycle. This also has the nice benefit of once your project is clean, adding a new import which introduces a cycle immediately shows you all the libraries involved in that cycle, and the specific imports (edges) which are problematic.

scheglov commented 4 weeks ago

What makes this library to be a part of a library cycle? Everything! :-D

bwilkerson commented 4 weeks ago

while additional/separate tooling would likely be useful, the lint does directly point you to the problem imports, in a way that is manageable.

That's where we differ. I don't think that the information is manageable. Nor am I convinced that it's actionable in this form.

And it seems to me that that argument implies that a different tool wouldn't be able to point a user to the problem imports, which isn't true.

This would only happen when first opting in to the lint, ...

If I enabled this lint and was presented with 268 diagnostics, I'd probably decide that I couldn't deal with that size of a problem and I'd disable the lint thinking that maybe I could look at it later, and then never get around to looking at it.

If I had a tool that would help me figure out how to reduce the size of the cycle, even if it didn't get it down to an acceptable size, then I'd look at the output to try to decide whether the approach worked for my needs.

... or if adding a particular import caused a new cycle it would scale based on the size of that cycle.

Not if part of the new cycle was previously a cycle of its own and I'd already added ignore comments for it, then I'd only see part of the cycle and couldn't get any view of the whole cycle.

This also has the nice benefit of once your project is clean, adding a new import which introduces a cycle immediately shows you all the libraries involved in that cycle, and the specific imports (edges) which are problematic.

Again, not if some of the imports involved in that cycle were previously ignored.

I don't know whether we have time to build a better tool, but I'm not at all convinced that a lint would be better than nothing.

jakemac53 commented 4 weeks ago

That's where we differ. I don't think that the information is manageable. Nor am I convinced that it's actionable in this form.

It isn't very manageable when you first enable the lint, I agree. It just tells you there is a large problem.

However, once you have fixed up your project, it should be much more manageable. You can always get rid of all the new diagnostics by removing the single import you just added, even if it triggered a large cycle.

And it seems to me that that argument implies that a different tool wouldn't be able to point a user to the problem imports, which isn't true.

I agree that a different tool can do this, and likely could provide a better experience for the hard cases especially. However, I also don't think adding another tool into peoples standard workflow is something to be taken lightly. Making this a lint integrates it into the existing workflow, and we could still provide an extra tool to help with the harder cases.

Quick fixes could also be a very useful tool here, to the extent that we could implement them.

So, the point is really just to have the largest reach possible, and a lint seems to be a good way to achieve that.

If I enabled this lint and was presented with 268 diagnostics, I'd probably decide that I couldn't deal with that size of a problem and I'd disable the lint thinking that maybe I could look at it later, and then never get around to looking at it.

Sure, but it should also indicate to you the scope of the problem, such that you would hopefully try and actually fix it later on, maybe with a different tool.

Even if you just added ignores all over the place, knowing whenever you added a new import that furthered the issue might be useful.

Not if part of the new cycle was previously a cycle of its own and I'd already added ignore comments for it, then I'd only see part of the cycle and couldn't get any view of the whole cycle.

Sure, this comment assumed you had removed all your cycles. The lint message could potentially indicate the cycle size, although I don't know if that is an option for lints today (custom messages based on context).

I don't know whether we have time to build a better tool, but I'm not at all convinced that a lint would be better than nothing.

Why wouldn't a lint be better than nothing here? At a minimum, if somebody complained about long hot reload times as an example, we could point them at this lint to see the scale of the problem.

But I think there are definitely a lot of projects which would enable the lint too.

lrhn commented 3 weeks ago

Library cycles are not a problem, and they happen all the time (split a class into an interface with a forwarding constructor in one file, and the implementation in another, and you have a size-2 cycle). Linting against that would not be helpful. I'd never enable a lint that triggers on library cycles of size 2, that's just too annoying.

If large library cycles are the problem, we'll need to decide "how large is too large". That starts to feel arbitrary. (Will we update the limit if our tools get faster? Or slower?)

This feels like making it a user problem that we can't build efficient tools. (Because a solution to warning about library cycles is to move everything into one library. That won't make anything better for the tools, because the problem is not the number of libraries, but the number of potentially mutually dependent declarations.)

jakemac53 commented 3 weeks ago

I agree small library cycles are generally fine. However, you will notice that in the apps I looked at they are not actually all that common. Most libraries are either not in a cycle or in a quite large cycle (50+ libraries).

My hypothesis is that as applications grow, cycles tend to merge with other cycles (any time a cycle is introduced between two libraries, it is actually merging their two cycles). This inevitably devolves into a bunch of libraries with very minimal deps which manage to remain outside any cycles, and then a single large cycle containing the rest of the app.

Also yes, it wouldn't be any better to jam all the files into one library (and quite possibly it would be worse, at least parsing etc of other libraries in a cycle should be trivial to re-use when those files are not edited). But users should understand the issue better in this case. If you create a single gigantic library it isn't surprising if it takes a while to compile that library.

When developing Dart apps, you never get any indication about cycles, and so you never realize when this happens, but your dev experience continues to degrade, because every hot reload or re-analysis has to do a lot of extra work.

bwilkerson commented 3 weeks ago

When developing Dart apps, you never get any indication about cycles ...

In case it wasn't clear, I completely agree that this is a problem and that it would be good if someone (us or the community) could provide some tooling to improve the situation. (In a previous life I worked on a tool for exactly this problem that displayed the graph structure in an attempt to help users understand where the cycles were. It wasn't as good as I would have liked, but it did have some nice properties.)

And while a lint would alert users to the presence of a possible problem, it won't help users figure out (a) whether it's actually a problem or (b) how to fix the problem if one exists (hence my claim that the lint violations aren't actionable). It's also not helpful when a non-problem cycle gets larger for legitimate reasons.

I agree small library cycles are generally fine.

I think that some large library cycles are also fine (probably not as many as we're seeing in your data, but some). Sometimes it really is the case that you want a large strongly connected graph of objects, even though that results in a large library cycle.

dart-lang / linter