Support iterative fixes in the LSP "Fix All" command

DanTup commented 2 years ago

The dart fix can apply multiple rounds of fixes, but the equiv LSP command (which is wired up to "Auto Fix" in VS Code, which some users like to run on-save) does not.

The dart fix command works by building overlays from the first set of changes and then running the fixes again. This will be more complicated inside the server run by the IDE as we can't modify the overlays without potentially introducing issues (even if they are applied and reverted, because the fixes are async, there's no guarantee the editor won't be sending requests that will have "misunderstood" offsets during that time).

It may be possible to lay another OverlayResourceProvider over the existing one to do this, but will need some testing - see discussion at https://github.com/dart-lang/sdk/issues/47886#issuecomment-991215561.

DanTup commented 2 years ago

@bwilkerson this was the issue mentioned earlier. Based on the text above, it seems like the issue was that dart fix was modifying the overlays (which is fine for it because they're not also being used by an editor). Your linked comments suggest we may be able to wrap another temporary overlay provider over that one, though I haven't tested that yet. I'll make a note to do so soon, as this might actually be a fairly simple fix if that works (or can be made to).

bwilkerson commented 2 years ago

@pq for visibility

DanTup commented 1 year ago

@bwilkerson I started taking a look at this, but hit a snag. The way this currently works in dart fix is:

compute some edits
apply edits to overlay
reanalyze / compute more edits

If we collect all of the edits here, we don't end up with a single change that makes the necessary changes, but rather a set of changes that need applying sequentially to arrive at the expected output. For example in the case of prefer_final_locals and prefer_const_declarations and code var a = 'test'; we end up with:

A change that says "replace var with final"
A change that says "replace final with const"

These changes conflict, and can only be applied sequentially. In order to apply them in a single operation in VS Code (such that undo buffer works as expected), we need a single set of atomic edits (at least, that's how it seems.. though I'll see if there's a way around this.. I don't see why in theory VS Code couldn't apply some edits sequentially within a single undo record).

I think some of the issues (that is, edits from second round of edits offsets are wrong because they assumed the previous round of edits have already been made) might be fixable by using a shared ChangeBuilder, however I don't think conflicting edits (as occur in the example above) would work. (edit: I don't think this is the right route, because a ChangeBuilder is based from a single document and we're producing edits based on amended documents for each iteration)... In theory I think we should be able to merge the changes, but I'm not sure if that would be a simple task.

bwilkerson commented 1 year ago

I think that merging the edits would be fairly difficult and hence error prone, but I haven't tried, so it might be more doable than I'm thinking it would be.

The only solution that comes to mind is to make all the edits as per dart fix, then return an edit to replace the entire content of the initial version with the entire content of the final version. We could potentially write a diff algorithm that would narrow it down, but I'm not convinced that it would be worth it.

bwilkerson commented 1 year ago

I also have some concerns around performance. If we're iterating then we're re-analyzing the file on each iteration, which might take longer than we want to spend before returning an edit that might or might not be applied. (Unless I'm misremembering and 'Fix All' is actually a command rather than a fix.)

DanTup commented 1 year ago

I think of those options, a full edit and diff is probably better Not doing the diff could be a problem though (which is why we do something similar for formatting - although it's not a real diff because we know only whitespace changes and can just walk two token streams together). A diff seems like the sort of thing there should be well-tested algorithms for we might be able to easily implement in Dart (if there isn't one already in the SDK somewhere 😄).

I also have some concerns around performance.

I was curious about perf too (I noticed the dart fix command on the CLI wasn't terribly fast when testing some of the fixes above on flutter_gallery), but didn't get as far as being able to measure it.

It is a command though, so we're not paying any cost until the user invokes it (we just return a static CodeAction referencing the command from the /codeActions request):

https://github.com/dart-lang/sdk/blob/15c8dad750f277753be7024cff536eb1a423a320/pkg/analysis_server/lib/src/lsp/handlers/code_actions/dart.dart#L302-L307

We can show progress notifications while it's running though (as we do for refactorings because the first invokcation of them can sometimes be a bit slow).

bwilkerson commented 1 year ago

... if there isn't one already in the SDK somewhere ...

There might be one, but I wasn't able to find it when we were implementing dart fix. We considered having a flag to display a diff of the files rather than just a "we fixed 3 things in this file" message, but it looked like it would be too much work. If you find a diff algorithm, we might reconsider.

It is a command though ...

If it's a command, then I'm not concerned about the performance (the user will know what's happening).

DanTup commented 1 year ago

I also couldn't find one in the SDK, although I did find this:

https://github.com/google/diff-match-patch/tree/master/dart

The repo hasn't been updated in a few years, but with some tweaks for null-safety the code (and tests) work in current Dart:

https://github.com/DanTup/diff-match-patch/tree/dart-null-safe

Although it doesn't seem like the project is maintained any more, so if we used this (it's Apache licence and Google-created, so I presume that would be possible) and there are any issues we'd have to fix them. Reducing it to only the useful diff-producing code gives ~1k lines:

https://gist.github.com/DanTup/a212ed0f02376322b1486af72f2fc52f

This is larger than I was expecting a diff to be, although it seems to describe some reasonable-sounding optimisations (I don't know if this is a common algorithm or something developed by that project though). Any thoughts?

bwilkerson commented 1 year ago

I'm not going to address the question of whether we can use the code. My guess is yes, but I'd want someone else to make that call.

But if we can use some or all of that code, then it probably isn't going to be any bigger than the code we'd end up writing ourselves. Meyer's algorithm is definitely one of the best. The most recent committer is still working here, so we can also reach out if we have any additional questions.

DanTup commented 1 year ago

@bwilkerson if we can get confirmation that it can be used, do you think that's the most sensible path forward? (further up you said you're not sure it's worth it, though I don't know if there are other options besides not supporting this - which might not be a problem, I guess it depends on the number of fixes may require multiple iterations.. I only know of the one given as an example above).

There may be some other benefits to having something like this - --dry-run could be updated to support a diff as you noted above, and we could also perhaps remove the custom handling for the formatter that does this by re-parsing both sets of code (although we might want to compare performance - I suspect since it's much simpler it may also be quicker).

bwilkerson commented 1 year ago

Sorry, but I don't have a definitive answer for you at this point.

... I guess it depends on the number of fixes may require multiple iterations.

There are a couple of others (I don't remember which ones at the moment), but not many.

It would be consistent with our current direction for the fix for prefer_final_locals to check to see whether prefer_const_declarations is enabled and applicable, and if so to create a single edit that gets us to the final state in one operation. On the other hand, I think I might have expressed concerns about the combinatorial explosion of checking that could happen if we go too far down this path; I have mixed feelings about taking this kind of cross-checking too far.

DanTup commented 1 year ago

I've been thinking about this again recently (as I'd really like to fix this, and look at the fix-all-in-workspace with refactor preview).

One of the ideas above was to apply the changes recursively and then compute a diff. However, one of the goals with the refactor preview was to have the edits grouped under their lints/fixes, so they could more easily be reviewed/toggled. This API requires each edit has a text description for the change (for it to be grouped under) which we might not be able to provide (or at least it could be complex) if we extract diffs from content that's already had the changes applied.

So it may be better to choose between:

merging edits (like you, I think this would be difficult, but I haven't tried it yet)
try to avoid these overlapping edits and just produce one set as noted in your last comment above (and like you, I'd have concerns around the possible explosion of these - but I don't know how common it'll be for these interactions/overlaps)

Another option is to just do nothing and not support this, or support it for common cases (like prefer final/const) but not worry about covering them all (this means in some cases if the user runs the fix a second time they might see some additional fixes, but maybe only for much less common cases)?

bwilkerson commented 1 year ago

@keertip Who's also interested in this use case.

bwilkerson commented 1 year ago

However, one of the goals with the refactor preview was to have the edits grouped under their lints/fixes ...

I can't see anyway to do that unless we either

disallow overlapping edits and make the user run fix multiple times, or
make the client merge the edits.

Neither sounds like an appealing option.

DanTup commented 1 year ago

disallow overlapping edits and make the user run fix multiple times, or

But if we combined this with fixing up the common fixes that we know overlap (as you suggested further up with final/const), we might be able to make this a much less common case?

Another option is that we just ditch the grouping and populate them all with a generic title - especially given VS Code doesn't show the title or grouping by default. The grouping would be nice though.

DanTup commented 1 year ago

I was thinking more about this, and whether we could merge the edits (rather than applying them and trying to build new edits from a complete diff as I was looking at with a full diff algorithm above). One thought I had to do this was to expand any overlapping edits to be the same range, then just replacing the first's replacement text with the replacement text from the second.

Expanding the edits such that these match up doesn't seem like it should be too complicated (we can expand an edit in either direction by just moving the start/end of the replacement range and adding the extra characters from the previous text into the replacement text). Inserts and deletes are just replacements with a zero-length range/text so I think they'd expand in the same way.

Once we have that new edit, we could also then minimize it by taking characters back off the start/end where they match the original source.

@bwilkerson WDYT? I couldn't find any existing algorithms that seem to do this (although it seems difficult to search for) but I think there only a limited number of combinations of edits we'd need to merge.

bwilkerson commented 1 year ago

The biggest problem that I'm aware of is that, on each pass, the ranges are relative to a different version of the file than the ranges for any other pass, so I think we need to have some way of mapping ranges from one pass back to the ranges that existed prior to the original pass. (Or maybe it's easier to go the other direction and then map all of the merged edits back to the original offsets in a single final pass?) Once the ranges have been normalized, then I think the merge you described makes sense.

DanTup commented 1 year ago

so I think we need to have some way of mapping ranges from one pass back to the ranges that existed prior to the original pass

For the non-overlapping edits, I think this is fairly straight-forward (compute a delta of the earlier edits replacementRange.length vs replacementText.length and offset the later edits by it). I'm fairly sure we already have to handle this case (because server produces sequential edits and LSP does not).

For the overlapped edits, I don't think it's possible to do this up-front. A second edit might be modifying code that didn't exist at the start, so there's no offset in the original document that represents it. But I think we can just handle this as part of normalising the edits to the same range.

I haven't tried coding it yet, but here's what I had in mind...

Given firstEdit and secondEdit there are only three possibilities for the start offset:

They are the same - nothing to do
firstEdit starts before secondEdit
- Reduce offset of secondEdit to match firstEdit
- For however many characters the offset was reduced by, read that many characters from firstEdit's replacementText and prepend it to secondEdit's replacementText. If firstEdit's replacementText didn't have enough characters, the additional characters would come from the original document starting at the end of firstEdit's replacementRange.
firstEdit starts after secondEdit
- Reduce offset of firstEdit to match secondEdit
- Prepend the substring from the new offset to the old offset from the original source into firstEdit's replacementText

Both edits should now have the same offset and applying them sequentially should have the same result. I think we can do something similar for the end. We can expand firstEdit by increasing the replacementRange and appending the additional characters from the source file, and we can expand secondEdit by doing the same (but taking from firstEdit's replacementText first).

It feels sound in my head, but I haven't tried coding it yet. It's possible there are cases I haven't considered or the logic doesn't work. I'll try coding it up soon and see where I get (though if you see any obvious flaws in the meantime, let me know 😄).

bwilkerson commented 1 year ago

I think the logic works if there's exactly two edits to merge, but doesn't work when there are more edits. I'll give you an example below that should explain why I think it doesn't work, but it's possible that your algorithm works for reasons I'm not seeing.

Given firstEdit and secondEdit there are only three possibilities for the start offset:

They are the same - nothing to do

If the firstEdit was produced by a pass before the one that produced the secondEdit, then the offsets might not be comparable. For example, given in put of abcdefghijklmnopqrstuvwxyz, if the first edit inserted 012 at offset 3 and replace 4 characters starting at offset 7 with 34, the result would be abc012defg34lmnopqrstuvwxyz if the second pass inserted 5 at offset 7, then the result would be abc012d5efg34lmnopqrstuvwxyz. Note that the offset of 7 for the first edit is not related to the offset 7 from the second edit.

DanTup commented 1 year ago

Thanks! I've looked at this a few times today, and it's definitely not as simple as I thought... For a single pass, server always produces edits that can be applied sequentially (they go from highest offset to lowest), but with multiple passes we can end up with the offsets having a mix of orders, so some assumptions that I (and the LSP code) made doesn't work.

I'm wondering if something like this might work:

Walk backwards through the edits
For each edit, walk backwards further looking for the nearest edit that intersects with it (computing an offset delta on the way)
Perform the steps above on those two edits, eliminating one of them (expanding edits will require having the computed source for related edit from start to end)
Repeat until there are no more intersecting edits

I'm not completely certain it'd work and it'd definitely be more complex. I wonder if it's worth trying though, as it feels like it still might be the closest we have to a solution that doesn't compromise too much on supporting things like the grouped refactor preview? 🤔

bwilkerson commented 1 year ago

That might be worth trying. I don't have a case that proves that that approach wouldn't work, but I'm somewhat skeptical because the offsets of the edits are still relative to different versions of the files and my intuition is that that's the critical piece that needs to be solved. But maybe the offset delta you mentioned takes care of that in a way that I'm not currently understanding.

Point of clarification: I think the first step is to walk backwards through the current pass' edits, and the second step is to walk backward through the accumulated edits from previous passes. Otherwise I'd be concerned that it would be possible to incorrectly handle some of the edits in terms of which pass they're from.

DanTup commented 1 year ago

I failed at trying to do the above. Tracking the offsets/deltas while working backwards got very complicated.

However, I think I may have a new idea 😄

First step is to go through the edits and re-order them where possible, so get them back into highest-offset-first (as each pass would have them). This is done like insertion sort (we work through them, then try to move each one as far as back as we can), but when we move items, we compensate for the delta of the previous item that's no longer being applied before it. If two edits touch or intersect (such that the swap would be invalid), we do not swap.

Now we have a list of edits in reverse order (as we want) except for those problematic overlapping/touching edits. But those edits are now adjacent, so we can use the original idea from above to merge them.

I have a basic version of this working for the cases I've tried, I'm just trying to get it into the server codebase so I can write tests and ensure all cases are covered.

I don't know how efficient it'll be (O(n^2)?), but I do expect our lists to be mostly sorted, and in the case where the second pass makes no changes we don't need to do any of this - and that's true for each individual file.. if only a single file has edits from two passes, only that files edits need merging. If I can confirm it all works and we can get a good set of tests, we can measure and if necessary, try to optimise.

bwilkerson commented 1 year ago

Sounds promising. Eager to hear how it goes.

DanTup commented 1 year ago

I've pushed what I've got so far here:

https://dart-review.googlesource.com/c/sdk/+/304180

I need to add more tests (deletions, multiple files) before it's ready but since I'm done for the day I thought it was worth pushing in case you want to have a scan of the general implementation (or think of any additional cases to test that I can work on tomorrow). Not urgent though :-)

DanTup commented 1 year ago

Copying my comment from https://dart-review.googlesource.com/c/sdk/+/304340 because it's easier for me to add screenshots (and find discussion later). @bwilkerson @scheglov

I have a concern around performance of this. Because we create new contexts for this and re-analyze, it takes longer even in the case where there are 0 or 1 rounds of changes.

I tested with a random file in Flutter (lib/src/material/scaffold.dart) by introducing code that required several passes of fixes.

Before change:

Pass with no fixes: 66ms
Manual pass 1: (var -> type): 330ms
Manual pass 2: (add final): 248ms
Manual pass 3: (change final to const): 217ms

After change:

Pass with no fixes: 500ms
Single-phase pass: 850ms (this is a pass, a re-analyze, then another pass to ensure there wasn't a second round of changes)
Three-change pass: (this is four passes, since the last one doesn't produce any additional changes): 1.2s

When running on save, this causes VS Code to show a progress notification that it's waiting for this operation to complete before it saves. Going from 66ms to 500ms when there are zero fixes feels quite bad.

We could potentially delay creating the new contexts/overlay provider until we know the first pass produced edits, but the case where there are any edits at all is still 850ms and that's probably not a terribly uncommon case.

Here's some timings of the slow case (which was actually 1.6s this time):

(name: request, count: 1, elapsed: 0:00:01.600182, elapsedSelf: 0:00:00.233026)
  (name: IterativeBulkFixProcessor.fixErrorsForFile, count: 1, elapsed: 0:00:01.365175, elapsedSelf: 0:00:00.006339)
    (name: create context collection, count: 1, elapsed: 0:00:00.034009, elapsedSelf: 0:00:00.034009)
    (name: BulkFixProcessor.fixErrorsForFile 0, count: 1, elapsed: 0:00:00.860647, elapsedSelf: 0:00:00.103095)
      // lots of time here
      (name: getResolvedLibrary, count: 1, elapsed: 0:00:00.757552, elapsedSelf: 0:00:00.757552)
    (name: apply edits 0, count: 1, elapsed: 0:00:00.008639, elapsedSelf: 0:00:00.008639)
    (name: BulkFixProcessor.fixErrorsForFile 1, count: 1, elapsed: 0:00:00.178934, elapsedSelf: 0:00:00.078159)
      (name: getResolvedLibrary, count: 1, elapsed: 0:00:00.100775, elapsedSelf: 0:00:00.100775)
    (name: apply edits 1, count: 1, elapsed: 0:00:00.003440, elapsedSelf: 0:00:00.003440)
    (name: BulkFixProcessor.fixErrorsForFile 2, count: 1, elapsed: 0:00:00.173591, elapsedSelf: 0:00:00.076563)
      (name: getResolvedLibrary, count: 1, elapsed: 0:00:00.097028, elapsedSelf: 0:00:00.097028)
    (name: apply edits 2, count: 1, elapsed: 0:00:00.003125, elapsedSelf: 0:00:00.003125)
    (name: BulkFixProcessor.fixErrorsForFile 3, count: 1, elapsed: 0:00:00.096451, elapsedSelf: 0:00:00.000180)
      (name: getResolvedLibrary, count: 1, elapsed: 0:00:00.096271, elapsedSelf: 0:00:00.096271)
  (name: merge changes, count: 1, elapsed: 0:00:00.001981, elapsedSelf: 0:00:00.001981)

The slowest part (perhaps unsurprisingly) is the first time we fetch the resolved library (triggering analysis) in the new context. This makes me think it's mostly due to recomputing things that already exist in the original context (so if we could reuse more, it might be significantly faster).

Screenshot 2023-05-24 at 15 06 07

DanTup commented 1 year ago

Since the CPU profile wasn't particularly useful to me, I applied my rigged version of OperationPerformanceImpl that writes DevTools timeline events and it looks like this:

Screenshot 2023-05-24 at 16 37 33

computeLibraryCycle in LibraryContext.load seems to take the bulk of the time that's specific to the first analysis in this context (I don't know if that's something that could be reused?).

scheglov commented 1 year ago

I suspect, although the profile should show actual data, that the most time that it spent in computeLibraryCycle is because we build FileStates for transitive libraries. I think that we probably don't link libraries, because they should be already available in the shared ByteStore. We probably cannot reuse FileStates from a different AnalysisContextCollection, especially because it has different ResourceProvider.

DanTup commented 1 year ago

although the profile should show actual data, that the most time that it spent in computeLibraryCycle is because we build FileStates for transitive libraries.

I looked through what computeLibraryCycle was calling, but couldn't see where the FileStates are created. If you can give me a pointer I can make it show up on the timeline or re-run the profiler to see if it shows up.

We probably cannot reuse FileStates from a different AnalysisContextCollection, especially because it has different ResourceProvider.

I'm not familiar with these parts so this might be a silly question, but since our new ResourceProvider is just overlays, would it be possible re-use the state for files that don't exist in it (and are being provided by the underlying resource provider, which is the same)?

Although, even if we completely eliminate that block of code, I still wonder whether the multiple analysis passes will still leave us slower than we'd like to be for something running on-save? 🤔

scheglov commented 1 year ago

although the profile should show actual data, that the most time that it spent in computeLibraryCycle is because we build FileStates for transitive libraries.

I looked through what computeLibraryCycle was calling, but couldn't see where the FileStates are created. If you can give me a pointer I can make it show up on the timeline or re-run the profiler to see if it shows up.

Are we talking about this line? https://github.com/dart-lang/sdk/blob/305fa3d77c534eff0de9def487b184ebf8608b7b/pkg/analyzer/lib/src/dart/analysis/library_context.dart#L245 It invokes https://github.com/dart-lang/sdk/blob/305fa3d77c534eff0de9def487b184ebf8608b7b/pkg/analyzer/lib/src/dart/analysis/library_graph.dart#L16 And it will ask libraryImports here https://github.com/dart-lang/sdk/blob/305fa3d77c534eff0de9def487b184ebf8608b7b/pkg/analyzer/lib/src/dart/analysis/library_graph.dart#L142

This will cause building FileStates for imported libraries, which will become new nodes in the graph, for which we will ask their own dependencies, etc, recursively.

We probably cannot reuse FileStates from a different AnalysisContextCollection, especially because it has different ResourceProvider.

I'm not familiar with these parts so this might be a silly question, but since our new ResourceProvider is just overlays, would it be possible re-use the state for files that don't exist in it (and are being provided by the underlying resource provider, which is the same)?

No, I don't think this will work. AnalysisContextCollection uses a single ResourceProvider, in order to reuse anything, it has to be the same one.

So, maybe instead we should use the AnalysisContextCollection instance that already exists in DAS, and update the existing OverlayResourceProvider on which it is based. We will have to remember original content for files that we are going to update temporarily, and revert them back. We need to make sure that there is no race condition though, e.g. no new overlays asynchronously come from the client while we are running this command.

Although, even if we completely eliminate that block of code, I still wonder whether the multiple analysis passes will still leave us slower than we'd like to be for something running on-save? 🤔

Doing any significant work on save is a bad idea, IMHO.

DanTup commented 1 year ago

And it will ask libraryImports here

Ah, that's what I was missing. I apparently clicked through to import.importedLibrary but ignored container.libraryImports. I'll check this out..

We will have to remember original content for files that we are going to update temporarily, and revert them back.

Hmm, this is an interesting idea. I was trying to avoid it because of async stuff, but perhaps we can use the existing mechanism for locking requests. I'll do some testing, thanks!

Doing any significant work on save is a bad idea, IMHO.

To clarify, this is if the user opts-in to wanting to automatically fix on-save (though we're discussing adding this to the "recommend settings" - but it definitely can't take this long!).

DanTup commented 1 year ago

Hmm, this is an interesting idea. I was trying to avoid it because of async stuff, but perhaps we can use the existing mechanism for locking requests. I'll do some testing, thanks!

Actually, it's not as simple as that. We need to also ensure while we're modifying these overlays we're not firing off events such as new diagnostics, closing labels, etc.

So it would probably end up being something like:

Accept Fix All request
Lock the request queue so no more requests are processed
Ensure all in-flight requests are completed (since they might have been doing async work)
Ensure any in-progress analysis has completed
Prevent the server from sending any events that are usually triggered from analysis (diagnostics, closing labels, Flutter outline)
Perform the iterations of fix all, modifying overlays as they go
Reset all overlays
Wait for in-progress analysis to complete
Re-enable "events" from the server
Unlock the request queue

We'd also need to ensure however this is done, it's not brittle and can't easily accidentally be broken in the future (such as newly-added "events" being sent).

That said, I don't need to implement all of this to measure the change in performance - I can just ensure I don't violate any of those things manually and capture some numbers.

DanTup commented 1 year ago

@scheglov you were right about the time being spent in libraryImports/libraryExports above.

So I changed the code to just manipulate the original overlays from the server (ignoring the other complexity for now). It's faster, but it's still quite slow.

One thing I noticed that might be duplicated work but I'm not sure, is that after we apply the edits to the overlays it triggers "computeAnalysisResults" (because the server thinks a file was just modified and needs to produce diagnostics). This holds up the next iteration a little, which is trying to get a resolved library. When computing analysis results completes, it continues, and then resolves the library.

I'm not too familiar with all the steps inside computeAnalysisResult and getResolvedLibrary but I assumed that computing the analysis result would have to resolve the library, so I'm wondering whether getResolvedLibrary should/could be faster here (because it could be reusing what was just computed)?

Screenshot 2023-05-25 at 15 12 47

scheglov commented 1 year ago

Hard to say for sure. If you mark a file as changed, the driver will clear cached results, and so all resolved units will be recomputed. It should not matter whether you ask for a resolved unit, or a library, some time ago I implemented caching the whole library.

DanTup commented 1 year ago

@scheglov I am marking the file as changed, but only before both of the analysis that occurs on each pass.

I made a small test that I think reproduces the same which may make it clearer. It initialises the server and opens a file (which makes it priority). The file is modified (which triggers analysis) and then afterwards getResolvedLibrary is called. The call to getResolvedLibrary appears to result in another call through LibraryAnalyzer.analyze which I thought would be skipped (because some cache could be used).

I'm not sure exactly which caches exist (except priorityResults) though, so I'm not sure at which point we would expect it to reuse previous results (I couldn't see any obvious cache that was checked between getResolvedLibrary and LibraryAnalyzer.analyze).

https://github.com/DanTup/sdk/commit/dc512f7793ff0e978b747ea5c77669006173f4f7

  @soloTest
  Future<void> test_danny() async {
    instrumentationService.logInfo('Initializing...');
    await initialize();
    await openFile(mainFileUri, 'class A {}'); // Sets priority.
    await pumpEventQueue(times: 5000);

    // Replace file, which will trigger analysis
    instrumentationService.logInfo('Modifying priority file...');
    await replaceFile(1, mainFileUri, 'class B {}');
    await pumpEventQueue(times: 5000);

    // Ask for resolved library which has just been analyzed. Should it be
    // triggering `LibraryAnalyzer.analyze` again?
    instrumentationService.logInfo('Asking for resolved library...');
    await server.getResolvedLibrary(mainFilePath);

    // log output:
    // info: Initializing...
    // info: LibraryAnalyzer.analyze() for /home/my_project/lib/file.dart
    // info: LibraryAnalyzer.analyze() for /home/my_project/lib/main.dart
    // info: Modifying priority file...
    // info: LibraryAnalyzer.analyze() for /home/my_project/lib/main.dart
    // info: Asking for resolved library...
    // info: LibraryAnalyzer.analyze() for /home/my_project/lib/main.dart
    print(instrumentationService.log.join('\n'));
  }

scheglov commented 1 year ago

Ah, right, getResolvedLibrary() results are not cached. This is easier to implement than caching for individual files. I will do it.

scheglov commented 1 year ago

https://dart-review.googlesource.com/c/sdk/+/306302

DanTup commented 1 year ago

@scheglov thanks! I'm still seeing duplicate calls with this change though. It looks like caching is done in _computeResolvedLibrary but I think my first analysis is happening inside _computeAnalysisResult (which also uses LibraryAnalyzer) but I think that it's caching any results.

It looks like _computeAnalysisResult doesn't create a ResolvedLibraryResultImpl so I'm not sure whether it has all the information to cache it?

I tested both my original case (the Fix All command I'm trying to improve) and the test in https://github.com/DanTup/sdk/commit/dc512f7793ff0e978b747ea5c77669006173f4f7 (with your change cherry-picked into my local code), but both appear to be the same.

Is _computeAnalysisResult computing the full resolved library, and if so is it also able to add to the cache?

DanTup commented 1 year ago

I have a possible fix for the above here: https://dart-review.googlesource.com/c/sdk/+/306620

With that, the duplicate analysis is gone and things are much faster at around half a second (for 3 passes). And when there are no fixes to apply, we're down to less than 1ms (because no new analysis takes place, the result from the analysis when the file was last modified is used from the cache).

I haven't implemented locking/reverting of the overlays yet so that might still change things, so I'll sort that and then do some more thorough profiling/timing and w can decide whether it's good enough.

Screenshot 2023-05-31 at 11 52 54

DanTup commented 1 year ago

I've pushed changes to https://dart-review.googlesource.com/c/sdk/+/304340 that now modify the overlays in the original session and revert them. It also locks the request queue and suppresses any notifications from analysis results (diagnostics, closing labels, outlines etc.) so the client doesn't get invalid results while we're temporarily using the overlays.

Using that change and with https://dart-review.googlesource.com/c/sdk/+/306620 cherry-picked in, these are the timings I see when I have pkg/flutter open and I'm applying multiple passes of fixes.

The test code/changes I'm testing with are:

0 round of fixes: const int a = 1;
1 round of fixes: int a = 1; -> const int a = 1;
2 round of fixes: var a = 1; -> const int a = 1;

There is always one more "pass" of looking for fixes than there are actual fixes because we loop until we find no more fixes (up to a max of 4).

I've recorded the time from when the request starts being handled to the point where the edit is about to be sent to the client (which includes having reverted the overlays). I didn't measure the entire executeCommand call because that request awaits the client applying the edits locally, which triggers another overlay modification. Essentially these measurements are the time from the server starting the request to a fraction before the user would see the edits applied visibly in the document.

0 fixes (1 pass): 972us, 687us, 724us
1 set of fixes (2 passes): 243ms, 227ms, 237ms
2 sets of fixes (3 passes): 428ms, 365ms, 361ms

I'm definitely much happier that the request is <1ms in the case where there are no fixes - although it relies on https://dart-review.googlesource.com/c/sdk/+/306620 (so if we can't land it as-is, I hope we can do something similar).

scheglov commented 1 year ago

I slightly reworked your CL, see https://dart-review.googlesource.com/c/sdk/+/306660

scheglov commented 1 year ago

Are these timings for a multiple fixes applied to a single file? How big is the file? Or the whole Flutter repository?

DanTup commented 1 year ago

Whoops, I was supposed to have included that in my comment above. I had only packages/flutter open and was applying fixes inside scaffold.dart.

Right now, we only support Fix All for single files. I do want to expand this to have a "Fix All in Workspace" in future, and when we do we'll need to ensure this still performs well (I think we would be able to apply each round of fixes to the whole workspace, so there would still be only 1-4 passes, but in each pass there could be many more files modified).

DanTup commented 1 year ago

With all the latest changes noted above and in https://dart-review.googlesource.com/c/sdk/+/304340, I'm seeing around 350ms now for the slower case (2 rounds of fixes which involves 3 passes), down from 450ms above (I suspect much of this is attributable to temporarily removing addedFiles).

On my Windows PC it's a little faster at around 250ms.

dart-lang / sdk

Support iterative fixes in the LSP "Fix All" command #47968