fsprojects / fantomas

FSharp source code formatter
https://fsprojects.github.io/fantomas
Other
772 stars 194 forks source link

Add FormatASTRangeAsync to API. #454

Closed jindraivanek closed 4 years ago

jindraivanek commented 5 years ago

https://twitter.com/k_cieslak/status/1156219821387931650

Suggested type: ast:ParsedInput * fileName:string * selection:range * source:SourceOrigin * config:FormatConfig * projectOptions:FSharpParsingOptions * checker:FSharpChecker -> Async<string>

Why? Parsing source code to AST is expensive, we can avoid it by sharing AST.

I think we can do it by use formatWith here https://github.com/fsprojects/fantomas/blob/trivia/src/Fantomas/CodeFormatterImpl.fs#L594 with AST node covering selection range.

nojaf commented 5 years ago

Ok, should we not also capture the original source for trivia parsing? And have a way to do what we now do without parsing additional AST?

jindraivanek commented 5 years ago

This contains also original source code. Idea of this method is "format this selection of this code and here you have its AST so you don't have to create it".

nojaf commented 5 years ago

My bad, didn't see the source:SourceOrigin part 😋

nojaf commented 5 years ago

Also, projectOptions:FSharpParsingOptions would then contain all used defines?

jindraivanek commented 5 years ago

Hmm, good question. When format is used FSharpParsingOptions.ConditionalCompilationDefines is ignored because we do multiple parsing for different defines. But AST is created for some specified defines.

So, I guess projectOptions must contains defines that was used when AST was created.

nojaf commented 5 years ago

My thoughts exactly, perhaps we should add the defines explicit as a parameter. Any thoughts @Krzysztof-Cieslak?

nojaf commented 5 years ago

In order for both Rider and FSAutoComplete to format the code I believe two key questions are relevant:

If the file does not have any dead code, we should be able to re-use the untyped AST and collect the trivia from the source text. When there is dead code, we will need multiple AST trees in order for the formatting to behave correctly.

At this point, I assume the editor knows that there is dead code or not and calls Fantomas one way or the other based on this information.

In the scenario where there is no dead code and we currently don't have any option to format a range based on AST. So the title of this topic still seems relevant.

Any thoughts or concerns at this point? @jindraivanek @auduchinok @Krzysztof-Cieslak

auduchinok commented 5 years ago

@nojaf No objections from me as everything is how we've discussed it. :)

jindraivanek commented 5 years ago

@nojaf Thanks for great summary.

I think question is if there is some notable perf win in re-use AST vs. re-use of Checker. Is this worth the effort?

nojaf commented 5 years ago

As I recall from our conversation with @Krzysztof-Cieslak , having the re-use of the Checker would be beneficial. In the current (2.X) api when you are using the overload where no checker is passed on to Fantomas, it will use its own checker so I believe we want to avoid that. Unfortunately I can't say anything more relevant than "believe" here. No idea really.

As for the re-use of AST, are we certain that the AST is always in sync the latest keystrokes of the user?

jindraivanek commented 5 years ago

having the re-use of the Checker would be beneficial

Yes.

My question is if re-use of AST would be significantly more beneficial than re-use of Checker.

it will use its own checker so I believe we want to avoid that

Well, this has advantage when Fantomas is used as library, its checker is reused, so consecutive calls are faster.

As for the re-use of AST, are we certain that the AST is always in sync the latest keystrokes of the user?

That would be responsibility of user of this API.

Krzysztof-Cieslak commented 5 years ago

As for the re-use of AST, are we certain that the AST is always in sync the latest keystrokes of the user?

In case of FSAC, yes, for the (untyped) AST in current file we are always in sync with latest keystroke.

My question is if re-use of AST would be significantly more beneficial than re-use of Checker.

I think the value of reusing Checker is the fact that you can handle the logic regarding multiple-parsing yourself. As far as I remember you've mentioned you may need multiple AST from single file (in case of #if etc) - in FSAC we only track single AST (for current defines) for the file. It would be nice if we don't have to re implement this logic in every client ;-)

nojaf commented 5 years ago

I could be wrong but if we re-use the checker it might be ok to parse the AST again anyway as it looks like it is cached internally.

jindraivanek commented 5 years ago

re-use the checker it might be ok to parse the AST again anyway as it looks like it is cached internally

This is exactly where my question was aiming :) But maybe we want to do some benchmark to be sure.

nojaf commented 4 years ago

I think this can be closed now, we can always discuss if anything new should be added to the public API.