Formatting rule for columnar formatting

kuhlenh commented 6 years ago

Support columnar alignment on assignments and properties/members, for example:

There is a VS extension that provides some of this support but I think this could be an editor option as well as an .editorconfig setting.

I've heard this several times from users conferences and from Twitter.

CyrusNajmabadi commented 6 years ago

Wouldn't it be better to just support "elastic tabs"? i.e.: http://nickgravgaard.com/elastic-tabstops/

This could be something provided at the editor level for all languages, not just C#/VB.

sharwell commented 6 years ago

Design Meeting Notes Following a design review, we decided to initially pursue this indirectly by investigating a new API that allows the formatting operations in the IDE to be replaced by an extension. The primary concerns with direct approaches were the following:

Column formatting is highly subjective, and options face a last mile scaling problem.
The current formatting API is already causing a layering problem in our attempts to provide formatting validation during compilation (#28873). Exposing a new API at a lower layer should be able to accommodate both internal and external formatting feature requests.
Formatter implementations should be able to operate as an adjustment to the default behavior by calling FormatAsync either before or after the logic to modify the format.

Since this API is relevant to work I'm already in the middle of, I'll create a new issue with a proposal for further discussion.

CyrusNajmabadi commented 6 years ago

@sharwell I started looking into this. But i need some help from @dotnet/roslyn-ide or @heejaechang . Basically, i could not figure out where in the stack of the formatting engine either:

explicit code to handle alignment would fit.
an extensibility point could be added that would run at teh appropriate time to be able to handle this sort of thing.

I'd be happy to work on this and contribute accordingly. But right now, i def need some guidance on where to even begin given the currently layering of the formatting engine.

CyrusNajmabadi commented 6 years ago

Basically, my hope had been that i could effectively register syntax-kind (of syntax-node) actions, that could then process the tree at some point (similar to how we register actions for analyzers). However, this didn't really work well because different actions needs to see the results of other formatting steps.

Some features seemed to need to be bottom up. Others top-down. And, running formatters over and over again on teh same tree seemed like a recipe to tank performance.

It did feel like the most appropriate thing was to actually just create a new concept in formatting of an alignment-group. i.e. a set of entities that would be aligned on some construct (like a common token). An analyzer could come in and say that all these should align based on whatever rules it wanted (i.e. as long as there were no blank lines between anything). The formatter could then just do its walk, and appropriately use that alignment metadata, at the appropriate point, to fixup all the trivia.

Thoughts?

heejaechang commented 6 years ago

in the design meeting, we choose to go after different route than expanding our formatting engine to support this since this is just one of thing people want us to support but there are many others people want for their liking.

we will pursue a way to plug in their own formatter on IDE which will run when we used to run. they will be able to just delegate work to ours if they see they don't care the specific formatting request. or they can intercept and they do their own. or they can mix.

...

for this particular case, they can create aligner which greps our formatting results, and finds a certain group of constructs such as local/field/property decls, and align those.

heejaechang commented 6 years ago

if we do this ourselves, doing it 1 pass will be too hard. I would just make it 2 passes where first pass does the formatting it does now, and then second pass, finding those groups, and find base position to align tokens, and have adjust space operations to move base token to right place, and have alignment operations to align all tokens to the base token.

that should get us to the result mentioned above without changing engine to make it done in 1 pass.

CyrusNajmabadi commented 6 years ago

for this particular case, they can create aligner which greps our formatting results, and finds a certain group of constructs such as local/field/property decls, and align those.

What do you mean by 'greps our formatting results'?

heejaechang commented 6 years ago

so, the brainstorm was, we provide a way for custom formatter to get results of previous formatters (let's say we let them to chain formatter), the result could be just a new tree with span or something.

and work on it.

formatter is just special form of code transformation. so at the end, everything is just code transformation with certain restrictions.

what we brainstormed is basically piping code transformers with restrictions that formatter has (whitespace only)

if we ever merge code cleanup with formatter then, everything could just be code clean up chained together.

CyrusNajmabadi commented 6 years ago

and find base position to align tokens, and have adjust space operations to move base token to right place, and have alignment operations to align all tokens to the base token.

Can you explain a bit more what you mean by this?

CyrusNajmabadi commented 6 years ago

so, the brainstorm was, we provide a way for custom formatter to get results of previous formatters

What 'previous formatters' are you referring to here? The existing operations we already apply? Or do you mean the outputs of other 3rd party formatting extensions?

heejaechang commented 6 years ago

think about code cleanup, whatever result of all previous code cleanup made on the tree given.

CyrusNajmabadi commented 6 years ago

This is a bit on the vague side :) i'm trying to get a more concrete idea of what i would be doing. i.e. where in teh existing formatting stack you envision this, and what sort of shape you would want this to have. Once we have that nailed down, we can consider if such an approach would be sufficient for effectively providing a 'column-alignment' formatting extension. Thanks!

CyrusNajmabadi commented 6 years ago

I don't know what this means: "whatever result of all previous code cleanup made on the tree given."

Do you mean hte syntax-tree itself? the new root? the set of edits? the document? Can you dive in a little deeper to get a bit of clarity here? Thanks!

heejaechang commented 6 years ago

okay. let me try to tell you what we talked. again, I am not trying to fight or argue with you. I am really trying to tell you what we talked in the meeting. if the way I say is hard for you to follow, let me tag @sharwell to give you clear idea in a way you can understand.

okay, so we talked about not just this particular bug, but how we can open up formatting to users so that they can plug in formatters that can do formatting however way they would like to.

but we didn't want people to choose ours or theirs. or just one specific formatter.

so the idea was making formatter like code clean up + command handler. people can register as many formatters as they want to. and each formatter will be able to see changes previous formatter made.

just think as our command handler Execute returning tree. and changes are accumulated as what code cleanup does.

I think this is as clearly as I can explain without writing several pages design doc with all details by the way, there is no details set in stone. this was a rough idea and goal is opening up formatting in IDE. and sub goal is not forcing users to completely rewrite whole new formatter just because they want 1 extra thing.

CyrusNajmabadi commented 6 years ago

, I am not trying to fight or argue with you

I didn't htink you were :) I was just trying to get clarity as this is something i'd be interested in working on. However, i just don't understand enough yet to really know how to go about doing that :)

heejaechang commented 6 years ago

so, about using existing formatter to support this one particular case.

this is on top of my head so might not actually work. but i think it probably will work.

first pass should be just formatting things as it is doing now. that will make all tokens to be in its final location.

after than, second pass has 2 sub passes first sub pass will examine span to format to find those alignment groups and pick 1 token (such as biggest column among the group) as base token and other tokens in the group that should be aligned to the base token. second sub pass will provide new formatting rule to formatter and call format. this new formatting rule will do this

GetAdjustNewLinesOperation will return right away.

GetAdjustSpacesOperation will return right away except (previous token of baseToken, baseToken). for the token pairs, it should return AdjustSpacesOperation with space that one wants between them. since all tokens are on its final locations except alignment, one should be able to calculate how many spaces it wants for base token to go to the column it wants. probably with ForceSpacesIfOnSingleLine. if two are not on single line, meaning base token is first token on the line, then, one probably doesn't need this alignment?

anyway, after that, make AddAlignTokensOperation to add AlignTokensOperation with BaseToken above and all other tokens to align to base token with this option - lignIndentationOfTokensToBaseToken

I think this should make the alignment asked to happen with existing system.

but, before actually try, not 100% sure.

heejaechang commented 6 years ago

However, i just don't understand enough yet to really know how to go about doing that :)

if you can tell me what you understood or imagined, then I can say whether you understood or not and figure out what part I need to explain again.

I am not sure what I can do with just I still don't have clear idea or understand.

otherwise, it will be me just repeating myself without going anywhere.

CyrusNajmabadi commented 6 years ago

if you can tell me what you understood or imagined, then I can say whether you understood or not and figure out what part I need to explain again.

A major question i have is:

do we consider the extension points as just transforming a tree-to-a-tree? or
do we consider them as providing things like 'spacing operations' or 'alignment operations'

For either of those. Do we think that we would basically run each extension in order (with some sort of ordering across all of them)?

heejaechang commented 6 years ago

my understanding is we do (1) rather than exposing formatting engine's detail. and yes. in a sequence the same way we do for command handler so that one can decide whether they will intercept (they completely override) or delegate (call the next one in the chain and return result as it is) or mix (call the next one and change the result they returned). we might also support changing tree and then call next one with the tree already changed?.

but again, it is just a rough idea. end goal is so that people can do something like the above without re-implementing whole formatter.

or people can chain formatter that do very specific things like aligning, break lines based on column, put all top level members decl (namespace, class, and etc) on column 0 and etc.

we had more requests but can't remember them all. but all these different styles people want we don't want to provide due to options we had to have to control all these different styles, now people should be able to have that if they want to buy installing those third party formatter.

but those formatter author shouldn't need to care about every detail formatter should do but only ones they care. and can be composed.

CyrusNajmabadi commented 6 years ago

Ok thanks! I'll see what i can whip up. I think it will be necessary then to have a way for people to export formatters which can then be named and ordered. That way we can have a reliable way of pulling these in and executing them in a deterministic order once our actual formatting is complete.

Formatter would then basically get the tree and be allowed to party on it. I'm not sure if we would have some way of ensuring that they only touched whitespace.

--

Note: part of this seems unfortunate. It seems like the formatting engine already has the concept of htings like the AlignTokensOperation. It feels (though i'm not certain if i'm right) that this column-alignment could sit on top of this operation easily. But it would now be up to the extension to have to basically implment all the logic to make AlignTokensOperation work. Is that correct?

heejaechang commented 6 years ago

so, current formatter can align tokens. what it can't do right now is 2 things

it doesn't know which token should be base token other tokens must be aligned to
it doesn't know what tokens go into this "other tokens"

the AlignTokensOperation we currently has was introduced with parameters alignment in mind, not the one we have on this issue.

so, base token and other tokens were assumed to be quite straight forward for rule provider to provide. but this one require some analysis to figure out what is base token is and where that needs to go.

we can always add that support in some where. either engine itself or formatting service which doing 2 pass. but that will be only this case, but not all other cases such as breaking line based on column.

once we have such composible system, we could add aligner ourselves with current format engine with different rule I brainstormed above using AlignTokensOperation.

sharwell commented 5 years ago

Marking as Blocked on #31691

dotnet / roslyn

Formatting rule for columnar formatting #28729