dotnet / roslyn

The Roslyn .NET compiler provides C# and Visual Basic languages with rich code analysis APIs.
https://docs.microsoft.com/dotnet/csharp/roslyn-sdk/
MIT License
19.11k stars 4.04k forks source link

Add an option to apply code formatting to collection initializers #8269

Open HellBrick opened 8 years ago

HellBrick commented 8 years ago

At this moment VS code formatter never touches the collection initializers. I would guess this is done to preserve a hypothetical manual item alignment, but as someone who never does that, I find it very annoying that I have to format the code inside the collection initializers manually. So I'd like to suggest adding an option to specify whether the formatter should format the collection initializers (just like there is Ignore spaces in declaration statements to allow/disallow manual declaration alignment).

GearTheWorld commented 3 years ago

@CyrusNajmabadi I understand 😢

ahmednfwela commented 3 years ago

little trick that I found when formatting dictionary initializers:

so if you have OCD like me and don't care about performance, you can do:

public static class DictExt
{
    public static T[] F<T>(this SortedDictionary<int, T> dict)
    {
        return dict.Values.ToArray();
    }
}

then where you want an array:

var arr = new SortedDictionary<int, int>()
{
    [0] = 5,
    [1] = 5,
}.F();

if you don't want to write indexes yourself:

int _ = 0;
var arr = new SortedDictionary<int, string>()
{
    [_++] = "First Item",
    [_++] = "Second Item",
}.F();
aaron-manning commented 2 years ago

The lack of a feature complete and opinionated formatter built into the language is, in my opinion, one of the biggest draws to other competing languages that do have this feature built in, like golang and rust (and there are many more that have excellent third party formatters, which don't seem to exist for .net).

After having experienced the tooling available for competing languages, it feels absolutely archaic to have to manually tab and fix code formatting in .net 6 Visual Studio 2022, VSCode etc.

Please can we get this feature, it's such a time saver in other languages.

justinmchase commented 2 years ago

Any work around available? I'm very tired of telling people to fix this in code reviews.

CyrusNajmabadi commented 2 years ago

Nope. No work around.

loraderon commented 2 years ago

Any work around available? I'm very tired of telling people to fix this in code reviews.

If you are willing to go the opinionated route (such as prettier) the only way is to use https://github.com/belav/csharpier

justinmchase commented 2 years ago

prettier

I will try using them in conjunction, it seems like it may work. That being said, it seems like this formatter should also cover all of these cases and be similarly opinionated.

Aeroverra commented 1 year ago

Also looking for this feature added. No support for redoing the entire format system though if thats the solution.

AgentFire commented 1 year ago

Also looking for this feature implemented.

Droni commented 1 year ago

Any news?

steve-jokes commented 1 year ago

It's year 4202 now, can't believe it's still a broken feature.

CyrusNajmabadi commented 1 year ago

FYI, in 17.9 we'll have refactorings you can use to quickly fixup an initializer (or collection expression), with a lot of available common-formats that people often do. For example:

image

image

image

image

SetTrend commented 1 year ago

Great!

Yet, unfortunately, it's missing the "leading comma" option:

{ new A { B = 1, C = 1}
, new A { B = 1, C = 1}
, new A { B = 1, C = 1}
, new A { B = 1, C = 1}
}
CyrusNajmabadi commented 1 year ago

Yet, unfortunately, it's missing the "leading comma" option:

Yes, our formatting options are not, and have never been, expected to cover every single permutation of style preferences someone has. If you need something extremely customized, that's what writing your own plugins is for. Our views have not changed here (including for any future expansions we have in this area, or in formatting in general). Thanks!

SetTrend commented 1 year ago

Yet, that's a common formatting, particular in SQL. Moreover, it's the most easy formatting for adding new elements to the collection using copy/paste. Thanks.

CyrusNajmabadi commented 1 year ago

@SetTrend we have never supported, in any construct, a formatting option for commas to work that way.

Note that idiomatic c# would just be to place the comma at the end of each initializer.

Moreover, it's the most easy formatting for adding new elements to the collection using copy/paste. Thanks.

Putting the comma at the end is by far easier :)

SetTrend commented 1 year ago

Putting the comma at the end is by far easier :)

On the contrary: Appending new items to the collection is very error prone and cumbersome with subsequent commas, particularly when copy pasting: You always need to manually add a comma to a middle element and remove the last pasted element's comma.

That's all not necessary when using leading comma formatting 😉

CyrusNajmabadi commented 1 year ago

You always need to manually add a comma to a middle element and remove the last pasted element's comma.

No you don't need to ever do that. :)

These constructs allow, and have always allowed, trailing commas. For this exact reason. :)

See the code in my screenshots:

image

the elements have trailing commas. So copy/paste is trivial. Leading commas are not necessary, desirable, or something we will invest formatting support into.

jeremy-visionaid commented 1 year ago

@SetTrend - There's no need to remove the trailing comma. There's an anaylzer to enforce its presence: https://github.com/DotNetAnalyzers/StyleCopAnalyzers/blob/master/documentation/SA1413.md

justinmchase commented 1 year ago

Yes, our formatting options are not, and have never been, expected to cover every single permutation of style preferences someone has.

Please stick with your guns on this one. It is the way. Make one style, the world would be a better place if everyone just had to use the same formatter. The code would be universally more readable in the end.

Also, regarding the trailing comma wars. Y'all should checkout Bicep it takes an even more interesting approach: no commas allowed, new-lines only.

It ends up being pretty nice and forces uniform set formatting and reduces code noise quite a bit.

GearTheWorld commented 1 year ago

I would see that a little bit like training an AI. You show how you want things done and then when you code, it does what you trained your VS to do. No ?

MisinformedDNA commented 1 year ago

I would see that a little bit like training an AI. You show how you want things done and then when you code, it does what you trained your VS to do. No ?

You could create an AI to do what you want, but that's not what this code is intended to do. This code is following explicitly defined (or default) rules.

GerardDR commented 10 months ago

Someone arbitrarily decided a few years ago on a particular formatting style and the rest of the world has to suck it up forever? Microsoft are notorious for going in one direction and suddenly changing course.

We're NOT just talking about initialisers - this applies to method parameters, to other languages, etc etc, (in Node a leading comma is part of the standard, in SQL it makes a whole lot of sense in Select A, B, C, D ... ).

It’s easy to miss a trailing comma which is often not even in sight (which sometimes will not throw up a compiler error and lead to unexpected results. Depending on the compiler to ignore extra commas at the end of of initialisers is at best a fudge! It's syntactically incorrect and mysteriously being killed during the compile). In C/C++/Pascal accidentally having a semicolon (unseen because it's at the end of a line, or missed because a series of commas are at various columns instead of standing tall at the start) could have disastrous consequences. There are times when a trailing comma is significant and should not be removed, eg when dealing with exports from/to CSV files, eg.:

    _int?[] mySizes = { 1, 2, 3, null };
    for( int i = 0; i < mySizes.Length; i++)
               Console.WriteLine( i +"-" + mySizes[i]);_

If I hadn't put the null in there the output would have been 3 lines instead of 4 because the compiler tried to second guess my intention.

So assuming trailing commas are NOT okay (which it's not for multi-param function calls among many things):

It’s harder to scroll to the end of the previous line, then add a comma there, then jump back to the start of the pervious line to add the next param.

Statistically params are copied/added/commented out way more often for parameters down the list rather than the very first one. Try adding // to temporarily comment out a param (other than the first one) if the comma is sitting at the end of the previous param.

FuncA( parameter1, parameter2, //parameter3 //-- compile error, unless you DELETE the comma on the previous line. )

Now copy and paste parameter2 below it - we have a trailing comma.

If this wasn’t a good idea why are the full stops in method chaining not at the end of the previous command:

var data = authorList .Where(a => a.Country == “USA”) .OrderBy(a => a.AuthodId) .ToList();

CyrusNajmabadi commented 10 months ago

Depending on the compiler to ignore extra commas at the end of of initialisers is at best a fudge! It's syntactically incorrect

This is not the case. As I mentioned a few months ago:

These constructs allow, and have always allowed, trailing commas. For this exact reason. :)

There is no fudging here. And it's very much syntactically correct and intentional.

SetTrend commented 10 months ago

There is no fudging here. And it's very much syntactically correct and intentional.

I might add: "… as a workaround for the issues @GerardDR described."

I cannot recall right now, but I believe to remember that it was not syntactically correct in C# back when initializers were introduced. The trailing comma formatting just lead to introducing this, well … fudge.

CyrusNajmabadi commented 10 months ago

But it's not a fudge or hack. The language was designed this way precisely for linear lists over multiple lines. :-)

GerardDR commented 9 months ago

Well, then fudge has taken on a new meaning - is the word itself being fudged to suit an argument? It's like "Here's something weird in general language constructs, a trailing comma, let's find a way to allow it. And let's not call it a fudge.

The language was not designed this way - it was modified in hindsight to deal with a problem that crept in, nothing to do with coding but everything to do with aesthetics. Surely the lesser and more fully-fledged workaround would have been for Microsoft to allow formatting in more flexible ways, and not touch the language.

More importantly, what happens for method parameters??

CyrusNajmabadi commented 9 months ago

The language was not designed this way

It literally was. I'm one of the language designers. As a very real, very current example, we designed collection expressions to allow for trailing commas for these exact reasons.

Surely the lesser and more fully-fledged workaround would have been for Microsoft to allow formatting in more flexible ways, and not touch the language

Definitely not. We'd only be able to fix up visual studio. But not any other tool. Whereas, by intentionally making the language work this way, one can take advantage of this in any tool.

Here's something weird in general language constructs,

It's not weird though. It's a normal language decision that has been commonplace in languages for literally more than 50+ years.

davidmilligan commented 1 month ago

This issue is currently the # 8 most upvoted open issue in this repo. sigh. Maybe one day, in a bright glorious future, my great grandchildren will get to experience what its like to have their collection initializers automatically indent themselves, or maybe they will just think their programs into existence.

I don't really understand how we get completely new language features like collection expressions, but simple quality of life things like this one just sit around for decades (this issue is itself only 8 years old, but the problem has been around much longer, I remember the days of VS 2010). Much of the discussion in this thread appears to be about how this feature would be so complex and hard to do, and potentially breaking things or causing regressions, hard to test, etc., and honestly I just don't buy it. There's just no way that adding or removing some whitespace is harder or carries more regression risk than implementing a completely new language feature like collection expressions. In my experience, the razor auto-formatter in VS gets broken in some new way just about every time I upgrade VS (btw, I'm not really complaining, just making a point). it can't be that you guys care so much about not breaking one formatter that you won't change one line of code for years, and at the same time break another one every release.

please, please, fix this, please

CyrusNajmabadi commented 1 month ago

I don't really understand how we get completely new language features like collection expressions

The language is unrelated to the tooling.

but simple quality of life things like this one just sit around for decades

Because the order of complexity and difficulty are in completely different realms. A new feature is much cheaper versus changing code that has the potential to break all our users everywhere.

and honestly I just don't buy it.

I don't know what to tell you then. If you're not willing ot listen to people who are explaining the issue, then there is no room for common understanding.

There's just no way that adding or removing some whitespace is harder or carries more regression risk

It carries a regression risk of teams now upgrading and finding all their projects are broken. It's even worse since it's common for people to run with a different set of tools/sdks on the machines/teams/company resources. For example, people run different versions of VS. People have different SDKs on different machines. CI machines often are different from dev boxes. These have to all be in agreement on formatting or projects completely break.

that you won't change one line of code for years,

It's not one line of code.

jez9999 commented 1 month ago

Because the order of complexity and difficulty are in completely different realms. A new feature is much cheaper versus changing code that has the potential to break all our users everywhere.

New features are cheap? Cool, can we get labeled loops soon then? 😄

AgentFire commented 1 month ago

It carries a regression risk of teams now upgrading and finding all their projects are broken. It's even worse since it's common for people to run with a different set of tools/sdks on the machines/teams/company resources. For example, people run different versions of VS. People have different SDKs on different machines. CI machines often are different from dev boxes. These have to all be in agreement on formatting or projects completely break.

I still don't quite get it. How removing or adding a few extra spaces in an area which is always being ignored by any compiler there is is going to break something?

What on Earth could possibly be dependent on a space characters count in a collection initializer code block, while at the same time being completely independent of the very same thing on any other type of code block? (Including all the new ones in each and every update).

Droni commented 1 month ago

I can give you one lifehack: add an option. The formatting engine has very complex functionality. And it is simply impossible to take into account all the wishes by default. Therefore, having a large number of options for fine-tuning is simply necessary. My former DevExpress CodeRush team has long solved this problem, as well as the Resharper team. I sincerely do not understand why you still can not solve it ...

aaron-manning commented 1 month ago

A fully featured inbuilt formatter would be far higher value to me than literally anything released by the team in .net 8 or upcoming .net 9.

I have to format code hundreds of times per day. Fixing formatting costs me considerable time and is distracting and frustrating. It also costs time during PR review because the formatter is functionally incomplete and there's no way to auto enforce a set style for code that gets ignored by the formatter.

It's almost 2025, we are about to have a ton of new features and performance enhancements released in .net 9 yet we still can't format our damn code properly.

I'm grateful of all the work by the team over the years, but this seems like an oversight that will bleed developers over time.

Dev experience needs to be priority number 1.

raffaeler commented 1 month ago

The problem comes from the fact that Visual Studio over the years implemented its own formatter without contributing back to this repo's formatter. But now even VS does not support all the cases and we still have to manually adjust the code. As a result, it's extremely difficult to fill the gap with a single PR (I contributed to the Roslyn formatter code in the past and it was extremely time-consuming).

CyrusNajmabadi commented 1 month ago

New features are cheap? Cool, can we get labeled loops soon then?

If the language design team thinks it would be good for the language to have then yes. I'm on that team, and currently i don't believe there is a view that this would be a beneficial change. We already have 'goto' and that's what 'goto' is for. So we think that's a suitable approach already.

CyrusNajmabadi commented 1 month ago

I still don't quite get it. How removing or adding a few extra spaces in an area which is always being ignored by any compiler there is is going to break something?

A change to the formatter now causes different SDKs to format the same code differently. This now breaks users on different machines as well as different CI systems.

CyrusNajmabadi commented 1 month ago

I can give you one lifehack: add an option. The formatting engine has very complex functionality.

Every option doubles the testing matrix and the number of cases that can now have regressions. Now you not only have to worry about regressions in the places without options, but also what regressions may happen in every permutation/combinatino of all those options with all combinations/permutations of code constructs used together.

CyrusNajmabadi commented 1 month ago

A fully featured inbuilt formatter would be far higher value to me than literally anything released by the team in .net 8 or upcoming .net 9.

Your feedback is heard. But we have to evaluate that against the feedback of thousands of companies, and millions of other devs. We get that some people really want this. But our efforts are focused on areas where there is much more need/interest from the ecosystem as a whole.

Dev experience needs to be priority number 1.

Everyone has a different view on what priority #1 is. We could focus on this, and tons of other devs would then say that something else is pri1 and we should be working on that.

For example, we get absolutely much more feedback on things like performance, source generators, and the like than we do here. Several orders of magnitude more feedback. Those people are similarly saying things like "performance needs to be priority number 1" etc. Ultimately, we have to put our efforts where we will benefit the most people, and that's simple not here.

CyrusNajmabadi commented 1 month ago

The problem comes from the fact that Visual Studio over the years implemented its own formatter without contributing back to this repo's formatter.

This statement makes no sense. VS does not have its own formatter. VS simply asks the language server for each language to format its langauge. For C#/VB that is roslyn. That formatter is open source and is contained in this repo.

As a result, it's extremely difficult to fill the gap with a single PR (I contributed to the Roslyn formatter code in the past and it was extremely time-consuming).

You can fill the gap with PRs to this repo. That will update hte language server that then ships in VS/VSCode and the .net sdk.

The problem is not there. The problem is that ensuring formatting changes do not cause any breaks is substantially difficult. And if we break anything it breaks everyone who uses the sdk and has any sort of code formatting enforcement on anywhere. Worse, it means you can now be in an inconsistent state where even being on a slightly minor different version of an SDK can leave you in an unpalatable position where you are getting different formatting errors. You fix on one machine, only to see it break on another.

This is not speculative. This has happened several times, even with changes people thought were safe and would not impact anyone.

And, once we ship behavior for anything, we can't change it. Which means that with any new option, we have to validate and write tests for every permutation of that option with every single other option and every language feature. And we must never break that from that point on. This causes the number of tests to blow up enormously as more and more options are added.

CyrusNajmabadi commented 1 month ago

My former DevExpress CodeRush team has long solved this problem

CodeRush does not ship in the .net sdk. And when it gets things wrong, people's builds aren't broken.

aaron-manning commented 1 month ago

For example, we get absolutely much more feedback on things like performance, source generators, and the like than we do here. Several orders of magnitude more feedback. Those people are similarly saying things like "performance needs to be priority number 1" etc. Ultimately, we have to put our efforts where we will benefit the most people, and that's simple not here.

A complete formatting solution is surely becoming something that is expected of every programming language considered "modern" nowadays. It is something that affects literally every developer, from novice to pro, we all need to format our code, and the more code you write the more formatting needs to be done.

Performance issues are domain specific (and overall perf is already excellent), source generators are currently niche.

In my opinion, the fact that the formatter is not feature complete (because it doesn't handle all the language constructs) is effectively a bug. A bug that has existed for a very long time.

An inbuilt formatter is a feature expected to work without caveat, in the same way that the compiler is expected to compile our code without caveat (given correct code). When a new language construct is added , it's expected that the formatter also formats it.

It doesn't make sense to me to be spending years adding new features and improving performance when there is a bug like this in the core feature set of .net that affects literally everyone.

I understand it may be extremely complex to solve and that spending time on other requests may be more appealing. I don't envy being presented with this issue and I appreciate all the work everyone here does.

Thanks

raffaeler commented 1 month ago

The problem comes from the fact that Visual Studio over the years implemented its own formatter without contributing back to this repo's formatter.

This statement makes no sense. VS does not have its own formatter. VS simply asks the language server for each language to format its langauge. For C#/VB that is roslyn. That formatter is open source and is contained in this repo.

Did this change over time? I remember me and you having a conversation about this to format auto properties and you told me that VS used a different code to format the sources. Also, i am still unable to format the code as VS do (which still does not cover all the cases).

The problem is not there. [...] I am aware of the difficulties in maintaining this code and totally share your points about the fences that must be put to keep the codebase consistent and stable. On the other side I see:

  • The more than obvious need to decently format the code (C# code generators pushed this need exponentially)
  • Objective difficulties in putting our hands in the formatter as the codebase it's definitely not trivial. Maybe the problem should be addressed differently, by providing some low level infrastructure that makes the formatter easier to fix over C# versions (this is just a thought).
CyrusNajmabadi commented 1 month ago

Did this change over time?

No. It has always been this way.

I remember me and you having a conversation about this to format auto properties and you told me that VS used a different code to format the sources. Also, i am still unable to format the code as VS do (which still does not cover all the cases).

I don't know what you're referring to. The c# formatter exists in the Formatter class and is available for to you format with. This is the Formatter used by our language server in VS.

CyrusNajmabadi commented 1 month ago

Performance issues are domain specific (and overall perf is already excellent),

It may be excellent for you. But you don't speak for our community. We agree information from an enormous number of sources, including thousands of companies an many more developers. Perf is absolutely not excellent, and we have the data demonstrating this.

Formatting may be more important to your than performance. But it's not for all the groups saying the opposite. And that's just life.

source generators are currently niche.

Currently not. We have far more adoption of SGs than expected and bugs and issues there are causing much more pain and feedback across our ecosystem.

Furthermore, we have to measure impact. When an SG has a problem, it can be crippling for a business. Save when perf is bad and you can't even use your tools. This is nothing remotely like "an initializer is not getting formatted". In that case, you just edit it how you like. Sure, you don't want to. But your business and devs don't grind to halt over the issue.

Again, I get that you believe formatting sound be more important than these things. We do not. That is unlikely to change.

If formatting is this important to you though, we welcome PRs.

CyrusNajmabadi commented 1 month ago

In my opinion, the fact that the formatter is not feature complete (because it doesn't handle all the language constructs) is effectively a bug

Your opinion is valid. But it doesn't change things here. A bug isn't fixed because it is old. A bug is fixed because the value to the ecosystem warrants the effort involved to fix it, and especially that the "fix" won't make things worse (for example through regressions). :-)

As you can see from the repo itself, we have around 3500 spread over all of Roslyn. Some of them much older than this one. We will take community contributions on all of them. However, most of them are in our backlog and are unlikely to be prioritized unless something changes with the market and ecosystem.

CyrusNajmabadi commented 1 month ago

the same way that the compiler is expected to compile our code without caveat

Interestingly, we have exactly those caveats. We have cases of correct code that does not compile (but is infeasible to fix). We also have cases of incorrect code compiling a certain way. We also can't fix that as people have now taken dependencies on that.

Compiler team is in a very similar position here where they will almost never take breaking changes. We only give ourselves a short window (a couple of months) after a compiler release to make those changes (and only if really really worth it) because we don't want to break people. After that point, the bar goes up ludicrously high, and we've basically only done it once or twice.

I get this is a requirement for you. It is not for us. This is a feature request to support more constructs. Currently, the cost of doing that is too high and risky to warrant doing. Even if it was considered a bug, those costs would be the same and we would be in the same position.

If you are interested in contributing, let us know. Thanks

It doesn't make sense to me to be spending years adding new features and improving performance when there is a bug like this in the core feature set

That's how triage and prioritization works on every project. We have limited resources and thousands of buys and other issues that could take attention. We determine which are the best use of our time for three ecosystem and we with accordingly. This necessarily means that some issues will not make the cut. That's life.

Again, this is open source, and we take contributions. If this isn't meeting our bar, but it is so critical for you, consider contributing.

. I don't envy being presented with this issue and I appreciate all the work everyone here does.

Fortunately, you're not dependent on this only being solved by the team. I get that this is critical in your mind. We'd welcome your efforts if you then wanted to present this issue to yourself to address:-)

AgentFire commented 1 month ago

Fortunately, you're not dependent on this only being solved by the team. I get that this is critical in your mind. We'd welcome your efforts if you then wanted to present this issue to yourself to address:-)

As far as I could tell, your take on this issue can be summarized as two key points:

  1. The resolution of the issue will bring you low value
  2. The resolution of the issue will cause breaks on users' ends.

I, as an abstract guest to this repository, may indeed have my own interest in diving into the sources and maybe producing a PR for the issue.

I even find it encouraging that you mention it. However, at the same time, it's confusing that you also mention that formatting collection initializers will break things on user's end. This naturally rises the question of whether it would worth any effort put in fixing the issue, when the dev team could just discard it as "a good fix, but we explicitly planned on not fixing it"?

SetTrend commented 1 month ago

@CyrusNajmabadi:

You fix on one machine, only to see it break on another.

I can very much follow you on that. However, how do you think about improvements being halted only because legacy code may break? We are using strong versioning in all our applications. If a new version is breaking old code, why is it so hard to say: "Stick with the old version, and your program will still run as-is, or migrate your program to the new situation". Azure and Microsoft 365, for example, are doing things like this all the time.

Again, this is open source, and we take contributions. If this isn't meeting our bar, but it is so critical for you, consider contributing.

Do you also hire? I don't think users can always be expected to spend their free time on amending third-party products.

CyrusNajmabadi commented 1 month ago

However, how do you think about improvements being halted only because legacy code may break?

Not doing things because of risk and impact on users is a normal and expected part of development. We have made compatibility commitments, and we need to honor those.

why is it so hard to say: "Stick with the old version, and your program will still run as-is, or migrate your program to the new situation"

Because that's worse for our ecosystem. We don't want roadblockers. Most people would not want to run into blocks on upgrade because of formatting issues. We know this. Because when we've broken formatting we hear about it from tons of people who are now super unhappy that their builds are not working and things are breaking inconsistently across their org. It is the much smaller group of people that are actually so unhappy that they are ok with potentially breaking others and making them feel pain vs living with the existing experience.

I don't think users can always be expected to spend their free time on amending third-party products.

I didn't say users were expected to do this. I said this was an option available if one was interested.

As we have made very clear, or own assessment is that this is on the backlog. So if you do want it addressed, the best path forward is if it is picked up by the community.