dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.86k stars 4.62k forks source link

[API Proposal]: make interface of Array and List classes more uniform #65677

Closed andrei-faber closed 2 years ago

andrei-faber commented 2 years ago

Background and motivation

At the current moment, there are many methods in Array and List that perform same tasks but have to be called in different ways. For instance, Sort() and Reverse() are static methods in Array and instance methods in List. i.e.

Array.Sort(array);

vs

list.Sort();

API Proposal

namespace System
{
    public abstract partial class Array : ICloneable, IList, IStructuralComparable, IStructuralEquatable
    {
        public static void Sort<T>(T[] array)
        {
        }
    }

    public static class ArrayExt
    {
        public static void Sort<T>(this T[] array)
        {
            Array.Sort(array);
        }
    }
}

API Usage

// old usage
Array.Sort(array);
// new usage
array.Sort();

Alternative Designs

No response

Risks

I don't see any

ghost commented 2 years ago

Tagging subscribers to this area: @dotnet/area-system-collections See info in area-owners.md if you want to be subscribed.

Issue Details
### Background and motivation At the current moment, there are many methods in Array and List that perform same tasks but have to be called in different ways. For instance, Sort() and Reverse() are static methods in Array and instance methods in List. i.e. ```C# Array.Sort(array); ``` vs ```C# list.Sort(); ``` ### API Proposal ```C# namespace System { public abstract partial class Array : ICloneable, IList, IStructuralComparable, IStructuralEquatable { public static void Sort(T[] array) { } } public static class ArrayExt { public static void Sort(this T[] array) { Array.Sort(array); } } } ``` ### API Usage ```C# // old usage Array.Sort(array); ``` ```C# // new usage array.Sort(); ``` ### Alternative Designs _No response_ ### Risks I don't see any
Author: andrei-faber
Assignees: -
Labels: `api-suggestion`, `area-System.Collections`, `untriaged`
Milestone: -
Symbai commented 2 years ago

I don't see any

I do, top voted question on stackoverflow will be Whats the different between .Sort() and Array.Sort(). Because if a user searches the internet on how to sort an array, all articles will say Array.Sort(). Also adding an extension which only saves a few chars to write seems a bit odd.

andrei-faber commented 2 years ago

@Symbai and they still would be able to use that method, so I don't see any problem here. Better to mark it obsolete, though. It's not about saving chars, it's about uniform interfaces.

Symbai commented 2 years ago

and they still would be able to use that method, so I don't see any problem here

I never said they would not be able. I said that its not clear to new users which one they should use and what's the difference between them.

Better to mark it obsolete

That would indeed no longer cause confusion. But the chance that the .NET team will do that is zero. Because this exist for so long time, it would break an insane amount of applications (those which no longer build on these warnings). And the only argument you have is to unify things. That doesn't seem valuable enough, at least to me although my opinion is not that important.

It's not about saving chars

In your proposal the extension method is just a call to Array.Sort(). So in the end you say typing Array (5 chars) and that's it.

andrei-faber commented 2 years ago

@Symbai

I said that its not clear to new users which one they should use and what's the difference between them.

Everything is unclear to new users. They need to learn, that's the only answer.

Because this exist for so long time

I can't agree that it's a good reason to not fix problems.

those which no longer build on these warnings

That's why fail-on-warning is used, in the first place. To remove obsolete methods, for instance.

And the only argument you have is to unify things.

Writing quality code is a very good reason.

AraHaan commented 2 years ago

I would rather have Array inherit from IArrayList and have that interface define Sort, Reverse() etc that are similar to each other and have them both "implement" it.

Also I think the interface can contain the actual implementation if I remember right.

NN--- commented 2 years ago

Alternatively today you can solve it using external library such as CodeJam.ArrayExtensions.Sort

eiriktsarpalis commented 2 years ago

I can't agree that it's a good reason to not fix problems. Writing quality code is a very good reason.

I'm curious to understand why you think it's a problem, or why using a static sort method impacts code quality.

andrei-faber commented 2 years ago

@eiriktsarpalis same tasks should be done in same way in types with similar functionality.

eiriktsarpalis commented 2 years ago

Honestly I don't think consistency between two methods would justify the cost of having two ways of doing the same thing for arrays and/or obsoleting a widely used method. I also don't think the inconsistency is a problem: doing a search for "C# sort array" immediately reveals the answer.

andrei-faber commented 2 years ago

@eiriktsarpalis normally, you don't do a search how to do something. You just remember it. Looks like you prioritize experience for students rather than for professional users.

eiriktsarpalis commented 2 years ago

normally, you don't do a search how to do something. You just remember it. Looks like you prioritize experience for students rather than for professional users.

I'm not sure what that is supposed to mean. Either you look something up or you remember it from a previous time that you looked it up. How does your proposal change that dynamic? For the record, I consider myself to be professional yet I still need to look up most of the APIs I'm using.

andrei-faber commented 2 years ago

@eiriktsarpalis when same things are done in different ways for these two classes, you have twice as many things to remember, or you have to do extra guessing/searching work when the approach that you're trying doesn't work for another class. Unnecessary waste of time and brainpower.

Also, let's consider learning curve for the students. First they learn how to do sorting for arrays (and remember it, if they're good students). Then they try to apply the same approach to lists and it doesn't work, so they have to search for it again and remember this different approach too. Again, unnecessary waste of time and brainpower.

theodorzoulias commented 2 years ago

First they learn how to do sorting for arrays (and remember it, if they're good students). Then they try to apply the same approach to lists and it doesn't work.

@andrei-faber so you say that the student will search for a non-generic List class having a static Sort method, and they will be surprised that there is no such class in .NET?

List.Sort(list);

Why aren't you suggesting this API then, and instead you're suggesting an extension method for T[]?

andrei-faber commented 2 years ago

@theodorzoulias I say that they will search for a static Sort in the generic List class, and won't find any. Pretty much nobody uses the non-generic ArrayList, and that's a good thing.

Why aren't you suggesting this API then - why would you want to change the new code to conform to the old and badly-styled one? List<T> was created later than arrays.

theodorzoulias commented 2 years ago

@andrei-faber I am not talking about the ArrayList class, but about a hypothetical static List class that would contain static methods for List<T> instances. Much like the Array class. Btw the Array class contains 32 static methods for array instances. It's not only the Sort, it's also the Clear, Copy, Exists, Find etc. Do you believe that all these methods should be obsoleted and replaced by extension methods in the ArrayExt class?

why would you want to change the new code to conform to the old and badly-styled one?

If you think that the Array.Sort(array) is an example of badly-styled code, why would you expose the students initially to such code? First teach them about how to sort lists, and introduce them later to the concept of arrays. Preferably after warning them that they are going to see a style of code that is old and deprecated, but it's still in use today for historical reasons.

andrei-faber commented 2 years ago

@theodorzoulias I don't know how you got 32, because I get more when I count all the overloads and less when I don't. But yes, that would be useful.

First teach them about how to sort lists, and introduce them later to the concept of arrays. Preferably after warning them that they are going to see a style of code that is old and deprecated, but it's still in use today for historical reasons.

My point still stands in this case. One way or another, it's unnecessary waste of time and brainpower.

theodorzoulias commented 2 years ago

@andrei-faber 32 are the methods that are shown in the left column in the docs, in the TreeView under the "Methods" node. After closer inspection not all of them are static methods. There are 22 static methods and 10 instance methods:

Static

Instance

From my experience none of these methods are used very often by application developers, because arrays are not flexible data structures, and for general use lists are more convenient. I am not sure that I can quantify precisely the brainpower required in order to learn and remember that arrays in .NET are equipped mainly with static instead of instance methods, but I guess that the number of neurons and synapses involved should be pretty small. This number could be even zero for those developers who never had to deal with arrays in their programming career, because the List<T> data structure covers all their needs.

AraHaan commented 2 years ago

Agreed, I never stumbled over a case where I could use an array over an more flexible list, especially if I wanted the ability to be able to pop individual items inside of such. Let's use byte for example, usually I use List<byte> for when:

Overall even for me List<> or any other type of collection would work better for me than normal arrays as well.

andrei-faber commented 2 years ago

@theodorzoulias From my experience none of these methods are used very often by application developers - in this case it's even easier to fix them.

I guess that the number of neurons and synapses involved should be pretty small

A bad programmer John made an error in the code, because of which each user had to spend at least 15 minutes to look for the workaround of this problem. The number of users was 10 million people. Which means, that 150 million minutes were wasted, which equals 2.5 million hours. If a person sleeps 8 hours a day, then he has just 16 hours of conscious activities. So, John destroyed 156250 man-days, which is about 427.8 man-years. The average persons lives 64 years, which means that John killed about 6.68 people. Do you sleep well, John - a serial programmer?

theodorzoulias commented 2 years ago

@andrei-faber the damage caused by John, who is mostly innocent by the way, could be prevented if there was an analyzer available that could recognize the wrong pattern in his code, and alert him by issuing a compile-time warning. Unfortunately this analyzer was not available at the time, because Microsoft had prioritized retrofixing some rarely used array APIs. Those changes were deemed important for mostly aesthetical reasons. So what killed those 6.68 virtual people was the decision to prioritize those changes, over implementing the analyzer that would prevent the world-wide damaging error to occur.

My point is that whenever you choose to do something, you lose the opportunity to do something else. So you don't just have to prove that changing the way we use the arrays in .NET is beneficial. You also have to prove that the benefit outweighs (a) the cost of doing the change (which is non-trivial), plus (b) the opportunity cost of not doing something else instead.

andrei-faber commented 2 years ago

@theodorzoulias the damage caused by John, who is mostly innocent by the way, could be prevented if there was an analyzer available that could recognize the wrong pattern in his code - you're trying to compare things that have development costs different by 3 orders of magnitude, at least. Development of code analyzers is extremely difficult, especially those that solve more problems that create.

which is non-trivial - the fix that I proposed is absolutely trivial. One day of work for a competent programmer, and it's not even necessary to obsolete the old functions.

AraHaan commented 2 years ago

Oh instead I would prefer the old functions to be slated for removal in .NET 8 to enforce cleanup.

Why? When you upgrade .NET it expresses intent that you intend to update your code, like many BCL apis that were marked obsolete before and told you to "Stop using this, use x instead." this would be something similar to that for .NET 7 (I am talking about you BinaryFormatter).

Then in .NET 8 if they have not fixed those issues, they would be forced to as then those functions they called would be forcibly removed to clean out old, dead, and badly designed code that can detriment performance.

I guess if I ever become an area owner, I probably would argue something like this as well too. But then again, I feel like the methods we talked about in this issue (part of Array and List) could be removed from the types themselves and become extensions using Span<T>'s internally that call into the actual impl accepting spans making the extensions doing only the call into the internal impl (since Array's and List's contents can be made into Span's) and basically have it implemented only 1 time in the BCL increasing performance and reducing some maintenance cost.

theodorzoulias commented 2 years ago

The fix that I proposed is absolutely trivial. One day of work for a competent programmer, and it's not even necessary to obsolete the old functions.

@andrei-faber How many Microsoft employees does it take to change a lightbulb? This blog post was written by Eric Lippert nearly 20 years ago, but I guess it's still relevant:

That initial five minutes of dev time translates into many person-weeks of work and enormous costs. [...] At Microsoft we try very, very hard to not release half-baked software. Getting software right [...] is rather expensive! But we have to get it right because when we ship a new version of the script engines, hundreds of millions of people will exercise that code, and tens of millions will program against it.