Open CyrusNajmabadi opened 2 years ago
Personally, I think this should be an API addition, not a language change. Yes a few more overloads will have to be added to each such collection type, but I doubt those will increase the overall size in any meaningful fashion. If executable size is a concern, then publish-time trimming improvements should be invested into, to reduce the size. The same applies to the indexers - it would have been much less effort for library maintainers to add those overloads, instead of adding a special rule to the language. (Water under the bridge now.)
> The same applies to the indexers - it would have been much less effort for library maintainers to add those overloads, instead of adding a special rule to the language. (Water under the bridge now.)
I don't necessarily see how this is true. Users may literally be targeting versions of these libraries that do not have those API additions available.
@TahirAhmadov the primary concern here is adding language features that are intended to help existing apis, but which cannot unless all the apis then create all the duplicate entrypoints for all the new features.
Regarding older versions - for these methods, definitely extension methods should be used. For indexers, I see your point - there are no extension indexers yet. Still, if you look at the big picture, there is an "edict" that older versions of the frameworks are support-mode only; this leads to this situation where adding a few indexer properties in a patch release, which is trivial code to write, is avoided and instead, a (seemingly) much more expensive language change is introduced. I know this isn't as clear cut as it is with methods, but I (if I could make such a decision) would have still favored lifting the "support only" rule for simple additions which support new language features.
Regarding duplication of entry points, that's true but also is it that big of an issue? It's a very easy thing for even the beginner contributors to add; there are only so many collection types in BCL; other library maintainers can decide whether they want to spend the effort to add these overloads, but I would be surprised if there was a big number of collection types out in the wild; I use one custom collection - `ObservableList<T>` - which mimics `List<T>` and adds events to broadcast modifications - which I use in a few niche cases; 99% of collections are the few mostly useful ones from BCL.
My point is, I can see how the language change has its benefits, but in terms of cost/benefit, it's not worth it. There are scores of items where the language is the solution, and those should be prioritized over saving library maintainers from doing 2 hours worth of copy-pasta.
> Regarding duplication of entry points, that's true but also is it that big of an issue?
Yes. You have to find everything and make sure you didn't miss anything.
> My point is, I can see how the language change has its benefits, but in terms of cost/benefit, it's not worth it. There are scores of items where the language is the solution, and those should be prioritized over saving library maintainers from doing 2 hours worth of copy-pasta.
I don't see this as two hours of copypasta. This is effectively an ongoing api tax. Every list-like type has to know and maintain/test/support overloads for all these cases, always.
> Yes. You have to find everything and make sure you didn't miss anything.
1, I doubt that's a strict requirement. If the most used collections are tackled, we're 90% there, and from there on, we can await requests from the community. 2, finding everything is not that difficult - the same rules that the language uses to determine when to apply this indexing, can be used to search code and find the collection types.
> I don't see this as two hours of copypasta. This is effectively an ongoing api tax. Every list-like type has to know and maintain/test/support overloads for all these cases, always.
There is a similar type of cost for the language; it's a parsing rule which has to be maintained/tested/supported forever now.
Also, I'm pretty sure I could bang out this functionality for `List<T>`, with tests and all, in about 2 hours. Yes, let's multiply by all the collection types; we're talking about (very generously) maybe 40 hours of work, especially once the template for doing this is in place.
To me, the biggest upside to just adding these new overloads is the ease of the task. While it's not free to add new overloads to types, it's quantitative. Whereas with a language change, it introduces complexity - just like you wrote, one has to carefully design the rule to only allow the desirable use cases. This is risk. New overloads are almost zero-risk.
> 1, I doubt that's a strict requirement.
It's not strict. But when missed, it just means a friction point. I've run into this myself and it's annoying and feels buggy.
> 2, finding everything is not that difficult - the same rules that the language uses to determine when to apply this indexing, can be used to search code and find the collection types.
Like I said, it's a tax everyone must pay in perpetuity. Or we can just detect this once and have all APIs light up here.
I just thought of something else - BCL has many collection types which re-implement what `List<T>` does essentially, which date back to pre-generics days. Yes modifying all those `MatchCollection`s etc. can be a lot of work, but it also raises the question - is there a pathway to change those to have a common base class (either `List<T>` or `ReadOnlyList<T>` or maybe a new type)? The way I see it, maintaining all these disparate collection types is the real cost, and changing them to inherit from a common base class (which has all the bells and whistles) would save a lot more than changing the language to work around it.
Implement this for all base `IEnumerable` types, so that perhaps half of all enumerated values can benefit from the `System.Index` feature.
Adding Index support to existing library types (redux)
Summary
Expand on the ability to implicitly support System.Index and System.Range for additional operations.
For example, the following would be allowed:
Which would be rewritten to:
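As a minimal sketch of the shape of this rewrite, assuming a `List<int>` receiver and its existing `RemoveAt(int)` method: the proposed call appears only in a comment because it does not compile today, and the code that actually runs is the hand-written form it would presumably be rewritten to. `SummarySketch` and `RemoveLast` are made-up names for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SummarySketch
{
    // Hypothetical helper: removes the last element the way a rewrite of
    // `list.RemoveAt(^1)` presumably would, i.e. via Count - 1.
    public static List<int> RemoveLast(List<int> list)
    {
        // Proposed form (not valid C# today): list.RemoveAt(^1);
        list.RemoveAt(list.Count - 1);
        return list;
    }

    public static void Main()
    {
        var list = RemoveLast(new List<int> { 10, 20, 30 });
        Console.WriteLine(string.Join(", ", list)); // prints "10, 20"
    }
}
```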
Motivation
Adding index and range support to existing library types made it possible to use System.Index and System.Range in limited cases to support types that didn't know about these new constructs, or which didn't want to update to support them. For example, it was now possible to write either:
These would translate to the following respectively:
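A small sketch of this existing implicit support, assuming `List<int>` for the index case and `string` for the range case. The class and method names are illustrative only; the comments show the hand-written equivalents of each expression.

```csharp
using System;
using System.Collections.Generic;

public static class ExistingSupportSketch
{
    // List<T> has Count and this[int], so an Index is accepted implicitly.
    // Equivalent to list[list.Count - 1].
    public static int Last(List<int> list) => list[^1];

    // string accepts a Range through the implicit pattern.
    // Equivalent to s.Substring(1, s.Length - 2).
    public static string Inner(string s) => s[1..^1];

    public static void Main()
    {
        Console.WriteLine(Last(new List<int> { 1, 2, 3 })); // prints 3
        Console.WriteLine(Inner("[abc]"));                  // prints abc
    }
}
```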
However, while this was good for the general indexing case, it didn't extend any further. So lots of sensible methods (like `RemoveAt` or `InsertAt`) still require passing `int`s and doing the indexing computation manually.
This is solvable at the API level by having new overloads be added either as instance or extension methods. However, this may be a lot of work for an API to do. This can be addressed through source-generators. However, that would involve metadata bloat, as well as indirections through both the extension, and the calls to `System.Index.GetOffset`. If those alternatives are unpalatable, a pure language approach might then be the desirable way forward.
Detailed design
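Given a candidate set of methods, if the methods are not applicable and the methods have a receiver that is Countable and has a `this[int]` indexer, then:
1. Treat any `Index` parameter as having an implicit conversion to `int`.
2. Find the `Index` arguments that were converted to `int`. For those arguments, perform the following translation. Given a receiver `receiver` and such an argument `arg`:
   - When the argument is of the form `^expr2` and the type of `expr2` is `int`, it will be translated to `receiver.Length - expr2`.
   - Otherwise, it will be translated as `arg.GetOffset(receiver.Length)`.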
Note: The receiver and Length expressions will be spilled as appropriate to ensure any side effects are only executed once.
Note: The order of evaluation should follow the same rules specified here.
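The translation and the spilling note can be sketched by hand. This is only an approximation of what a compiler might emit, assuming `List<string>` as the receiver type, with `Count` playing the role of `Length`. `LoweringSketch`, `GetList`, `RemoveFromEnd`, and `RemoveAtIndex` are hypothetical names; `GetList` stands in for a receiver expression with a side effect.

```csharp
using System;
using System.Collections.Generic;

public static class LoweringSketch
{
    public static int ReceiverEvaluations;

    // Hypothetical receiver expression with an observable side effect.
    public static List<string> GetList(List<string> list)
    {
        ReceiverEvaluations++;
        return list;
    }

    // Hand-written stand-in for `GetList(list).RemoveAt(^n)` where n is an int.
    public static void RemoveFromEnd(List<string> list, int n)
    {
        var receiver = GetList(list);   // receiver spilled to a temp, evaluated once
        int length = receiver.Count;    // length read once from the spilled receiver
        receiver.RemoveAt(length - n);  // ^n  =>  receiver.Length - n
    }

    // Hand-written stand-in for the general case with an arbitrary Index argument.
    public static void RemoveAtIndex(List<string> list, Index arg)
    {
        var receiver = GetList(list);
        receiver.RemoveAt(arg.GetOffset(receiver.Count)); // arg.GetOffset(receiver.Length)
    }

    public static void Main()
    {
        var list = new List<string> { "a", "b", "c", "d" };
        RemoveFromEnd(list, 1);   // removes "d"
        RemoveAtIndex(list, ^2);  // removes "b" (offset 3 - 2 = 1)
        Console.WriteLine(string.Join(", ", list)); // prints "a, c"
        Console.WriteLine(ReceiverEvaluations);     // prints 2
    }
}
```

Each call evaluates the receiver expression exactly once, even though the lowered form mentions the receiver twice.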
Drawbacks
This could allow `Index`s where inappropriate, for example for an `int` parameter that does not denote a position in the receiver. If this is a concern it could potentially be addressed by requiring that the parameter name be `index`. We could also consider a requirement that the original type (not the instantiated type) was `System.Int32`. This would prevent an `Index` being usable just because a generic happened to be instantiated to `int`.
It may however just not be a concern in practice and we can allow the most generous approach possible.
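To make the concern concrete, here is a hedged sketch. The specific case is an assumption and not taken from the proposal text: `List<int>.Add(T item)` has an `int` parameter only because `T` happens to be `int`. Under the most generous rule, `list.Add(^1)` would presumably bind and add a computed position as a value. The code below runs the hand-written equivalent; `DrawbackSketch` and `AddHatOne` are made-up names.

```csharp
using System;
using System.Collections.Generic;

public static class DrawbackSketch
{
    // Hand-written equivalent of what `list.Add(^1)` might become if any
    // int parameter accepted an Index, regardless of name or original type.
    public static List<int> AddHatOne(List<int> list)
    {
        // Proposed form (not valid C# today): list.Add(^1);
        list.Add(list.Count - 1); // adds a position as if it were an element
        return list;
    }

    public static void Main()
    {
        var list = AddHatOne(new List<int> { 5, 6, 7 });
        Console.WriteLine(string.Join(", ", list)); // prints "5, 6, 7, 2"
    }
}
```

Either restriction discussed above would reject this call: the parameter is named `item`, and its original type is `T` rather than `System.Int32`.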
Supporting `Range` in the same way does seem nice. But it also feels much easier and more reasonable for libraries to expose this themselves. There are generally orders of magnitude fewer `Range` methods than `Index` methods in a cursory exploration of the BCL. Furthermore, it's less clear how such a pattern could necessarily be detected. Unlike the `Index` approach (which can work by assuming there's an implicit conversion present in some cases), this would not be done with conversions. This would be expansion of one arg to many, similar to how `params` works. Except that there might be many locations in the argument list this could happen, with very little clarity on how to match them up.
We could also look for specifically `int index, int count /*or length*/`, but it seems very brittle and difficult to model. Hopefully these are rare enough that just adding overloads in the type itself is a palatable approach.
If a type had a method that took an `int` and an extension that took an `Index`, then we'd now prefer the former. While this would be a break, it might be acceptable under the presumption that the extension existed to supply the functionality the language now provided.
Alternatives
This could be added as instance or extension methods to the types themselves. This could be done manually or using source generators.
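A rough sketch of the extension-method route for `List<T>`, covering both an `Index` overload and a `Range` overload. These overloads are hypothetical and do not exist in the BCL under these signatures as far as this sketch assumes; they only forward to the existing `RemoveAt(int)` and `RemoveRange(int, int)` using `Index.GetOffset` and `Range.GetOffsetAndLength`.

```csharp
using System;
using System.Collections.Generic;

public static class ListIndexExtensions
{
    // Hypothetical overload: forwards an Index to the existing RemoveAt(int).
    public static void RemoveAt<T>(this List<T> list, Index index)
        => list.RemoveAt(index.GetOffset(list.Count));

    // Hypothetical overload: expands a Range into the (index, count) pair
    // that the existing RemoveRange(int, int) expects.
    public static void RemoveRange<T>(this List<T> list, Range range)
    {
        var (offset, length) = range.GetOffsetAndLength(list.Count);
        list.RemoveRange(offset, length);
    }
}

public static class AlternativesSketch
{
    public static List<int> Demo()
    {
        var list = new List<int> { 0, 1, 2, 3, 4, 5 };
        list.RemoveAt(^1);      // no applicable instance method, so the extension is used; removes 5
        list.RemoveRange(1..3); // extension again; removes 1 and 2
        return list;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(", ", Demo())); // prints "0, 3, 4"
    }
}
```

The `Range` overload shows the one-argument-to-two expansion that is awkward to model in the language but straightforward to write in a library. The `RemoveAt(^1)` call is also the kind of call site whose binding would move from the extension to the instance method if the language rule were adopted.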
Unresolved questions
Design meetings
https://github.com/dotnet/csharplang/blob/main/meetings/2022/LDM-2022-09-28.md#ungrouped