dotnet / docs

This repository contains .NET Documentation.
https://learn.microsoft.com/dotnet
Creative Commons Attribution 4.0 International

The collection expressions article in the language reference does not mention a lot of information #39586

Closed SENya1990 closed 7 months ago

SENya1990 commented 8 months ago

Type of issue

Missing information

Description

Dear MS Doc Team,

Recently I had a discussion about the collection expressions C# language feature in the GitHub issue dedicated to the feature: https://github.com/dotnet/csharplang/issues/7913#issuecomment-1950302645

While working with the feature, I discovered two scenarios in which it didn't work as expected. After a discussion with members of the compiler team, it turned out that the documentation for the Collection Expressions feature either never mentions some crucial pieces of information or mentions them only implicitly. I will list them here:

  1. The article does not explicitly state that the collection expression operators materialize the collection. This is crucial information because:

    • When a collection expression is used with the IEnumerable<T> type, it forces materialization of the collection and breaks lazy evaluation. This is important for developers who may consider using the new syntax instead of LINQ methods like Concat, Prepend, and Append for brevity.
    • As a consequence, collection expressions can't be used with infinite IEnumerable<T> sequences. The application will hang because the compiler-generated code for the collection expression attempts to materialize an infinite sequence.
    • Materialization affects performance.
  2. The compiler team explicitly stated that the Collection Expressions feature and LINQ have different use cases and scopes: https://github.com/dotnet/csharplang/issues/7913#issuecomment-1950292510

The documentation does not cover this; in fact, it does not mention LINQ at all. There is no description of the relationship between Collection Expressions and LINQ, or of when a developer should pick one over the other. You can find more about the difference here: https://github.com/dotnet/csharplang/issues/7913#issuecomment-1950514713
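To make the materialization point concrete, here is a minimal sketch (C# 12 or later). The class name, the `Source` method, and the `Produced` counter are hypothetical and exist only for this illustration. It compares a lazy LINQ `Prepend` with a collection expression that spreads the same sequence into an `IEnumerable<int>`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LazyVersusMaterialized
{
    // Counts how many elements Source() has produced so far.
    public static int Produced;

    // Hypothetical finite source, used only for this illustration.
    public static IEnumerable<int> Source()
    {
        for (int i = 1; i <= 3; i++)
        {
            Produced++;
            yield return i;
        }
    }

    public static void Main()
    {
        // LINQ: nothing is enumerated until the result is iterated.
        Produced = 0;
        IEnumerable<int> lazy = Source().Prepend(0);
        Console.WriteLine(Produced); // 0

        // Collection expression: the spread enumerates Source() immediately,
        // even though the target type is IEnumerable<int>.
        Produced = 0;
        IEnumerable<int> eager = [0, .. Source()];
        Console.WriteLine(Produced); // 3
    }
}
```

If `Source()` never terminated, the LINQ line would still return immediately, while the collection expression line would never complete.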

  3. The reported article does not specify that a collection expression can be converted directly to an interface type like IEnumerable<T>. In this case the compiler generates a new read-only wrapper type for an array or list that implements several standard collection interfaces. You can read about it in the feature specification: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-12.0/collection-expressions#interface-translation

  4. The feature specification and the documentation for the C# 8 ranges and indexes feature (https://learn.microsoft.com/en-us/dotnet/csharp/tutorials/ranges-indexes#type-support-for-indices-and-ranges) introduce a new classification of .NET types: Countable types. These are types with an integer Count or Length property with an accessible getter. The compiler uses them through duck typing: these types do not have to implement collection interfaces with a Count property; the compiler uses their Count/Length property anyway.

The reported article does not mention that the Collection Expressions feature uses Countable types and duck typing. This is important for developers to know, especially when working with a specific collection that the Collection Expressions feature does not support correctly. You can see an example of such a collection here: https://github.com/dotnet/csharplang/issues/7913#issuecomment-1950297412
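As a small illustration of what "Countable" and duck typing mean, here is a sketch based on the ranges and indexes feature, where the behavior is documented. The `Letters` type is hypothetical. It implements no collection interface and exposes only an `int` `Length` property and an `int` indexer, yet the compiler accepts the `^1` index on it. How collection expressions use the same classification is described only in the linked discussion, so this sketch does not try to show that.

```csharp
using System;

// Hypothetical type for illustration: no collection interfaces,
// only an int Length property and an int indexer.
public sealed class Letters
{
    private readonly char[] _items = { 'a', 'b', 'c' };

    public int Length => _items.Length;

    public char this[int index] => _items[index];
}

public static class CountableDemo
{
    public static void Main()
    {
        var letters = new Letters();

        // The compiler finds Length by name and shape, not through an interface,
        // and evaluates letters[^1] as letters[letters.Length - 1].
        Console.WriteLine(letters[^1]); // c
    }
}
```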

In addition, in my opinion the concept of a Countable type deserves its own dedicated article because:

  5. The feature specification (https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/proposals/csharp-12.0/collection-expressions#spec-clarifications) introduces the notion of a well-behaved collection type. Here is the description:

Collections are assumed to be well-behaved. For example:

  • It is assumed that the value of Count on a collection will produce that same value as the number of elements when enumerated.
  • The types used in this spec defined in the System.Collections.Generic namespace are presumed to be side-effect free. As such, the compiler can optimize scenarios where such types might be used as intermediary values, but otherwise not be exposed.
  • It is assumed that a call to some applicable .AddRange(x) member on a collection will result in the same final value as iterating over x and adding all of its enumerated values individually to the collection with .Add.
  • The behavior of collection literals with collections that are not well-behaved is undefined.

These restrictions are important for developers using the Collection Expressions feature. They are not mentioned in the reported article, and they are hard to find in the text of the feature specification. Feature specifications also state explicitly that the actual implementation may differ from them, which undermines their value as a reference. Therefore, the restrictions should be mentioned in the official documentation, for example, in the reported article.
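For readers who want to see what "not well-behaved" can look like, here is a minimal sketch (C# 12 or later) of a hypothetical `MisreportingBag` type whose `Count` disagrees with the number of elements it enumerates. The expected output is deliberately not stated: the specification quoted above leaves the behavior undefined, so the result may be a wrong length, an exception, or something else, and it may change between compiler versions.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type for illustration: Count reports 2,
// but enumeration yields 3 elements.
public sealed class MisreportingBag : IEnumerable<int>
{
    public int Count => 2;

    public IEnumerator<int> GetEnumerator()
    {
        yield return 1;
        yield return 2;
        yield return 3;
    }

    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class WellBehavedDemo
{
    public static void Main()
    {
        var bag = new MisreportingBag();
        try
        {
            // Undefined behavior per the specification: the compiler may or
            // may not rely on Count when sizing the result.
            int[] copy = [.. bag];
            Console.WriteLine($"Length: {copy.Length}");
        }
        catch (Exception ex)
        {
            Console.WriteLine($"Threw {ex.GetType().Name}");
        }
    }
}
```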

Page URL

https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/collection-expressions

Content source URL

https://github.com/dotnet/docs/blob/main/docs/csharp/language-reference/operators/collection-expressions.md

Document Version Independent Id

49349466-94fd-00b5-93da-e4200d9f9ec8

Article author

@BillWagner

Metadata


Associated WorkItem - 226922

BillWagner commented 7 months ago

Thanks for providing all these details @SENya1990

This is great information for the next update to this article.

I've added this to our next sprint, and I'll make these updates.