fsharp / fslang-suggestions

The place to make suggestions, discuss and vote on F# language and core library features
342 stars 20 forks source link

Type-directed resolution of [ .. ] syntax #1086

Open dsyme opened 2 years ago

dsyme commented 2 years ago

I propose we consider doing type-directed resolution of the [...] syntax if used with a non-list known type.

type SomeApi() =
    static member SomeEntryPoint(xs: ImmutableArray<_>) = ...

SomeApi.SomeEntryPoint([1; 2; 3])

The main thing here is that whenever a method needs a collection (of more or less any kind), as input, you can always use, for example, [1;2;3] and not think about it.

This means F# APIs can reasonable choose to adopt ImmutableArray or other such collection types for inputs without particularly changing user code.

If this is considered reasonable, the question arises about when this is allowed. For example, it's possible that it should just be at method argument position - F# has a tradition of only applying such "magic" (e.g. lambda --> delegate, ParamArray and more recently op_Implicit) only at method applications. However in theory it could also be at any position by type-directed resolution, e.g.

let x: ImmutableArray<_> = [1;2;3]

For some reason people don't seem to mind these transformations at method calls.

From a previous suggestion:

Have the [ ... ] syntax resolved in a type-directed way (e.g. "if the target type is known to be a collection C with a builder pattern or IEnumerable constructor, then treat the syntax as if it is constructing a C, otherwise treat it as a regular F# list"))?

The syntax would work with

This is somewhat related to the ListCollector type in F# 6 which is use as our builder for F# immutable lists and which gave the performance results described here. The only subtlety there is the presence of AddManyAndClose that allows lists to be collected in tail position without copying the list.

The underlying problem here is that we over-emphasise the "list" type in F# - that is, in an API designer wants to get nice, minimal callside syntax, they have no choice but to accept F# lists. However this has problems

Pros and Cons

The advantages of making this adjustment to F# are:

The disadvantages of making this adjustment to F# are:

Extra information

Estimated cost (XS, S, M, L, XL, XXL): M

Related suggestions: (put links to related suggestions here)

Affidavit (please submit!)

Please tick this by placing a cross in the box:

Please tick all that apply:

For Readers

If you would like to see this issue implemented, please click the :+1: emoji on this issue. These counts are used to generally order the suggestions by engagement.

dsyme commented 2 years ago

Note this is made much easier by https://github.com/dotnet/fsharp/pull/11592

Separately, there was discussion about whether :: and @ should also work for other collection types, presumably through specific translation in type checking when either argument or the result is already known to be a non-list collection type. In those cases

x :: ys   
xs @ ys   // where xs, ys or Result type has type C

could be processed equivalently to:


([ yield x
   yield! ys ]: C)

([ yield! xs
   yield! ys ]: C)

Respectively. This may however be too subtle.

isaacabraham commented 2 years ago

I'm (as usual - sorry) not convinced by this. Are we saying that this is to replace e.g.

let x = Set [ 1 .. 10 ]

with

let x:Set = [ 1 .. 10 ]

I suppose if the type of the LHS is e.g. dictated by the function return type or some library you don't own, it saves a few characters - I'm just not convinced by that the saving is really worth it in comparison to either effort spent on other features + extra cost of "magic" conversions happening (I know it's type directed, but there's still some magic going on).

baronfel commented 2 years ago

I generally like this, I think

Mutable collection types that have a constructor (optionally with a capacity arg), and an 'Add' method to add the elements.

is going to be the hardest to realistically implement though, because I'm not sure there's a consistent pattern for collection types with a capacity parameter argument. I haven't done an API search across the BCL or anything to verify this gut feeling yet though. Error messages for these conversions would need to be spot-on as well to prevent user confusion and potentially allow for IDE code fixes.

cartermp commented 2 years ago

@isaacabraham

it saves a few characters

Something worth exploring is seeing what performance benefits (if any) could be added here.

Today your example:

  1. Creates a sequence
  2. Converts that sequence into a list
  3. Constructs the set with the IEnumerable-acceptable constructor
  4. Uses IEnumerable constructs to add items to the underlying set structure

With a little more compiler analysis, I imagine that at least step 2 could be eliminated.

isaacabraham commented 2 years ago

@cartermp performance is something I hadn't even considered. From what I'm reading above though, won't step 2 be eliminated anyway by https://github.com/dotnet/fsharp/pull/11592?

baronfel commented 2 years ago

spelling out what might be possible with this feature

let makeWhackyCollection() : ImmutableList<_> =
    [ 1; 2
      for x in 1 .. 100 do
          x+4
      if a > b then 
          for x in 1 .. 100 do 
              x+5
              x+7
      3; 4 ]

ImmutableList has both an Add and an AddRange member, so based on this compiler analysis we might be able to emit something like:

let ilist = ImmutableList.Empty.ToBuilder() // TODO: some syntax for expected size, or known size when dealing with literals?
                                            // TODOx2: ImmutableList has no public constructor so sized-based optimizations aren't applicable.  How should we detect the 'base' instance to use in these cases?
ilist.Add 1
ilist.Add 2
ilist.AddRange [5..104]
ilist.Add 3
ilist.Add 4
ilist.ToImmutable()

in the case where a > b, and perhaps based on heuristics even the individual adds might be fused together. IMO the perf would come from no intermediate collection, appropriate sizing to prevent resizing during creation, and judicious use of AddRange to eliminate per-Add overheads.

Implementation Note: ImmutableList has no public constructor so sized-based optimizations aren't applicable. How should we detect the 'base' instance to use in these cases? For ImmutableList you're supposed to use .Empty, potentially with the ToBuilder() method, do we have to resort to probing particular members? Is there a consistent pattern in the BCL/popular third-party collections for this case?

Lanayx commented 2 years ago

Seems to be similar to https://github.com/fsharp/fslang-suggestions/issues/625

isaacabraham commented 2 years ago

@baronfel that seems potentially neat - I do wonder whether putting something that has such potential benefits behind something as implicit as a type-directed conversion is the right thing to do. I would almost certainly like to see a library-level function that could be called in lieu of this e.g. [ 1 .. 100 ] |> FSharp.Core.fastConvert<ImmutableArray> or something.

baronfel commented 2 years ago

@isaacabraham that would be tough to do at runtime I think, because at runtime you already have the 'constructed' version of the list/seq/array/etc which loses the potential benefit of doing away with the intermediate collection. perhaps if you had an Expr<'t list> then such a function could perform a fast-rewrite? not 100% sure how that would work.

isaacabraham commented 2 years ago

Just to elaborate on my previous comment of "magic" - one of the strengths of F# is that this sort of stuff is very, very rare in the language. Nearly everything is explicit in terms of conversions.

yield was an interesting one in that it introduced a few weird corner cases but I think on the whole it solved way more problems than it introduced (especially around Elmish). Thankfully most people we see learning F# don't come across many such corner cases so there's not much to educate on (although having to explain yield, yield! and implicit yield isn't something I enjoy doing on those rare occasions where it's needed).

The new implicit op support in F#6 is another one that I think again will probably help more than it hinders e.g. going from some concrete type to a base class, and it's pretty easy to explain this one, but it'll require some explanation of what op_implicit is etc. - something that until now has been a completely foreign concept for F# developers.

This feature looks like going a step further - explanation of constructors and builder patterns, and the fact that let x : Set = [ 1 .. 10 ] isn't the same from a performance point of view as let x = Set [ 1 .. 10 ] will be challenging.

dsyme commented 2 years ago

One main thing here is that whenever a method needs a collection (of more or less any kind), as input, you can always use "[1;2;3]" and not think about it. I'll add that to the notes.

It's possible that it should just be at method argument position for that reason. F# has a long tradition of applying such "magic" (e.g. lambda --> delegate, and now op_Implicit) only at method argument position.

Isaac's concerns are very reasonable.

KathleenDollard commented 2 years ago

The conversation on performance is interesting. I do not think we should add this feature the way it's described in the issue just for performance.

If it's profound enough to be visible in performance scenarios, the logical conversation is whether the default should/could be faster, or a special "fast" method supplying more info for optimization is worthwhile.

I'm thinking more about the readability. Is it helpful to reduce the concept count on grouping syntax, particularly with the curly for sequence. Here are some variations:

let xx = seq { "red"; "yellow"; "blue"}
let yy = ["red"; "yellow"; "blue"] |> List.toSeq
let zz = ["red"; "yellow"; "blue"] : seq<string>
let vv: seq<string> = ["red"; "yellow"; "blue"]

I embrace that my eye on this is still relatively naïve, but from that perspective the first stands out as more confusing visually. It doesn't use square brackets (as array does, just with the extra pipe). The presence of "seq" and the lack of field names clue me in that this is not a normal record creation. Since record creation is much more common, and I may be coming from a language that uses curly for code block delimiters, curlies don't jump to mind for a collection-like thing.

From a naïve point of view, the nuances between the last three are interesting.

Lanayx commented 2 years ago

After thinking it over I've got one concern - in cases where I mostly don't specify types it will default to list which is a bad default (from performance perspective) in most cases, so I would rather specify seq or array directly to be on the safe side (since it's easier to type than a full type). So I would like to see a warning when such defaulting to list occurs, such warning can be opt-in, but should be present

dsyme commented 2 years ago

I adjusted the issue description a little to take this comment of mine into account.

teknikal-wizard commented 2 years ago

My initial thought on this that I will find it harder to know what code represents just by looking at it. I would need to look at type hints to know what collection I have.

Also, by changing the type args of the called function, you will change the type of the passed data, which might be surprising, particularly if it goes from an eager to a lazy collection (unless I misunderstand something!?).

dsyme commented 2 years ago

Also, by changing the type args of the called function, you will change the type of the passed data, which might be surprising,

I think this is to be expected

particularly if it goes from an eager to a lazy collection (unless I misunderstand something!?).

I'd imagine [ ... ] always be eager.

teknikal-wizard commented 2 years ago

So would that mean a list-comprehension style computation (like makeWhackyCollection above) defined as [ ... ] and passed to a function that expects a seq would be evaluated immediately, but if we define it with seq {} it won't? Of course this is currently the case with the auto List :> IEnumerable cast but from what I understand, the [ ... ] isn't going to start as a list and be converted, it will be a seq from the outset via the type inference.

This is a very weak and hand wavy, gut feel kind of comment so feel free to ignore, but one of the reasons I fell in love with F# was that it was simple to learn, there tended to be one way, or at least one common way, of doing the elementary operations so one person's F# is generally much like another's. It is also easy to reason about at a glance. Once there are multiple 'styles' of F# and added implicit behaviour it might risk losing some of that charm.

dsyme commented 2 years ago

the [ ... ] isn't going to start as a list and be converted, it will be a seq from the outset via the type inference.

The [...] syntax could only really be an eagerly-evaluated collection. Given the rough guideline of rules above, it could only be used against collection types that define either an eager builder pattern or an eager create+Add pattern.

Once there are multiple 'styles' of F# and added implicit behaviour it might risk losing some of that charm.

Yes, understood.

The underlying problem here is that we over-emphasise the "list" type in F# - that is, in an API designer wants to get nice, minimal callside syntax, they have no choice but to accept F# lists. However this has problems

Ideally an API could accept ImmutableArray (or similar) and store those directly internally and not suffer any overheads in performance or usability.

APIs can also accept IEnumerable<_>, which is not too bad, though this still encourages people to use [...] on the callside.

teknikal-wizard commented 2 years ago

that is, in an API designer wants to get nice, minimal callside syntax, they have no choice but to accept F# lists

Are flexible types (#seq) useful here? It feels like almost the same result, in that the library can accept anything that implements IEnumerable, and the caller doesn't need to explicitly cast so could pass in any collection they like.

I guess it doesn't help the perf issues in the same way as you would still start with a List if the caller chose that type.

Edit: I think that is the point you were making at the end of your last post, sorry! I guess the question is, whose responsibility is it to choose the collection type they want to use, the caller or the target?

I suppose the issue as just stated is that library designers want to offer a simple API, so feel obliged to accept a range of types including List where they actually want a specific, performant collection.

Still not personally convinced that this is best solved through inference rather than good old functions and types! I would just ask for what I need explictly. If you want an array, ask for an array, not a seq etc.

an API designer wants to get nice, minimal callside syntax, they have no choice but to accept F# lists

If the syntax is really the problem, that could always be improved?

If I had to call a lib that required an array and it was annoying me as I wanted to use lists, I could write a tiny function that adapts it for me and hide it away somewhere, or even shadow the original function. It would be my problem to fix, as I am the one with the issue, not the library author.

Anyway, I'll butt out now, and I appreciate the replies! :)

charlesroddie commented 2 years ago

It would help to evaluate this if we understood list -> ImmutableArray conversion thoroughly which I agree is important at this point for F#. It would be nice to make this easier and the proposal does that some of the time. But not all of the time, so there is a consistency problem.

Original API:

type T =
    static member A(x:int list, y:int list) = ()

Usage:

let x = [ 0 ]
T.A(x, [ 1 ])

With this suggestion, after changing the API to ImmutableArrays instead of lists, a user will only have to alter the code defining x:

let x = [: 0 :]
T.A(x, [ 1 ])

What does this represent in terms of code style?

  1. An intermediate state, getting something to work quickly, to be homogenized to A(x, [: 1 :]) later?
  2. An intermediate state, to be homogonized to let x = [ 0 ] when F# allows choosing the meaning of []?
  3. A final state, where users use [] freely and only change it if there are compiler errors (on the let x line).
  4. A final state, where users understand type-directed resolutions and intentionally use [ 1 ] as a shorthand.

Those are the only options I can think of, but 1 is not good because it explicitly allows code that isn't in a final state, 2 is unlikely to happen, 3 accepts a moderate degree of user confusion, and 4 requires an unreasonable level of knowledge of details to code in F#.

Happypig375 commented 2 years ago

@charlesroddie There is precedence.

type T =
    static member A(x: PrintfFormat<_,_,_,_>, y: PrintfFormat<_,_,_,_>) = ()
T.A("%O", "%O") // This is fine
let x = "%O"
T.A(x, "%O") // This errs
dsyme commented 2 years ago

@charlesroddie Yes, it's the core problem I agree

There is precedence.

@Happypig375 This is true technically - and there are other examples (including some others from F# 2 onwards, and some introduced in F# 6). But I think @charlesroddie's point is that this change in particular would greatly increase the frequency of this in beginner-facing code, and in particular concern that it requires "an unreasonable level of knowledge of details" of type inference to understand why things can be dropped in some situations and not others.

roboz0r commented 2 years ago

The heart of the issues seems to be that the best collection instantiation syntax [ .. ] was used for list. This would essentially be a potentially breaking change to now use the [ .. ] syntax for an abstract CollectionBuilder type (or whatever name is chosen) that the compiler re-writes to an efficient allocation of potentially any IEnumerable at compile-time.

This would be great for Fable / Feliz UI syntax and other API's that expect different collection types without needing to pollute the code with calls to constructors and with better performance than allocating a List and also the target type. I think that perhaps the best experience from a beginner's perspective to minimise surprises (and working with the tooling) is to make the CollectionBuilder type visible to tooling during editing.

let x = // CollectionBuilder<int>
    [1; 2; 3] 

The inference changes at the first call-site.

x |> List.map ((+) 1)  // Inference makes a list
MyAPI.Call(x) // Inferred based on method signature
let y : int array = // Inferred array at declaration
    [1; 2; 3] 

If x is never used in subsequent code (seems unlikely except for prototyping) I'd suggest the compiler emits a warning that CollectionBuilder will be compiled to a list. You'd get rid of the warning with a type annotation let x : list<_> = [1; 2; 3] in this case.

This would be a similar experience to the way numeric types are inferred to be int without explicit annotations except that an additional warning is given.

dsyme commented 2 years ago

@roboz0r I edited your reply to use lower case list as upper case List<_> can be confused with System.Collections.Generic.List<_>

This would essentially be a potentially breaking change

Just to mention that we'd only apply the rule if the expression is already known to have a type that can never unify with _ list.

xperiandri commented 2 years ago

Is this feature still in discussion or is already approved?

dsyme commented 2 years ago

In discussion. There are plenty of reasons not to do this....

brianrourkeboll commented 1 year ago

A few comments (some of which I have been meaning to leave for a year or more...):

Computation expressions

Perhaps this is old news, but I noticed some time ago that RFC FS-1087 enabled creating computation expressions that effectively support a subset of this, albeit only for construction and not pattern-matching.

For example, a type-directed collection computation expression can be used to construct any mutable collection that has a default constructor and an Add method:

let xs : ResizeArray<int> = collection { 1; 2; 3 }
type T =
    { A : HashSet<string>
      B : SortedSet<int> }

let t =
    { A = collection { "a"; "b"; "c" }
      B = collection { 3; 2; 1 } }

While it works, I don't know that I would personally make much use of it as long as the target type must be known. The type-specific resizeArray, dictionary, etc., builders are nicer in practice until there is a way to suspend inference of the target type—or specify a default.

Generic parameter defaults

(Let me know if you think this is worth moving to its own suggestion, even if only to be rejected on the record.)

Has anyone (in any context) proposed something like TypeScript's generic parameter defaults for F# or C# (or the CLR—it would likely be better if both languages supported the feature)? Edit: All I found was https://github.com/dotnet/csharplang/discussions/278, which doesn't look like it got any traction and hasn't been touched in 6 years, stemming from https://github.com/dotnet/roslyn/issues/6248...

I haven't thought through how feasible it might be to implement, but it seems to me that generic parameter defaults[^1] could be a general solution to the problem of supporting type-directed resolution of things while enabling the avoidance of resolution ambiguities.

It would incidentally also provide a first-class way of representing the F# compiler's existing behavior of defaulting to obj (as in #696), int, and seq<'a> in various otherwise generic (unannotated) scenarios, as well as the so-called "natural type" of expressions as described in the C# collection literals proposal.

That is, with generic parameter defaults, the provisional type of a collection literal defined with [ .. ] might look something like:[^2]

let (xs : 'a = int list) = [1; 2; 3]

The value xs could be used anywhere 'a (when inferred) or int list could be used, with the following behavior:

  1. If a concrete type could be inferred for 'a by its usage (e.g., when used as a function parameter, assigned to a record field, etc.), 'a would be constrained to that concrete type.
  2. If the usage site would not otherwise constraint 'a to a concrete type, the specified default type (int list in this case) would be resolved,[^3] unless the usage site were itself also generic, in which case the original generic type 'a = int list would propagate until the last usage site, at which point the specified default type would be resolved.

This behaves somewhat similarly to the compiler's existing defaulting to int when arithmetic operators are used in non-inlined functions.

Consider:

let plus x y = x + y

Without further information, the inferred type is currently int -> int -> int—there is nothing whatsoever in the signature to indicate that it could be used with any other type—but it can in fact be used with any other type with a (+) operator, e.g.:

let s = plus "a" "b"

With generic parameter defaults, the signature would become something like:

val plus : x:('a = int) -> y:('b = int) -> ('c = int)

Of course, just as already happens with a non-inlined generic function like plus, if 'a, 'b, or 'c were resolved to a type not compatible with the function body, a compile-time error would be raised.

Consider also a function like this:

let f xs = for x in xs do printfn $"{x}"

whose signature right now is

val f : xs:seq<'a> -> unit

but whose signature with generic parameter defaults would in isolation become:

val f : xs:('a = seq<'b>) -> unit

Here's how the inference would work together:

let f xs = for x in xs do printfn $"{x}"
let xs = [1; 2; 3]
f xs

The resolution of int list is driven by the unannotated [ .. ] literal and its 'a = 'b list default

val f : xs:int list -> unit
val xs : int list
val it : unit

But if a concrete type were annotated at any point, it would take precedence:

The int array annotations oust the = _ list default of the [ .. ] literal

let f (xs : int array) = for x in xs do printfn $"{x}"
let xs = [1; 2; 3]
f xs

or

let f xs = for x in xs do printfn $"{x}"
let xs : int array = [1; 2; 3]
f xs

would both yield

val f : xs:int array -> unit
val xs : int array
val it : unit

C# collection literals

I haven't really paid attention to how the C# collection literal proposal (https://github.com/dotnet/csharplang/issues/5354) has been going, but it may be worth considering which potential solutions might apply to both languages—for example, perhaps generic parameter defaults could work for both languages as a first-class representation of the "natural type" idea described in the C# proposal (as well as for the scenarios outlined above in F#, and likely others).

[^1]: I at first also considered whether it would make sense to also require specifying some kind of constraint on the generic type parameter (like 'a = int when 'a : (static member (+) : ...) or #seq<int> = int list or 'a = int list when 'a :> seq<int>), but for collection literals it would not work for Span<_> (which does not implement IEnumerable<_>), and it rather implies that the type after the = must derive in some way from the type before it, which relationship would nevertheless need to be specified, and which would in some cases require an inline context. [^2]: The parentheses in an annotation like (xs : #seq<int> = int list) would be required, because this is already valid code that compiles:

    let list = [1; 2; 3]
    let int = id
    let (=) x y = x @ y
    let xs : #seq<int> = int list = [1; 2; 3]

[^3]: Something like this actually already happens, albeit with a warning—in the following, xs is constrained to be int list.

    let xs : #seq<int> = Unchecked.defaultof<_>
    let len = List.length xs
Happypig375 commented 1 year ago

@dsyme Thoughts on above comment?

Happypig375 commented 1 year ago

@dsyme ?

dsyme commented 1 year ago

@Happypig375 I have read it - I'm thinking it over. There is a defaulting mechanism used for constructs in FSharp.Core, for example.

Happypig375 commented 6 months ago

https://github.com/dotnet/csharplang/blob/main/proposals/csharp-12.0/collection-expressions.md We should definitely consider the equivalent C# feature here.