Open dsyme opened 2 years ago
Note this is made much easier by https://github.com/dotnet/fsharp/pull/11592
Separately, there was discussion about whether ::
and @
should also work for other collection types, presumably through specific translation in type checking when either argument or the result is already known to be a non-list collection type. In those cases
x :: ys
xs @ ys // where xs, ys or Result type has type C
could be processed equivalently to:
([ yield x
yield! ys ]: C)
([ yield! xs
yield! ys ]: C)
Respectively. This may however be too subtle.
I'm (as usual - sorry) not convinced by this. Are we saying that this is to replace e.g.
let x = Set [ 1 .. 10 ]
with
let x:Set = [ 1 .. 10 ]
I suppose if the type of the LHS is e.g. dictated by the function return type or some library you don't own, it saves a few characters - I'm just not convinced by that the saving is really worth it in comparison to either effort spent on other features + extra cost of "magic" conversions happening (I know it's type directed, but there's still some magic going on).
I generally like this, I think
Mutable collection types that have a constructor (optionally with a capacity arg), and an 'Add' method to add the elements.
is going to be the hardest to realistically implement though, because I'm not sure there's a consistent pattern for collection types with a capacity parameter argument. I haven't done an API search across the BCL or anything to verify this gut feeling yet though. Error messages for these conversions would need to be spot-on as well to prevent user confusion and potentially allow for IDE code fixes.
@isaacabraham
it saves a few characters
Something worth exploring is seeing what performance benefits (if any) could be added here.
Today your example:
With a little more compiler analysis, I imagine that at least step 2 could be eliminated.
@cartermp performance is something I hadn't even considered. From what I'm reading above though, won't step 2 be eliminated anyway by https://github.com/dotnet/fsharp/pull/11592?
spelling out what might be possible with this feature
let makeWhackyCollection() : ImmutableList<_> =
[ 1; 2
for x in 1 .. 100 do
x+4
if a > b then
for x in 1 .. 100 do
x+5
x+7
3; 4 ]
ImmutableList has both an Add
and an AddRange
member, so based on this compiler analysis we might be able to emit something like:
let ilist = ImmutableList.Empty.ToBuilder() // TODO: some syntax for expected size, or known size when dealing with literals?
// TODOx2: ImmutableList has no public constructor so sized-based optimizations aren't applicable. How should we detect the 'base' instance to use in these cases?
ilist.Add 1
ilist.Add 2
ilist.AddRange [5..104]
ilist.Add 3
ilist.Add 4
ilist.ToImmutable()
in the case where a > b, and perhaps based on heuristics even the individual adds might be fused together. IMO the perf would come from no intermediate collection, appropriate sizing to prevent resizing during creation, and judicious use of AddRange to eliminate per-Add overheads.
Implementation Note: ImmutableList has no public constructor so sized-based optimizations aren't applicable. How should we detect the 'base' instance to use in these cases? For ImmutableList you're supposed to use .Empty
, potentially with the ToBuilder()
method, do we have to resort to probing particular members? Is there a consistent pattern in the BCL/popular third-party collections for this case?
Seems to be similar to https://github.com/fsharp/fslang-suggestions/issues/625
@baronfel that seems potentially neat - I do wonder whether putting something that has such potential benefits behind something as implicit as a type-directed conversion is the right thing to do. I would almost certainly like to see a library-level function that could be called in lieu of this e.g. [ 1 .. 100 ] |> FSharp.Core.fastConvert<ImmutableArray>
or something.
@isaacabraham that would be tough to do at runtime I think, because at runtime you already have the 'constructed' version of the list/seq/array/etc which loses the potential benefit of doing away with the intermediate collection. perhaps if you had an Expr<'t list>
then such a function could perform a fast-rewrite? not 100% sure how that would work.
Just to elaborate on my previous comment of "magic" - one of the strengths of F# is that this sort of stuff is very, very rare in the language. Nearly everything is explicit in terms of conversions.
yield
was an interesting one in that it introduced a few weird corner cases but I think on the whole it solved way more problems than it introduced (especially around Elmish). Thankfully most people we see learning F# don't come across many such corner cases so there's not much to educate on (although having to explain yield, yield! and implicit yield isn't something I enjoy doing on those rare occasions where it's needed).
The new implicit op support in F#6 is another one that I think again will probably help more than it hinders e.g. going from some concrete type to a base class, and it's pretty easy to explain this one, but it'll require some explanation of what op_implicit
is etc. - something that until now has been a completely foreign concept for F# developers.
This feature looks like going a step further - explanation of constructors and builder patterns, and the fact that let x : Set = [ 1 .. 10 ]
isn't the same from a performance point of view as let x = Set [ 1 .. 10 ]
will be challenging.
One main thing here is that whenever a method needs a collection (of more or less any kind), as input, you can always use "[1;2;3]" and not think about it. I'll add that to the notes.
It's possible that it should just be at method argument position for that reason. F# has a long tradition of applying such "magic" (e.g. lambda --> delegate, and now op_Implicit) only at method argument position.
Isaac's concerns are very reasonable.
The conversation on performance is interesting. I do not think we should add this feature the way it's described in the issue just for performance.
If it's profound enough to be visible in performance scenarios, the logical conversation is whether the default should/could be faster, or a special "fast" method supplying more info for optimization is worthwhile.
I'm thinking more about the readability. Is it helpful to reduce the concept count on grouping syntax, particularly with the curly for sequence. Here are some variations:
let xx = seq { "red"; "yellow"; "blue"}
let yy = ["red"; "yellow"; "blue"] |> List.toSeq
let zz = ["red"; "yellow"; "blue"] : seq<string>
let vv: seq<string> = ["red"; "yellow"; "blue"]
I embrace that my eye on this is still relatively naïve, but from that perspective the first stands out as more confusing visually. It doesn't use square brackets (as array does, just with the extra pipe). The presence of "seq" and the lack of field names clue me in that this is not a normal record creation. Since record creation is much more common, and I may be coming from a language that uses curly for code block delimiters, curlies don't jump to mind for a collection-like thing.
From a naïve point of view, the nuances between the last three are interesting.
After thinking it over I've got one concern - in cases where I mostly don't specify types it will default to list which is a bad default (from performance perspective) in most cases, so I would rather specify seq or array directly to be on the safe side (since it's easier to type than a full type). So I would like to see a warning when such defaulting to list occurs, such warning can be opt-in, but should be present
I adjusted the issue description a little to take this comment of mine into account.
My initial thought on this that I will find it harder to know what code represents just by looking at it. I would need to look at type hints to know what collection I have.
Also, by changing the type args of the called function, you will change the type of the passed data, which might be surprising, particularly if it goes from an eager to a lazy collection (unless I misunderstand something!?).
Also, by changing the type args of the called function, you will change the type of the passed data, which might be surprising,
I think this is to be expected
particularly if it goes from an eager to a lazy collection (unless I misunderstand something!?).
I'd imagine [ ... ]
always be eager.
So would that mean a list-comprehension style computation (like makeWhackyCollection above) defined as [ ... ]
and passed to a function that expects a seq
would be evaluated immediately, but if we define it with seq {}
it won't? Of course this is currently the case with the auto List :> IEnumerable cast but from what I understand, the [ ... ]
isn't going to start as a list and be converted, it will be a seq from the outset via the type inference.
This is a very weak and hand wavy, gut feel kind of comment so feel free to ignore, but one of the reasons I fell in love with F# was that it was simple to learn, there tended to be one way, or at least one common way, of doing the elementary operations so one person's F# is generally much like another's. It is also easy to reason about at a glance. Once there are multiple 'styles' of F# and added implicit behaviour it might risk losing some of that charm.
the [ ... ] isn't going to start as a list and be converted, it will be a seq from the outset via the type inference.
The [...]
syntax could only really be an eagerly-evaluated collection. Given the rough guideline of rules above, it could only be used against collection types that define either an eager builder pattern or an eager create+Add
pattern.
Once there are multiple 'styles' of F# and added implicit behaviour it might risk losing some of that charm.
Yes, understood.
The underlying problem here is that we over-emphasise the "list" type in F# - that is, in an API designer wants to get nice, minimal callside syntax, they have no choice but to accept F# lists. However this has problems
list
throughout. but internally converts to arrays.Ideally an API could accept ImmutableArray
(or similar) and store those directly internally and not suffer any overheads in performance or usability.
APIs can also accept IEnumerable<_>
, which is not too bad, though this still encourages people to use [...]
on the callside.
that is, in an API designer wants to get nice, minimal callside syntax, they have no choice but to accept F# lists
Are flexible types (#seq
) useful here? It feels like almost the same result, in that the library can accept anything that implements IEnumerable, and the caller doesn't need to explicitly cast so could pass in any collection they like.
I guess it doesn't help the perf issues in the same way as you would still start with a List if the caller chose that type.
Edit: I think that is the point you were making at the end of your last post, sorry! I guess the question is, whose responsibility is it to choose the collection type they want to use, the caller or the target?
I suppose the issue as just stated is that library designers want to offer a simple API, so feel obliged to accept a range of types including List where they actually want a specific, performant collection.
Still not personally convinced that this is best solved through inference rather than good old functions and types! I would just ask for what I need explictly. If you want an array, ask for an array, not a seq etc.
an API designer wants to get nice, minimal callside syntax, they have no choice but to accept F# lists
If the syntax is really the problem, that could always be improved?
If I had to call a lib that required an array and it was annoying me as I wanted to use lists, I could write a tiny function that adapts it for me and hide it away somewhere, or even shadow the original function. It would be my problem to fix, as I am the one with the issue, not the library author.
Anyway, I'll butt out now, and I appreciate the replies! :)
It would help to evaluate this if we understood list -> ImmutableArray
conversion thoroughly which I agree is important at this point for F#. It would be nice to make this easier and the proposal does that some of the time. But not all of the time, so there is a consistency problem.
Original API:
type T =
static member A(x:int list, y:int list) = ()
Usage:
let x = [ 0 ]
T.A(x, [ 1 ])
With this suggestion, after changing the API to ImmutableArray
s instead of list
s, a user will only have to alter the code defining x
:
let x = [: 0 :]
T.A(x, [ 1 ])
What does this represent in terms of code style?
A(x, [: 1 :])
later?let x = [ 0 ]
when F# allows choosing the meaning of []
?[]
freely and only change it if there are compiler errors (on the let x
line).[ 1 ]
as a shorthand.Those are the only options I can think of, but 1 is not good because it explicitly allows code that isn't in a final state, 2 is unlikely to happen, 3 accepts a moderate degree of user confusion, and 4 requires an unreasonable level of knowledge of details to code in F#.
@charlesroddie There is precedence.
type T =
static member A(x: PrintfFormat<_,_,_,_>, y: PrintfFormat<_,_,_,_>) = ()
T.A("%O", "%O") // This is fine
let x = "%O"
T.A(x, "%O") // This errs
@charlesroddie Yes, it's the core problem I agree
There is precedence.
@Happypig375 This is true technically - and there are other examples (including some others from F# 2 onwards, and some introduced in F# 6). But I think @charlesroddie's point is that this change in particular would greatly increase the frequency of this in beginner-facing code, and in particular concern that it requires "an unreasonable level of knowledge of details" of type inference to understand why things can be dropped in some situations and not others.
The heart of the issues seems to be that the best collection instantiation syntax [ .. ]
was used for list
. This would essentially be a potentially breaking change to now use the [ .. ]
syntax for an abstract CollectionBuilder
type (or whatever name is chosen) that the compiler re-writes to an efficient allocation of potentially any IEnumerable
at compile-time.
This would be great for Fable / Feliz UI syntax and other API's that expect different collection types without needing to pollute the code with calls to constructors and with better performance than allocating a List and also the target type.
I think that perhaps the best experience from a beginner's perspective to minimise surprises (and working with the tooling) is to make the CollectionBuilder
type visible to tooling during editing.
let x = // CollectionBuilder<int>
[1; 2; 3]
The inference changes at the first call-site.
x |> List.map ((+) 1) // Inference makes a list
MyAPI.Call(x) // Inferred based on method signature
let y : int array = // Inferred array at declaration
[1; 2; 3]
If x
is never used in subsequent code (seems unlikely except for prototyping) I'd suggest the compiler emits a warning that CollectionBuilder
will be compiled to a list
. You'd get rid of the warning with a type annotation let x : list<_> = [1; 2; 3]
in this case.
This would be a similar experience to the way numeric types are inferred to be int
without explicit annotations except that an additional warning is given.
@roboz0r I edited your reply to use lower case list
as upper case List<_>
can be confused with System.Collections.Generic.List<_>
This would essentially be a potentially breaking change
Just to mention that we'd only apply the rule if the expression is already known to have a type that can never unify with _ list
.
Is this feature still in discussion or is already approved?
In discussion. There are plenty of reasons not to do this....
A few comments (some of which I have been meaning to leave for a year or more...):
Perhaps this is old news, but I noticed some time ago that RFC FS-1087 enabled creating computation expressions that effectively support a subset of this, albeit only for construction and not pattern-matching.
For example, a type-directed collection
computation expression can be used to construct any mutable collection that has a default constructor and an Add
method:
let xs : ResizeArray<int> = collection { 1; 2; 3 }
type T =
{ A : HashSet<string>
B : SortedSet<int> }
let t =
{ A = collection { "a"; "b"; "c" }
B = collection { 3; 2; 1 } }
While it works, I don't know that I would personally make much use of it as long as the target type must be known. The type-specific resizeArray
, dictionary
, etc., builders are nicer in practice until there is a way to suspend inference of the target type—or specify a default.
(Let me know if you think this is worth moving to its own suggestion, even if only to be rejected on the record.)
Has anyone (in any context) proposed something like TypeScript's generic parameter defaults for F# or C# (or the CLR—it would likely be better if both languages supported the feature)? Edit: All I found was https://github.com/dotnet/csharplang/discussions/278, which doesn't look like it got any traction and hasn't been touched in 6 years, stemming from https://github.com/dotnet/roslyn/issues/6248...
I haven't thought through how feasible it might be to implement, but it seems to me that generic parameter defaults[^1] could be a general solution to the problem of supporting type-directed resolution of things while enabling the avoidance of resolution ambiguities.
It would incidentally also provide a first-class way of representing the F# compiler's existing behavior of defaulting to obj
(as in #696), int
, and seq<'a>
in various otherwise generic (unannotated) scenarios, as well as the so-called "natural type" of expressions as described in the C# collection literals proposal.
That is, with generic parameter defaults, the provisional type of a collection literal defined with [ .. ]
might look something like:[^2]
let (xs : 'a = int list) = [1; 2; 3]
The value xs
could be used anywhere 'a
(when inferred) or int list
could be used, with the following behavior:
'a
by its usage (e.g., when used as a function parameter, assigned to a record field, etc.), 'a
would be constrained to that concrete type.'a
to a concrete type, the specified default type (int list
in this case) would be resolved,[^3] unless the usage site were itself also generic, in which case the original generic type 'a = int list
would propagate until the last usage site, at which point the specified default type would be resolved.This behaves somewhat similarly to the compiler's existing defaulting to int
when arithmetic operators are used in non-inlined functions.
Consider:
let plus x y = x + y
Without further information, the inferred type is currently int -> int -> int
—there is nothing whatsoever in the signature to indicate that it could be used with any other type—but it can in fact be used with any other type with a (+)
operator, e.g.:
let s = plus "a" "b"
With generic parameter defaults, the signature would become something like:
val plus : x:('a = int) -> y:('b = int) -> ('c = int)
Of course, just as already happens with a non-inlined generic function like plus
, if 'a
, 'b
, or 'c
were resolved to a type not compatible with the function body, a compile-time error would be raised.
Consider also a function like this:
let f xs = for x in xs do printfn $"{x}"
whose signature right now is
val f : xs:seq<'a> -> unit
but whose signature with generic parameter defaults would in isolation become:
val f : xs:('a = seq<'b>) -> unit
Here's how the inference would work together:
let f xs = for x in xs do printfn $"{x}"
let xs = [1; 2; 3]
f xs
int list
is driven by the unannotated [ .. ]
literal and its 'a = 'b list
defaultval f : xs:int list -> unit
val xs : int list
val it : unit
But if a concrete type were annotated at any point, it would take precedence:
int array
annotations oust the = _ list
default of the [ .. ]
literallet f (xs : int array) = for x in xs do printfn $"{x}"
let xs = [1; 2; 3]
f xs
or
let f xs = for x in xs do printfn $"{x}"
let xs : int array = [1; 2; 3]
f xs
would both yield
val f : xs:int array -> unit
val xs : int array
val it : unit
I haven't really paid attention to how the C# collection literal proposal (https://github.com/dotnet/csharplang/issues/5354) has been going, but it may be worth considering which potential solutions might apply to both languages—for example, perhaps generic parameter defaults could work for both languages as a first-class representation of the "natural type" idea described in the C# proposal (as well as for the scenarios outlined above in F#, and likely others).
[^1]: I at first also considered whether it would make sense to also require specifying some kind of constraint on the generic type parameter (like 'a = int when 'a : (static member (+) : ...)
or #seq<int> = int list
or 'a = int list when 'a :> seq<int>
), but for collection literals it would not work for Span<_>
(which does not implement IEnumerable<_>
), and it rather implies that the type after the =
must derive in some way from the type before it, which relationship would nevertheless need to be specified, and which would in some cases require an inline
context.
[^2]: The parentheses in an annotation like (xs : #seq<int> = int list)
would be required, because this is already valid code that compiles:
let list = [1; 2; 3]
let int = id
let (=) x y = x @ y
let xs : #seq<int> = int list = [1; 2; 3]
[^3]: Something like this actually already happens, albeit with a warning—in the following, xs
is constrained to be int list
.
let xs : #seq<int> = Unchecked.defaultof<_>
let len = List.length xs
@dsyme Thoughts on above comment?
@dsyme ?
@Happypig375 I have read it - I'm thinking it over. There is a defaulting mechanism used for constructs in FSharp.Core, for example.
https://github.com/dotnet/csharplang/blob/main/proposals/csharp-12.0/collection-expressions.md We should definitely consider the equivalent C# feature here.
I propose we consider doing type-directed resolution of the
[...]
syntax if used with a non-list known type.The main thing here is that whenever a method needs a collection (of more or less any kind), as input, you can always use, for example,
[1;2;3]
and not think about it.This means F# APIs can reasonable choose to adopt ImmutableArray or other such collection types for inputs without particularly changing user code.
If this is considered reasonable, the question arises about when this is allowed. For example, it's possible that it should just be at method argument position - F# has a tradition of only applying such "magic" (e.g. lambda --> delegate,
ParamArray
and more recentlyop_Implicit
) only at method applications. However in theory it could also be at any position by type-directed resolution, e.g.For some reason people don't seem to mind these transformations at method calls.
From a previous suggestion:
The syntax would work with
This is somewhat related to the ListCollector type in F# 6 which is use as our builder for F# immutable lists and which gave the performance results described here. The only subtlety there is the presence of AddManyAndClose that allows lists to be collected in tail position without copying the list.
The underlying problem here is that we over-emphasise the "list" type in F# - that is, in an API designer wants to get nice, minimal callside syntax, they have no choice but to accept F# lists. However this has problems
list
throughout. but internally converts to arrays.Pros and Cons
The advantages of making this adjustment to F# are:
The disadvantages of making this adjustment to F# are:
[| 1;2 |]
and via known type([ 1; 2 ]: int array)
Extra information
Estimated cost (XS, S, M, L, XL, XXL): M
Related suggestions: (put links to related suggestions here)
Affidavit (please submit!)
Please tick this by placing a cross in the box:
Please tick all that apply:
For Readers
If you would like to see this issue implemented, please click the :+1: emoji on this issue. These counts are used to generally order the suggestions by engagement.