fsharp / fslang-suggestions

The place to make suggestions, discuss and vote on F# language and core library features

[Discussion] More consistent and more simple syntax #523

Closed 21c-HK closed 7 years ago

21c-HK commented 7 years ago

Inspired by the recent syntax discussion in the thread How to drive F# Adoption - Part 4, I created this thread that is dedicated to discussions on making the F# syntax more consistent and more simple. I had difficulty naming this thread, because I did not want to simply name it "Improve syntax" since this thread should not be about (more or less arbitrary) syntax changes like "Can't we replace this keyword/operator/bracket with that keyword/operator/bracket?". Instead, this thread should be geared towards (more or less objective) consistency and simplification of the syntax like "There is the following unnecessary syntactic inconsistency, which might be confusing to F# users and/or complicate the F# compiler".

The suggestion to make the F# syntax more consistent and more simple is in line with a few desirable factors of suggestions listed in Notes on the Design Process of fslang-suggestions:

  • education/learning paths and simplicity
  • does this give multiple ways to achieve the same thing
  • design coherence
  • "less is more" design considerations

I give a few examples at the bottom to make clear what I mean. I (or you) might post these examples as individual issues on fslang-suggestions if other people agree that these would be worthwhile changes. I am aware that "we simplified the syntax and compiler" is not as sexy or impressive as "wow, look at this new feature", but it would be a worthwhile long-term investment that would help F# adoption since it would reduce the learning costs of every F# user.

Note that I chose to post here because I thought that it might be a good idea to gather a few inconsistencies first, to see if we can find common themes like verbose vs. lightweight below, before posting individual suggestions.

I would like to suggest that if there are two syntactic ways to do exactly the same thing (see below for examples), then we should decide on one of them to be the only way and officially deprecate the other. Ideally, a future version of F# would no longer support the deprecated syntax and all references to it would be removed from the official documentation. I imagine that this would simplify the compiler considerably and improve the learning experience of F#.

Examples of inconsistent syntax

  1. The option between verbose syntax and lightweight syntax is a clear example of two ways of expressing exactly the same thing that might be extremely confusing to a beginner, especially when reading F# code on various blog posts or projects where you encounter both lightweight syntax and verbose syntax (e.g. older projects or blog posts). Tooltips and F# interactive in Visual Studio 2015 and earlier always display verbose syntax. This essentially forces the reader to know both syntaxes! Then there are also a few cases where there is no equivalent lightweight syntax. While there is lightweight syntax for defining interfaces with at least one member, verbose syntax must be used to define an empty interface (e.g. marker interface) or an empty class (e.g. non-constructable type) because there is no lightweight equivalent. On the other hand, there is only (lightweight) syntax for defining abstract classes, which requires an attribute (AbstractClassAttribute), but there is no attribute for interfaces or classes that could be used to define an empty interface or empty class.

So syntax consistency could be improved by deprecating verbose syntax and adding equivalent consistent lightweight syntax where required via attributes for interfaces and classes similar to the existing attributes for struct and abstract class.

  2. There are two ways of expressing type parameters. E.g. 'a option (OCaml style) vs. Option<'a> (.NET style).

  3. Structs are declared and used differently from records. E.g. a struct declaration requires val for fields and its constructor can only be defined with the new keyword; structs cannot be pattern matched like other structural types, and they have a default constructor that is accessible from F#. This issue will fortunately be fixed with the upcoming "struct records", so I consider this a feature that leads to more consistent and more simple syntax.

  4. Too many type aliases. E.g. List = list, Option = option, float = double, float32 = single seem interchangeable, but they are not. For example, units of measure do not work on these type aliases (i.e. need to use float32 instead of single). I also once had an issue where List was somehow different from list, but I cannot recall the details. Since the type aliases are probably going to stay, we need to ensure that types and type aliases are truly interchangeable.

  5. I find it really strange that you can use ^ and ' for statically resolved type parameters in explicit member constraints interchangeably most of the time (but not always), because the F# compiler will automatically infer/translate ' to be ^. It gets confusing when the F# compiler requires an additional space before the first type parameter when you use ^, but not when you use '. Here is a code example:

// does not compile because of missing space after <
type GenericClass<^type_parameter when ... >(value : ^type_parameter) ...
// compiles because of additional space after <
type GenericClass< ^type_parameter when ... >(value : ^type_parameter) ...
// interprets ' as ^ and compiles despite missing space after <
type GenericClass<'type_parameter when ... >(value : ^type_parameter) ...

  6. Inconsistent allowance of with-keyword depending on indentation. I don't recall the details because I have gotten used to a certain style. I will update this issue with a concrete example when I encounter this again.

  7. It does not matter whether statically resolved type parameters of explicit member constraints are tupled or curried at declaration, but the compiler only resolves the member when it is tupled in the member implementation. I will update this issue with a concrete example another time.

These were just a few examples from the top of my head. All these examples boil down to one too many syntactic ways to do exactly the same thing.

dsyme commented 7 years ago

Thanks for the constructive suggestion list.

To engage with some of these points:

The option between verbose syntax and lightweight syntax is a clear example of two ways of expressing exactly the same thing

Essentially no one uses the verbose syntax, or only extremely rarely statistically speaking, except for single-line expressions like let f() = let x = 2 in x + x. I never see #light "off" in learning material. Please find or add an issue to deprecate it or make it only available with a more explicit compatibility switch (there may be an issue already), and people can discuss.
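For readers unfamiliar with the distinction, here is a minimal sketch of the single-line case mentioned above next to its indentation-based equivalent. The function names `fLight` and `fVerbose` are made up for illustration; both definitions should behave identically.

```fsharp
// Lightweight syntax: indentation delimits the body, so no `in` is needed.
let fLight () =
    let x = 2
    x + x

// Explicit `in` on a single line, the one place it is still commonly seen.
let fVerbose () = let x = 2 in x + x

printfn "%d %d" (fLight ()) (fVerbose ())
```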

Tooltips and F# interactive in Visual Studio 2015 and earlier always display verbose syntax.

This is a legacy bug and should be fixed - lightweight syntax should be used. Please find or add an issue on http://github.com/Microsoft/visualfsharp, or even better just submit a PR to address this.

There are two ways of expressing type parameters. E.g. 'a option (OCaml style) vs. Option<'a> (.NET style).

The first is generally only used for list, option and array (int[]) types, and can't be used for multi-parameter types. This means it doesn't really get used much; some people use it for a few additional single-parameter types, but that's not stylistically normative.

Please find or add a suggestion issue to find ways to continue to make the second more normative, and people can discuss.
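As a rough illustration of the two spellings for the common single-parameter cases, the sketch below annotates the same values both ways. The value names are placeholders; the point is only that both spellings denote one type.

```fsharp
// Postfix (ML-style) and prefix (.NET-style) spellings of the same types.
let a : int list = [1; 2; 3]
let b : list<int> = a            // same type, so this binding type-checks

let c : string option = Some "x"
let d : Option<string> = c       // likewise

// typeof confirms that the two spellings denote one runtime type.
let sameList = typeof<int list> = typeof<list<int>>
let sameOption = typeof<string option> = typeof<Option<string>>

printfn "%b %b" sameList sameOption
```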

but there is no attribute for interfaces or classes, which could be used to define an empty interface or empty class

This is true, though defining empty classes or interfaces is exceedingly rare in real code. Please add a suggestion to allow something like

[<Class>]
type C 

[<Interface>]
type I

or

[<Class>]
type C() = begin end
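For comparison with the proposed attribute forms above, here is a small sketch of the explicit `interface end` / `class end` forms that the original post says are currently required. The type names `IMarker`, `Uninstantiable` and `Tagged` are hypothetical.

```fsharp
// Empty (marker) interface: written with the explicit `interface end` form.
type IMarker = interface end

// Empty class with no constructor: written with `class end`.
type Uninstantiable = class end

// A class can opt in to the marker interface; there are no members to implement.
type Tagged() =
    interface IMarker

// Runtime check that the marker is attached.
let isMarked = (box (Tagged())) :? IMarker

printfn "%b" isMarked
```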

For example, units of measure do not work on these type aliases (i.e. need to use float32 instead of single).

Increasingly single and double are normative in F# code.

This issue should be fixed - it is the only place I know of where float32 and single are not interchangeable - if there are others we should dig them out and fix them. AFAIK it can be fixed fairly easily through the addition of a type alias.


open FSharp.Data.UnitSystems.SI.UnitSymbols

type single<[<Measure>] 'U> = float32<'U>
type double<[<Measure>] 'U> = float<'U>

type x1 = single<kg>
type x2 = single
type y1 = double<kg>
type y2 = double

let z1 : y1 = 10.0<kg>
let z2 : x1 = 10.0f<kg>

I also once had an issue where List was somehow different from list, but I cannot recall the details. Since the type aliases are probably going to stay, we need to ensure that types and type alias are truly interchangeable.

This is almost certainly because you had open System.Collections.Generic which defines the .NET List type (a mutable array list). It's a problem where two views of the world collided (immutable functional programming and mutable OO collections) both claiming priority. Better error messages are probably the way forward here.
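The collision described here can be reproduced with a short sketch. The module names `WithoutOpen` and `WithOpen` are made up purely to keep the two scopes apart in one file.

```fsharp
module WithoutOpen =
    // Here `List<int>` is the immutable F# list (the same type as `int list`).
    let xs : List<int> = [1; 2; 3]

module WithOpen =
    open System.Collections.Generic
    // After the open, `List<int>` resolves to the mutable .NET list instead.
    let ys : List<int> = List<int>([1; 2; 3])
    // The lowercase spelling still means the F# list.
    let zs : list<int> = [1; 2; 3]

// The two annotations look identical but name different runtime types.
printfn "%s" (WithoutOpen.xs.GetType().FullName)
printfn "%s" (WithOpen.ys.GetType().FullName)
```

Using `ResizeArray<int>` for the .NET type avoids the ambiguity regardless of which namespaces are open.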

It does not matter whether statically resolved type parameters of explicit member constraints are tupled or curried at declaration

Yes, IIRC there's a call to flatten the argument lists somewhere, which should ideally be removed and replaced by a warning. Could be tricky to implement without breaking code however.

lambdakris commented 7 years ago

As someone who considers himself a beginner I think I might be able to provide some perspective on some of these suggestions.

On no. 1, I actually did not stumble too much as a result of this. I thought I would and I understand the rationale of why it would feel like an arbitrary choice, but I guess enough of the materials I used while learning F# stuck to the lightweight syntax that I emulated that quite naturally and when it showed up in intellisense it did not feel that weird. I think part of the reason is that the verbose and the lightweight syntax do feel like they are stylistically consistent with each other, meaning that most of the time the lightweight syntax felt like a natural abbreviation of the verbose syntax as opposed to a situation where you have two completely different styles of syntax to express the same concept.

On no. 2, when first getting into F#, I thought that the 'a option style was the more "functional" style and also read better since as a reader, I felt like I could just read a string option whereas Option<string> was something that I had to parse out in my head to read "an option of/with the type param string". However, I do recognize how the <T,...> style is necessary for multi-parameter types, and I did just fine with that syntax for years in C# so there is no legit cause for complaint. I guess I would just put forward my support for consolidating to <T,...> so that such a trivial thing as how to express parametricity does not become a debate in the learners head and does not get proliferated in other code bases further distracting the learner.

On no. 3, I do feel a bit of friction when working with structs, but I admit that personally, I use them so rarely that most of the time I feel fine dismissing that friction. However, I find attributes to be cumbersome to write and read, which precisely intrudes on the flow that F# achieves over C#, which is that it is cheap to express things and structurally suggestive to perceive them. So I admit that I would prefer not to see them become the go to technique to specialize type declarations.

On no. 4, this is also a "pebble in the shoe" type thing. It does not in any way prevent me from being successful with F#, but it is distracting and continues to be distracting even after repeated exposure. It is not the number of type aliases, rather, I feel a bit stranded when the alias abstraction seems to break down. For example, in general, I get disoriented when I give something a name in one context and it is displayed as something else in some other context (this is similar to my qualms with tuples and the sharing of [] among list literals and array signatures). In those cases where an alias is not acceptable as a param, I almost always assume it is a mismatched type and go down that road of inquiry instead of remembering the particular constraints or nuances of aliases. Fortunately, this does not strike me as an actual change in syntax, rather just some tightening up of the aliasing feature.

So in short, I support turning 2 and 4 into proper lang suggestions since from my own experience, I think they would yield immediate benefits. I can see the point of 1 and 3 but from my own experience they would not have as immediate an impact as 2 and 4. As to 5, 6, and 7, they are somewhat outside my comfort zone so I will refrain from commenting on them.

piaste commented 7 years ago

The first is generally only used for list, option and array (int[]) types, and can't be used for multi-parameter types.

Am I missing something? I've used the ML-style generics with multiple type params just fine. (Like lambdakris, I also find them more readable than the .NET style, though it's not a huge difference).

The following two styles emit identical IL (proof: https://www.diffchecker.com/m0pVd071)

type ('a, 'b) generic1 when 'a : equality = {
    Foo : 'a
    Bar : 'b
}

let x1 : (string, int) generic1 = { Foo = "a"; Bar = 1 }

type Generic1<'a, 'b when 'a : equality> = {
    Foo : 'a
    Bar : 'b
}

let x2 : Generic1<string, int> = { Foo = "a"; Bar = 1 }

abelbraaksma commented 7 years ago

@piaste, I think you are referring to type declarations, whereas @dsyme was referring to parameter declarations: let foo (a: list option) = .... That's what he meant with "and can't be used for multi-parameter types".

On no. 4, this is also a "pebble in the shoe" type thing. It does not in any way prevent me from being successful with F#, but it is distracting and continues to be distracting even after repeated exposure.

@lambdakris, I wholeheartedly agree. And I don't quite understand what @dsyme means when he says that they are fully interchangeable. If they were, then this wouldn't be different:

image

image

I know, the first is the type alias and the second is the module (at least I think so). But even after playing around with F# on and off for about 5 years now, I still wouldn't readily know the answer for such differences. Sometimes it is because the lower-case variant is an alias to the type in the BCL (double, int), sometimes it is not an alias but its own type. To make it even more confusing: int is both a type and a function.

But even as a type, int and Int32 are NOT the same as what I would consider a true type alias. If it was, why is this different?

image

image

Even the tooltip help says it is "an abbreviation of System.Int32". But you cannot use int where you can use Int32.

image

If I understand @dsyme correctly he'd like to erase any differences that still persist. But I think these differences come from a different reason: int the type and int the function.

The difference is even more striking with Double, which has more static methods, and the surprise effect for people moving from C# to F# will therefore be bigger. Once you learn that you should use float or double instead of Double, you will have to unlearn that the minute you want to use the static methods, properties and fields. I.e., how can you explain to a beginner, or even an advanced programmer, why let d: double = 1.0 and let d: Double = 1.0 are equal (and double is an alias for Double), but let d = double.MaxValue is invalid while let d = Double.MaxValue is valid?

To add to the list of curiosities: if you create your own alias, this problem does not arise. You will get all the methods the original type had.
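A minimal sketch of that last observation, assuming a user-defined abbreviation named `MyDouble` (a hypothetical name): static members are reachable through it, just as through the full .NET type name.

```fsharp
// A user-defined abbreviation for System.Double (the name is hypothetical).
type MyDouble = System.Double

let viaFullName = System.Double.MaxValue
let viaOwnAlias = MyDouble.MaxValue     // static member reached through the alias

// In type position, the built-in `double` and System.Double are interchangeable.
let d1 : double = 1.0
let d2 : System.Double = d1

printfn "%b" (viaFullName = viaOwnAlias)
```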

piaste commented 7 years ago

@piaste, I think you are referring to type declarations, whereas @dsyme was referring to parameter declarations: let foo (a: list option) = .... That's what he meant with "and can't be used for multi-parameter types".

@abelbraaksma, that doesn't seem to be the case either. let foo (a : string list option) = Some [""] works, and so does let foo (a : _ list option) = None if you want to leave the type parameter unspecified.

I thought @dsyme might be referring to multiple levels of parameters, but that works fine too, e.g.:

let x : ((string, int) generic1, (int, float) generic1) generic1 = 
  { Foo = { Foo = "foo"; Bar = 1 }
    Bar = { Foo = 11; Bar = 1.0 }  }

On the subject of primitive types:

To make it even more confusing: int is both a type and a function. But even as a type, int and Int32 are NOT the same as what I would consider a true type alias. If it was, why is this different?

As you pretty much guessed in the previous sentence, the issue arises from int being a function as well as a type. If you just type int., Intellisense treats it as a function and gets the function object's members (.GetType and .ToString) rather than the type's.

If you prefix it with the full namespace you get the type alias's members:

screenshot from 2016-12-24 12-20-27
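The dual role of `int` can be seen in a few lines. This is only a sketch of the behaviour discussed here; the value names are illustrative.

```fsharp
// `int` in type position: the abbreviation for System.Int32.
let n : int = 42
let m : System.Int32 = n

// `int` in expression position: the conversion function.
let parsed = int "42"
let truncated = int 3.9

// Static members are reached through the .NET type name.
let viaParse = System.Int32.Parse "42"

printfn "%d %d %d %d" m parsed truncated viaParse
```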

abelbraaksma commented 7 years ago

@piaste: I agree with your assessment of int being both a type and a function (and a function is itself a type). But the subject of this whole discussion is "more consistent and more simple syntax", esp. w.r.t. new users.

Since I don't think we can expect new users to use the full namespace (I don't think we can expect it from advanced users either) and since it is hardly ever required to access a function as a Type from F#, it seems reasonable to fix this at the level of the user-interface, more specifically, by changing the priority for finding an object's or type's members.

Perhaps a PR could simply fix this by putting the definition of int (the function) before int (the type alias) in load order, so that the type alias shadows the former? Or would that wreak havoc on existing code bases (I think not: they will use Int32.Parse out of frustration, and I doubt you will ever see int.GetType())?

(afterthought/edit: this may not work, existing code could have let f = int, which would then be changed to let f = Int32, which is not the same, surprise, surprise...)

Tarmil commented 7 years ago

Regarding ML-style vs C#-style syntax for type parameters, I am also in support of making C#-style the standard across the board. One thing that would help in this regard is if the compiler always gave messages and tooltips in this syntax, instead of using 'T list as shown in @abelbraaksma's screenshots.

rmunn commented 7 years ago

It looks like I'll be the first to speak out in favor of the ML-style syntax ('T option instead of Option<'T>) remaining the default for options (and, possibly, for lists, but see below) — because it's actually closer to C# syntax and would, IMHO, be easier to learn. Don't believe me? Consider this: in the C# code you've seen, when someone wants to make a value type nullable, which of these two syntaxes do they write?

public class Item {
    public DateTime? Created { get; set; }
    public Guid? Id { get; set; }
    public string Data { get; set; }
}

Or:

public class Item {
    public Nullable<DateTime> Created { get; set; }
    public Nullable<Guid> Id { get; set; }
    public string Data { get; set; }
}

In my own experience, the code bases I've worked with have a lot more of the DateTime? syntax than the Nullable<DateTime> syntax. In fact, I just typed Nullable<DateTime> foo = null; into the C# code I have open right now in VS Code, then hovered over the variable name. And the Omnisharp Intellisense popup that I'm looking at says DateTime? foo.

So for options, which translate directly to Nullable<T> in a C# developer's mind (until he learns about the subtle differences between the two), the T option syntax will be more familiar to a C# developer, IMHO, than the Option<T> syntax.

As for lists, there's a different argument. First, there's the similarity to arrays: a C# developer knows that arrays are declared as int[] foo, so when he types let arr = [|1; 2; 3|] ;; into F# Interactive and sees val arr : int [] = [|1; 2; 3|] come back, the int [] syntax will feel familiar. As it should. Then when he types let lst = [1; 2; 3] ;; and sees val lst : int list = [1; 2; 3], the similarity to arrays will be rather clear.

Of course, val lst: list<int> = [1; 2; 3] would also be extremely familiar to our hypothetical C# coder, because he's used System.Collections.Generic.List before. But now he's going to trip over the fact that System.Collections.Generic.List is now called ResizeArray, and that the list<int> type is totally different from the List<int> type that he used to know in C#. Here, the similarity of names will (again, IMHO) be confusing. Whereas if he sees int list vs. ResizeArray<int>, those two will look more differentiated to him. And if he is then told "The C# type called List has been renamed to ResizeArray in F# to avoid confusion with its built-in list type", he will have the frame of reference to understand that.
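To make the three collection types in this argument concrete, here is a brief sketch. The value names follow the ones used above, plus `res`, which is an illustrative addition.

```fsharp
let arr = [| 1; 2; 3 |]          // int[] (array)
let lst = [ 1; 2; 3 ]            // int list (immutable F# list)
let res = ResizeArray<int>(lst)  // the .NET List<T> under its F# name

res.Add 4                        // ResizeArray is mutable; `lst` is unaffected

// ResizeArray is an abbreviation, so both names denote the same runtime type.
let sameType =
    typeof<ResizeArray<int>> = typeof<System.Collections.Generic.List<int>>

printfn "%d %d %d %b" arr.Length (List.length lst) res.Count sameType
```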

So I'm personally in favor of keeping the ML-style syntax for at least 'T option and 'T list, and displaying those types in the ML-style syntax by default in the F# Interactive REPL, the Intellisense plugins in various IDEs, and so on. With, perhaps, an option in the compiler service (that plugins like Ionide could then expose in configuration) to switch to the C#-style syntax for people who really do prefer Nullable<T> instead of T?, and who will therefore feel more familiar with Option<'T> and list<'T>. But for these two, at least, I believe there's a good case to be made that the ML-style syntax will be easier for C# coders to recognize.

dsyme commented 7 years ago

I plan on closing this old discussion - I would be grateful if people could add specific suggestions where necessary

It's ok (and encouraged) to open discussion threads on particular general topics, but we will close them after they have been dormant for a while

Thanks

texastoland commented 3 years ago

I'm also new with experience from other MLs. A few of the above are still head scratchers for me and a couple more not mentioned yet.

Increasingly single and double are normative in F# code.

I didn't even realize single and double existed reading through the Language Reference (for example).

Suggestion: Search and replace usages in docs.

The first is generally only used for list, option and array (int[]) types, and can't be used for multi-parameter types.

I read the same thing in the Style Guide but it's perplexing. Why treat 't type (ML style) vs Type<'T> differently than verbose (ML style) vs lightweight syntax? It was probably the first thing I googled coming from OCaml to understand any differences.

Suggestion: Deprecate ML style convention from remaining types.

(this is similar to my qualms with tuples and the sharing of [] among list literals and array signatures).

It was barely mentioned but how did Type[] end up being an array signature while [value] is a list literal? I guess you get used to it but confusing early on.

Suggestion: Make new Type[||] sugar for lists ... kidding. It's missing from the Language Reference though.

Inconsistent allowance of with-keyword depending on indentation.

I stumbled on an example:

// with
type List<'T> = Nil | Cons of 'T * List<'T> * int with
  member this.Length =
    match this with
    | Nil -> 0
    | Cons(_, _, length) -> length

// without
type List'<'T> =
  | Nil
  | Cons of 'T * List'<'T> * int
  member this.Length =
    match this with
    | Nil -> 0
    | Cons(_, _, length) -> length

Suggestion: Perhaps this is a bug?

These were just a few examples from the top of my head.

I found this issue trying to understand the syntax for members:

                  let   privateField             = expr
                  let   privateMethod       arg  = expr
                  val   uninitializedField
abstract                VirtualProperty                 with get, set
abstract member         VirtualMethod      (arg)
         member   val   ImplicitProperty         = expr with get, set
         member       _.ReadOnlyProperty         = expr
         member       _.ExplicitProperty                with get      () = expr
                                                        and       set v  = expr
         member       _.SealedMethod       (arg) = expr

Intuition:

Suggestion: If it were possible to collapse member val to member and abstract member to abstract (maintaining backwards-compatibility) my columns would collapse into a single keyword per concept and be useful in documentation:

let        privateField             = expr
let        privateMethod       arg  = expr
val        uninitializedField
abstract   VirtualProperty                 with get, set
abstract   VirtualMethod      (arg)
member     ImplicitProperty         = expr with get, set
member   _.ReadOnlyProperty         = expr
member   _.ExplicitProperty                with get      () = expr
                                           and       set v  = expr
member   _.SealedMethod       (arg) = expr

@dsyme Are any of these worth a new issue?

Better error messages are probably the way forward here.

I agree that would make the single biggest impact. I've already struggled with errors porting code from OCaml. Elm's messages are often cited but I really liked ReScript's too.