fsharp / fslang-suggestions

The place to make suggestions, discuss and vote on F# language and core library features

Possible plain-text formatting improvements #897

Open dsyme opened 4 years ago

dsyme commented 4 years ago

I'd like to start a thread documenting possible improvements to plain-text formatting in F#. This is driven partly by my taking a deep look at getting quality plain-text printing for F# in .NET Interactive Notebooks.

The current state of plain-text formatting in F# is documented in these articles:

The possible adjustments I've spotted are:

  1. Be more consistent about using globalization settings (default to InvariantCulture)

    In all formatting (both %A and F# Interactive), be more consistent about globalization settings for things of type IFormattable. In particular, arbitrary .NET values currently format using CurrentCulture settings, whereas numeric values do not. This is inconsistent.

  2. Find a way to allow printf formatting to use user-defined culture settings. This is partly possible with interpolated strings using FormattableString, but that can't use %d, %A etc. patterns.

  3. Respect DebuggerBrowsableAttribute DebuggerBrowsableState.Never in F# Interactive printing to automatically suppress the printing of some properties. .NET Interactive does this

  4. Allow multi-line %A printing to respect indentation. So for example sprintf "fooo %*A goo" v would put the format of v indented 5 spaces if it is multi-line. Currently multi-line %A printing doesn't respect indentation (see here), and unfortunately this makes outputs look much messier.

    Further, some limited "break if necessary" specifications may be allowed, so something like sprintf "tensor [%40*4A]" v might allow a line break to be inserted if the value hits 40 characters width. These are really useful, though not entirely suited to specification in printf strings.

    Alternatively we could expose a variation of the box layout code available for us internally in FSharp.Core as an API independent of sprintf.

  5. Implement ToString() on F# function and other closure values so they do more than print the obscure closure name - at least add some indication the thing is a function

  6. Allow %f, %A and friends to optionally take an additional specifier to force CurrentCulture, e.g. %$d, %$A (or some other character). There could also be a corresponding option to force InvariantCulture, e.g. %!d, %!A

    // %$f ---> Use the culture from context (UI or FormattableString formatting operation)
    // %!f ---> Use invariant culture 
    // %f == %!f
    
    // %$A ---> For IFormattable, use the culture from context (UI or FormattableString formatting operation)
    // %!A ---> For IFormattable, use invariant culture 
    // %A  != %!A
    // %A  != %$A
    
    // %$O ---> For IFormattable, use the culture from context (UI or FormattableString), otherwise ToString()
    // %!O ---> For IFormattable, use invariant culture, otherwise ToString()
    // %O == %$O

  7. Allow %A (or some API access point to the same functionality) to take a (varying) line width as a parameter, e.g. %*A, or have %A respect Console.WindowWidth by default

  8. Allow %A (or some API access point to the same functionality) to take a specifier that disables PrintLength for strings and collections.

  9. Allow %A (or some API access point to the same functionality) to take a specifier that gives colorized output

  10. Give a warning when a mix of culture-aware and culture-invariant format specifiers are used in a single interpolated string.
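
As a rough illustration of point 4, the indentation behaviour can be approximated today by post-processing the %A output. This is only a sketch: `indentContinuation` is a hypothetical helper, not part of FSharp.Core, and the indent has to be passed by hand rather than inferred from the format string.

```fsharp
/// Hypothetical helper: pads every line after the first with `indent`
/// spaces, so a multi-line value lines up under its first line.
let indentContinuation (indent: int) (text: string) =
    let pad = String.replicate indent " "
    // Works for "\n" and "\r\n" line endings, since both end in '\n'.
    text.Replace("\n", "\n" + pad)

/// Formats `v` with %A and embeds it after the 5-character prefix "fooo ".
let formatIndented v =
    sprintf "fooo %s goo" (indentContinuation 5 (sprintf "%A" v))
```

A proper `%*A` specifier would remove the need to count the prefix width manually, which is the fragile part of this workaround.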

One overall aim may be to allow for the satisfactory implementation of a generic print using

/// Generic unstructured culture-invariant single-line string
let string x = ...  

/// Generic structured culture-invariant multi-line colorized
/// output fitting to console width
let print x = ...  

/// Generic structured culture-invariant multi-line colorized
/// output fitting to console width with new line
let printn x = ...           

Plus this:

module CurrentCulture =
    /// Generic unstructured culture-aware single-line string
    let string x = ...  

    /// Generic structured culture-aware multi-line colorized
    /// output fitting to console width
    let print x = ...  

    /// Generic structured culture-aware multi-line colorized
    /// output fitting to console width with new line
    let printn x = ...

This would raise the question of whether a CurrentCulture module should also have functions like int and double for parsing strings using the current culture (the default int and double parse using invariant culture).
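
To make the question concrete, here is a minimal sketch of what part of such a module might look like. Everything in it is an assumption for illustration: the implementations are guesses, and nothing like this exists in FSharp.Core today.

```fsharp
open System
open System.Globalization

module CurrentCulture =
    /// Culture-aware counterpart of the built-in `string`:
    /// IFormattable values are formatted with CultureInfo.CurrentCulture.
    let string (x: 'T) =
        match box x with
        | null -> ""
        | :? IFormattable as f -> f.ToString(null, CultureInfo.CurrentCulture)
        | other -> other.ToString()

    /// Culture-aware counterpart of the built-in `double` for string input.
    let double (s: string) = Double.Parse(s, CultureInfo.CurrentCulture)

    /// Culture-aware counterpart of the built-in `int` for string input.
    let int (s: string) = Int32.Parse(s, CultureInfo.CurrentCulture)
```

Under a culture that uses a decimal comma, `CurrentCulture.double "12,3"` would then be expected to parse as 12.3, whereas the built-in `double "12,3"` would not.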

cartermp commented 4 years ago

Implement ToString() on F# function and other closure values so they do more than print the obscure closure name - at least add some indication the thing is a function

This one would be good. Currently it's not so bad:

> let add2 x y = x + y;;
val add2 : x:int -> y:int -> int

> let add1 = add2 1;;
val add1 : (int -> int)

> add1 12;;
val it : int = 13

But when you're evaluating the function itself it gets weird:

> let square x = x * x;;
val square : x:int -> int

> square;;
val it : (int -> int) = <fun:it@2>

> let add2 x y = x * y;;
val add2 : x:int -> y:int -> int

> add2 1;;
val it : (int -> int) = <fun:it@4-1>

> type R = { SquareF: int -> int; AddF: int -> int -> int };;
type R =
  { SquareF: int -> int
    AddF: int -> int -> int }

> let r = { SquareF = square; AddF = add2 };;
val r : R = { SquareF = <fun:r@16>
              AddF = <fun:r@16-1> }

Would you propose writing the body of the function to the output?

All other proposals feel fine. Though it is worth pointing out that, despite %A and friends only really being intended for structured display, people use them to generate data in running apps. So any change will be a breaking change.
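
As one possible middle ground short of printing the body, a sketch that only reports the function's domain and range types. `describeFunction` is a hypothetical helper written for this comment; it assumes a non-null argument and uses .NET type names rather than F# type syntax.

```fsharp
open System

/// Hypothetical helper: walks up the base types of a closure until it
/// reaches FSharpFunc<'T,'U>, then reports the domain and range types.
/// Non-function values fall back to the built-in `string`.
let describeFunction (value: obj) =
    let rec findFuncType (t: Type) =
        if isNull t then None
        elif t.IsGenericType
             && t.GetGenericTypeDefinition() = typedefof<FSharpFunc<_, _>> then Some t
        else findFuncType t.BaseType
    match findFuncType (value.GetType()) with
    | Some funcType ->
        let args = funcType.GetGenericArguments()
        sprintf "<fun : %s -> %s>" args.[0].Name args.[1].Name
    | None -> string value
```

Curried functions would need extra work, since the range of `int -> int -> int` is itself an FSharpFunc and its .NET name is not very readable.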

abelbraaksma commented 4 years ago

In particular, arbitrary .NET values currently format using CurrentCulture settings, where numeric values do not. This is inconsistent

It's true that it's inconsistent, including for numeric values. Interpolated strings are currently defined in terms of %O evaluation, which uses CurrentCulture, while %A uses InvariantCulture:

> string 12.3;;
val it : string = "12.3"   // invariant culture

> sprintf "%A" 12.3;;
val it : string = "12.3"   // invariant culture

> sprintf "%O" 12.3;;
val it : string = "12,3"   // current culture

> @$"{12.3}";;
val it : string = "12,3"   // current culture

dsyme commented 2 years ago

Some information from https://github.com/dotnet/fsharp/pull/13597#issuecomment-1216742406, mostly reconfirming what's written above.

The design intent of FSharp.Core functionality has always been "use invariant culture unless explicitly specified otherwise".

For visual outputs from F# Interactive, the user can specify fsi.FormatProvider but the default is InvariantCulture. An argument could be made that the default should be localized, but it isn't and I don't think we should change that now.

I'd encourage people to review https://docs.microsoft.com/en-us/dotnet/fsharp/language-reference/plaintext-formatting and contribute to it. There should really be a specific separate section on locales - there are mentions in the doc but a separate section should cover the above.

Happypig375 commented 2 years ago

and why didn't we use string x instead of x.ToString() for interpolation?

dsyme commented 2 years ago

and why didn't we use string x instead of x.ToString() for interpolation?

It's a good question.

String interpolation emerged as a combination of C#-style string interpolation and F# printf. Two of the design goals were:

This unfortunately led to a contradiction, since the first goal meant "culture-aware" formatting for holes not using %, and the second goal meant "culture-invariant" formatting for holes using %. It's one of many examples where two different worlds with very different assumptions collide in the design of F# (OCaml/Haskell/Python languages default to culture-invariant, Java/VB languages default to culture-aware).

I think the best path forward to reconcile these two universes (and deal with the inconsistency of %A) is to add the explicit specifiers for culture-aware and culture-invariant formatting to what's available via % formats. This at least means there is a consistent path to explaining what's going on. And if reviewing code using %A and %d you can ask the person submitting the code to clarify their intent about culture (without just removing the annotations).
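
Until such specifiers exist, culture can at least be made explicit for holes that don't use %, by typing the interpolated string as FormattableString. A minimal sketch, where `renderWith` and `printfText` are illustrative names only:

```fsharp
open System
open System.Globalization

let value = 12.3

/// Typing the interpolated string as FormattableString defers formatting,
/// so the caller chooses the culture explicitly. Note that % patterns
/// such as %d or %A cannot be used in this form.
let renderWith (culture: CultureInfo) =
    let template : FormattableString = $"value = {value}"
    template.ToString culture

/// printf-style formatting of a number does not consult CurrentCulture.
let printfText = sprintf "value = %A" value
```

For example, `renderWith CultureInfo.InvariantCulture` gives "value = 12.3", while `renderWith (CultureInfo.GetCultureInfo "nl-NL")` would be expected to use a decimal comma, assuming the culture data is available on the machine.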

Finally, it's odd that culture-formatting didn't come up in design review or code review for string interpolation. We've added a section to the RFC template to ask people to consider this.

Happypig375 commented 2 years ago

Implicit culture formatting of string interpolation arguably is a pitfall and many people do not expect their code to output different results in different cultures. Can't we fix this "bug" instead of keeping it as avoidance of breaking changes?

dsyme commented 2 years ago

@Happypig375 I believe we can't adjust that now. String interpolation has been available for about 2 years now.

abelbraaksma commented 2 years ago

Finally, it's odd that culture-formatting didn't come up in design review or code review for string interpolation.

I did a lot of reviews on string interpolation and testing the new code (tbh, some of these bugs are still out there, but tbth, they are all rather minor). But yeah, I totally missed that too, except for a few fixes I sent out for tests in F# itself that assumed US culture (and failed on my nl-NL system). Which goes to show how easy it is to make such mistakes in code.

At the time I added tests to specifically test the new culture-variability and (for other functions) culture-invariability. Others may have done so too, which further establishes this behavior as a fait accompli.

Looks like I was just as surprised as any man and missed it in the RFC as well, originally: https://github.com/dotnet/fsharp/issues/10030.

In general, these things only surface once your code gets an international team, or once your libs are run outside of your own home.

It would be great if we could add warnings to F#, together with adding culture-specifiers as suggested here, so that users become aware of this. It is my understanding that most F# users think that library code is idempotent, regardless of where you run it. But w.r.t. cultures it isn't (anymore).

abelbraaksma commented 2 years ago

Following up on a discussion on Slack, it appears there’s demand for a “strip the leading whitespace” feature in multiline strings. This is currently only possible with a trailing \ in multiline strings. C# has this feature for triple-quoted strings. Whether such a construct should keep line endings, I don’t know.

Such a feature would certainly help keep code nicely indented in our beloved whitespace-sensitive language, as currently such multiline strings have to be awkwardly outdented.
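
To illustrate the difference, a small sketch: the first binding uses the existing \ line continuation, which drops the line break and the next line's leading whitespace. The second is a hypothetical `stripIndent` helper (not an existing F# feature) that removes the common leading indentation while keeping line endings, assuming indentation uses spaces only.

```fsharp
/// Existing behaviour: a trailing backslash joins the lines and skips
/// the leading whitespace of the continuation line.
let joined = "first \
              second"   // "first second"

/// Hypothetical helper: removes the smallest indentation shared by all
/// non-blank lines, keeping line breaks (normalised to "\n").
let stripIndent (text: string) =
    let lines = text.Replace("\r\n", "\n").Split('\n')
    let indentOf (line: string) = line.Length - line.TrimStart(' ').Length
    let indents =
        lines
        |> Array.filter (fun line -> line.Trim() <> "")
        |> Array.map indentOf
    let minIndent = if Array.isEmpty indents then 0 else Array.min indents
    lines
    |> Array.map (fun line ->
        if line.Length >= minIndent then line.Substring minIndent
        else line.TrimStart(' '))
    |> String.concat "\n"
```

A language-level version would do this at compile time, which is what would let the string literal stay indented with the surrounding code.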