fable-compiler / Fable

F# to JavaScript, TypeScript, Python, Rust and Dart Compiler
http://fable.io/
MIT License
2.9k stars 295 forks source link

Handle escaping of { and } in FormattableString #3890

Closed roboz0r closed 1 week ago

roboz0r commented 1 week ago

Fixes #3889 by replacing {{ and }} with { and }. Adds tests to confirm correct unescaping.

roboz0r commented 1 week ago

Not sure how I missed this last night but this change achieves the expected output for JS template literals but it breaks an existing test:

    let s5: FormattableString = $"I have {{escaped braces}} and %%percentage%%"
    s5.Format |> equal "I have {{escaped braces}} and %percentage%"

There is a difference between _.Format and _.ToString()

open System
let s:FormattableString = $"""I have {{escaped braces}} and %%percentage%%"""
s.Format
s.ToString()

--- FSI Output ---

- s.Format;;
val s: System.FormattableString = I have {escaped braces} and %percentage%
val it: string = "I have {{escaped braces}} and %percentage%"

> s.ToString();;
val it: string = "I have {escaped braces} and %percentage%"

I am unsure whether it is simply I need to change the implementation of _.Format or if I have misunderstood the problem more fundamentally?

MangelMaxime commented 1 week ago

I am not sure what direction to take either. I never played deeply with string interpolation escaping or the custom GetStrings from Fable.

While looking for a fix my approach was different.

Instead of having a direct access to the underlying strs. What I did, is create a getStrings function in JavaScript which handles the call to cssNew.GetStrings().

My reasoning is it looks like the escaping problem only happens when using GetStrings for fixing it only for this call make sense to me.

And it also, fix it at a single place where your solution needs to modify the string/parts at 2 different places.

CleanShot 2024-09-18 at 18 21 18

What do you think of this approach ?

roboz0r commented 1 week ago

How does that approach affect the generated template literals? Specifically, the template literal shouldn't include {{

I think that StringTests.js should look like:

equal("I have {{escaped braces}} and %percentage%", getFormat(fmt_2`I have {escaped braces} and %percentage%`));
MangelMaxime commented 1 week ago

How does that approach affect the generated template literals? Specifically, the template literal shouldn't include {{

It doesn't affect it and the tests are all greens.

CleanShot 2024-09-18 at 18 34 49

I am not sure what we want as an output actually 😅

I would have liked if it was as simple as we want the user enter is what we want in the output. But it seems like it will be more difficult than that

let formatString0 : FormattableString = $"""{{}}"""
let formatString1 : FormattableString = $"""{{}}"""
let formatString2 : FormattableString = $$"""{}"""
let formatString3 : FormattableString = $$$"""{}"""
let formatString4 : FormattableString = $$$"""{{}}"""

Current implementation (compatible with my proposition)

export const formatString0 = fmt`{{}}`;
export const formatString1 = fmt`{{}}`;
export const formatString2 = fmt`{{}}`;
export const formatString3 = fmt`{{}}`;
export const formatString4 = fmt`{{{{}}}}`;

Your implementation

export const formatString0 = fmt`{}`;
export const formatString1 = fmt`{}`;
export const formatString2 = fmt`{}`;
export const formatString3 = fmt`{}`;
export const formatString4 = fmt`{{}}`;
MangelMaxime commented 1 week ago

Looking at Fable code, contrary to my primary fear it seems like String.get_Format is only used when working withFormattableString.

This is the only place I see it being used:

https://github.com/fable-compiler/Fable/blob/9cf4849fa3da4e7f53138aa468622f1b6987a2ca/src/Fable.Transforms/Replacements.fs#L1658-L1705

Something to not too, is that how we decide to represents output in this PR will probably not only have impact on JavaScript but also the others targets. Because it seems like they are using the same output format.

@roboz0r Is there a case where not using the current output format cause issue?

@ncave @dbrattli Any opinion on how we should handle FormattableString representation ?

roboz0r commented 1 week ago

I think that my implementation is the right way to go then. I don't think template literals require escaping of { so having

export const formatString0 = fmt`{{}}`;

Doesn't give the right semantics as

// JS
`{{}}` === "{{}}" // true
`{{}}` === "{}" // false
//F#
$"""{{}}""" = "{}" // true

This would be possibly breaking existing code but if existing template literals were always wrapped in fmt() then it wouldn't affect anything.

If we make the change then fmt wouldn't be required for js functions accepting template literals directly.

ncave commented 1 week ago

@MangelMaxime IMO we should try to keep string interpolation as similar to F# on .NET as possible: https://github.com/dotnet/docs/blob/main/docs/fsharp/language-reference/interpolated-strings.md#extended-syntax-for-string-interpolation

roboz0r commented 1 week ago

One of the things I don't like about doing unescaping in getStrings is it's causing runtime string allocations each time it is called. Ideally, we should just be passing through the runtime optimized TemplateStringsArray unchanged when passing a format string to a JS API.

Trying to stick closer to the .NET API makes me think we should reexamine the FormattableString type definition in String.ts

https://github.com/fable-compiler/Fable/blob/9cf4849fa3da4e7f53138aa468622f1b6987a2ca/src/fable-library-ts/String.ts#L575-L579

I sketched up a type that more closely matches the .NET type definition of FormattableString. I'm really not a JS dev so please change any non-idiomatic usage.

public abstract class FormattableString : IFormattable
{
    [StringSyntax(StringSyntaxAttribute.CompositeFormat)]
    public abstract string Format { get; }
    public abstract object?[] GetArguments();
    public abstract int ArgumentCount { get; }
    public abstract object? GetArgument(int index);
    public abstract string ToString(IFormatProvider? formatProvider);

    string IFormattable.ToString(string? ignored, IFormatProvider? formatProvider) =>
        ToString(formatProvider);

    public static string Invariant(FormattableString formattable)
    {
        ArgumentNullException.ThrowIfNull(formattable);
        return formattable.ToString(Globalization.CultureInfo.InvariantCulture);
    }

    public static string CurrentCulture(FormattableString formattable)
    {
        ArgumentNullException.ThrowIfNull(formattable);
        return formattable.ToString(Globalization.CultureInfo.CurrentCulture);
    }

    public override string ToString() =>
        ToString(Globalization.CultureInfo.CurrentCulture);
}

Concept

abstract class FormattableString {
    // Should this just be an interface?
    // Unsure about default naming conventions for JS/TS.
    abstract get Format(): string;
    abstract get Arguments(): any[] | undefined;
    abstract get ArgumentCount(): number;
    abstract getArgument(index: number): any;
    // toString() should return the formatted string. Currently in fable it returns [object Object]
    abstract toString(): string; // TODO: How to add IFormatProvider?
}

// Long Name. Consider "TemplatedString", "FableString", etc.
class TemplatedFormattableString extends FormattableString {
    readonly strs: TemplateStringsArray
    readonly fmts?: ReadonlyArray<string> // The constant portion of the string is truly immutable.
    readonly args?: any[] // The arguments are mutable, to match the .NET API.

    constructor(strs: TemplateStringsArray, fmts: string[], args: any[]) {
        super();
        this.strs = strs;
        this.fmts = fmts;
        this.args = args;
    }

    static createWithFmts(fmts: string[]) {
        return (strs: TemplateStringsArray, ...args: any[]) => {
            if (args.length !== strs.length - 1 || fmts.length !== args.length) {
                throw new Error("The number of format strings must match the number of expressions.");
            }
            let x = new TemplatedFormattableString(strs, fmts, args);
        }
    }

    static create(strs: TemplateStringsArray, ...args: any[]) {
        if (args.length !== strs.length - 1) {
            throw new Error("The number of format strings must match the number of expressions.");
        }
        return new TemplatedFormattableString(strs, [], args);
    }

    get Format(): string {
        // Inefficient, but it works to return the .NET style format string.
        // Consider caching the result in a private field.
        return this.fmts
            ? this.strs.map((s, _) => s.replace('{', '{{').replace('}', '}}'))
                .reduce((acc, newPart, index) => acc + `{${String(index - 1) + this.fmts![index - 1]}}` + newPart)
            : this.strs.map((s, _) => s.replace('{', '{{').replace('}', '}}'))
                .reduce((acc, newPart, index) => acc + `{${index - 1}}` + newPart);
    }

    get Arguments(): any[] | undefined {
        return this.args;
    }

    get ArgumentCount(): number {
        return this.args?.length ?? 0;
    }

    getArgument(index: number): any {
        return this.args![index];
    }

    private static applyFormat(format: string, arg: any): string {
        // TODO: Use the format string to format the argument.
        return arg.toString();
    }

    toString(): string {
        if (this.args === undefined) {
            return this.strs[0];
        } else {
            return this.strs.reduce((acc, newPart, i) => acc + (TemplatedFormattableString.applyFormat(this.fmts![i - 1], this.args![i - 1])) + newPart);
        }
    }
}

function fmt(strs: TemplateStringsArray, ...args: any[]) { return TemplatedFormattableString.create(strs, args); }
function fmtWith(fmts: string[]) { return (strs: TemplateStringsArray, ...args: any[]) => TemplatedFormattableString.createWithFmts(fmts)(strs, args); }

Usage

let speedOfLight = 299792.458  
let frmtStr = $"The speed of light is {speedOfLight:N3} km/s." : FormattableString

Becomes

const speedOfLight = 299792.458;
const frmtStr = fmtWith([":N3"])`The speed of light is ${X_speedOfLight} km/s.`;

let color = "blue"
Lit.css 
        $""":host {{
            color: {color};
        }}"""

Becomes

function css(strs: TemplateStringsArray, ...args: any[]) {
    // As imported by e.g. lit
}

const color = "blue";
const s = fmt`:host {
            color: ${color};
        }`;
// Default path for template literals has no extra string allocations provided we unescape "{{" to "{" at compile time.
return css(s.strs, s.args);
MangelMaxime commented 1 week ago

We avoid creating classes in fable-library so bundler have a chance to remove dead code. I don't know if now days they can do it for classes or not.

Trying to stick closer to the .NET API makes me think we should reexamine the FormattableString type definition in String.ts

What we mean by that in general, is that the behaviour at runtime should be same. Not necessary that we use a class if they use a class, etc.

MangelMaxime commented 1 week ago

Note, I just said that we avoid using class in fable-library but working on another fix I just found that we do use class sometimes 😅.

So let's say it depends and can be used.

@roboz0r I merged the PR as is because it seemed to fix your original issue and from what I understood using a class would just re-structured but with the same behaviour.

If I was wrong feel free to tell me and we can merge another PR.

roboz0r commented 1 week ago

@MangelMaxime Thanks for your help getting this one completed. The class version didn't change any behaviour but was just trying to better capture the semantics of the .NET API. I'll trust your judgement whether to go with a class def or leave the interface as is.

While I was working on it, I did identify that FormattableString.ToString() needs an implementation to produce the formatted string instead of returning [object Object]. That can be a separate issue/task.

MangelMaxime commented 1 week ago

Yes, there a few places where if you use printfn "%A" myValue then you get [object Object].

But if you use JS.console.log(myValue) you often get something more useable. I never really dig this issue and always assume it was a limitation to how some types are represented or how Fable handle printing values. But if we can improve it that's good :)

Thanks for the work done in this issue, I want to fix a few other ones before making a release but should not take too long.