dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.37k stars 4.75k forks source link

System.Text.Json - Add overloads for reading a Guid using a custom "standard format" #30692

Open khellang opened 5 years ago

khellang commented 5 years ago

Rationale

Currently, it's a bit of a hassle to properly read a Guid using a custom format. You basically have to copy the implementation of TryGetGuid just to (more or less) change a single character (the format):

https://github.com/dotnet/corefx/blob/7055b496a30dfe0f66a2f555cad31502473d144b/src/System.Text.Json/src/System/Text/Json/Reader/Utf8JsonReader.TryGet.cs#L942-L993

And because JsonReaderHelper is internal, you also have to copy the implementation of TryGetEscapedGuid:

https://github.com/dotnet/corefx/blob/7055b496a30dfe0f66a2f555cad31502473d144b/src/System.Text.Json/src/System/Text/Json/Reader/JsonReaderHelper.cs#L315-L339

Proposal

I'd :heart: to see the following overloads added to System.Text.Json:

namespace System.Text.Json
{
    public readonly partial struct JsonElement
    {
        public System.Guid GetGuid() { throw null; }
+       public System.Guid GetGuid(char standardFormat) { throw null; }
        public bool TryGetGuid(out System.Guid value) { throw null; }
+       public bool TryGetGuid(char standardFormat, out System.Guid value) { throw null; }
    }
    public sealed partial class JsonString : System.Text.Json.JsonNode, System.IEquatable<System.Text.Json.JsonString>
    {
        public bool TryGetGuid(out System.Guid value) { throw null; }
+       public bool TryGetGuid(char standardFormat, out System.Guid value) { throw null; }
    }
    public ref partial struct Utf8JsonReader
    {
        public System.Guid GetGuid() { throw null; }
+       public System.Guid GetGuid(char standardFormat) { throw null; }
        public bool TryGetGuid(out System.Guid value) { throw null; }
+       public bool TryGetGuid(char standardFormat, out System.Guid value) { throw null; }
    }
}

This would make it really easy to write a custom converter to work around dotnet/runtime#1567:

public sealed class JsonConverterGuid : JsonConverter<Guid>
{
    public override Guid Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        if (reader.TryGetGuid('D', out Guid value))
        {
            return value;
        }

        if (reader.TryGetGuid('N', out value))
        {
            return value;
        }

        throw new FormatException("The JSON value is not in a supported Guid format.");
    }

    public override void Write(Utf8JsonWriter writer, Guid value, JsonSerializerOptions options)
    {
        writer.WriteStringValue(value);
    }
}

Notes

// @ahsonkhan @layomia @steveharter

layomia commented 4 years ago

From @stylesm in https://github.com/dotnet/runtime/issues/767:

System.Text.Reader.Utf8JsonReader.TryGetGuid(out Guid value)

calls

System.Buffers.Text.Utf8Parser.TryParse(ReadOnlySpan<byte> source, out Guid value, out int bytesConsumed, char standardFormat = default)

However, when it does, it passes 'D' in as the standardFormat parameter, meaning only Guids in the format nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn are parsed successfully.

There are 4 Guid formats supported by Utf8Parser:

        ///     D (default)     nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn
        ///     B               {nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn}
        ///     P               (nnnnnnnn-nnnn-nnnn-nnnn-nnnnnnnnnnnn)
        ///     N               nnnnnnnnnnnnnnnnnnnnnnnnnnnnnnnn

It would be great if Utf8JsonReader.TryGetGuid(out Guid value) supported the other three formats, defaulting to 'D' for backwards compatibility.

This bubbles up into other APIs, e,g, System.Text.Json

Am happy to take on dev work if this is agreed.

Am in a rush and haven't checked contrib guidelines so likely update this later.

layomia commented 4 years ago

From @idzmitry in https://github.com/dotnet/runtime/issues/31627:

Here is a code snippet (F#)

open System
open System.Text.Json

[<CLIMutable>]
type Test = {
    Id: Guid
    Text: string
}

[<EntryPoint>]
let main _ =
    let str =
        {| Id = "88f871231cee49e4bcfff12252d90410"; Text = "abc" |}
        |> JsonSerializer.Serialize

    //works fine
    Guid.Parse "88f871231cee49e4bcfff12252d90410"
    |> printfn "Guid: %A"

    //fails
    JsonSerializer.Deserialize(str, typedefof<Test>)
    |> printfn "Record: %A"

    0

Guid.Parse works fine at same time .Net Core 3.1

layomia commented 4 years ago

A workaround here is to use a custom converter to read/write with Guid.Parse/ToString, or for better performance Utf8Parser/Formatter.TryParse/TryFormat. For examples, see similar implementations for DateTime here.

stylesm commented 4 years ago

A workaround here is to use a custom converter to read/write with Guid.Parse/ToString, or for better performance Utf8Parser/Formatter.TryParse/TryFormat. For examples, see similar implementations for DateTime here.

But then we have to do this for every parse, whereas there is an existing Guid parsing method in Utf8JsonReader, which can be made extended but backwards compatible due to its default 'D' format value, and which then makes life easier for a lot of other libraries.

I don't think we should expect devs to write custom parsers to read Guids.

khellang commented 4 years ago

I don't think we should expect devs to write custom parsers to read Guids.

It's only if you're using a non-standard Guid format 😄

But then we have to do this for every parse, whereas there is an existing Guid parsing method in Utf8JsonReader, which can be made extended but backwards compatible due to its default 'D' format value, and which then makes life easier for a lot of other libraries.

This issue is about adding these APIs to make it easier to parse non-standard Guid formats, using Utf8JsonReader. Are you saying you want to make the existing parse method(s) try all different formats when parsing? That would probably be a non-trivial performance hit.

stylesm commented 4 years ago

No, I'm suggesting we could add a format parameter with a default value. That would keep it backwards compatible and then default behaviour would be identical to what it is now. The performance hit would be negligible because further downstream we are already using a format parameter with a default value.

The standard Guid format is only standard in the .NET scope - plenty of other frameworks use different formats, and combining stacks is rather common. It is so common that the Guid parser in .NET supports these other formats.

If we were super concerned about performance then we could overload the method, meaning the original is unaffected.

khellang commented 4 years ago

No, I'm suggesting we could add a format parameter with a default value. That would keep it backwards compatible and then default behaviour would be identical to what it is now.

That's exactly what this issue is proposing though? 🤔

The standard Guid format is only standard in the .NET scope - plenty of other frameworks use different formats, and combining stacks is rather common.

No, it's absolutely not. The canonical 8-4-4-4-12 format string is based on the record layout for the 16 bytes of the UUID, as specified by RFC4122. The RFC even contains an example implementation for this format.

stylesm commented 4 years ago

No, I'm suggesting we could add a format parameter with a default value. That would keep it backwards compatible and then default behaviour would be identical to what it is now.

That's exactly what this issue is proposing though?

Are you arguing against your own proposal? Im confused by what you're suggesting (not trying to be awkward!) Adding an optional parameter with a default value won't cause a big performance hit.

The standard Guid format is only standard in the .NET scope - plenty of other frameworks use different formats, and combining stacks is rather common.

No, it's absolutely not. The canonical 8-4-4-4-12 format string is based on the record layout for the 16 bytes of the UUID, as specified by RFC4122. The RFC even contains an example implementation for this format.

You're right, that by standards, it had a default format. The existence of different format options in .net is evidence enough that others use different formats, and that other formats are de facto standards for their own frameworks.

khellang commented 4 years ago

Are you arguing against your own proposal? Im confused by what you're suggesting (not trying to be awkward!) Adding an optional parameter with a default value won't cause a big performance hit.

I'm asking what your comment was about, since I don't see what value it added if it was basically saying exactly what the proposal says 😅

You're right, that by standards, it had a default format. The existence of different format options in .net is evidence enough that others use different formats, and that other formats are de facto standards for their own frameworks.

Sure, and if you're using something non-standard, I think it's only reasonable that you have to tweak the (de-)serializer to fit your needs, i.e. write a custom converter 😄

Anyway, @layomia's comment was a workaround, not a solution. This issue tracks the API additions needed to tweak the behavior.

JinsPeter commented 3 years ago

Can someone add a CustomConverter that works? I tried and stopped wondering

layomia commented 3 years ago

From @AntonIOIOIO in https://github.com/dotnet/runtime/issues/59474:

Description

When you call GetGuid() from JsonElement the only valid format is 00000000-0000-0000-0000-000000000000?

Why is that?

https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/Node/JsonString.cs

The code is hardcoded to "D". public Guid GetGuid() => Guid.ParseExact(_value, "D");

layomia commented 3 years ago

We should consider this issue a general one for extended Guid support across the System.Text.Json types - serializer, reader, writer, DOMs etc.

eiriktsarpalis commented 3 years ago

A concern I have about the proposal is that the new Guid overloads in Utf8JsonWriter/Utf8JsonReader:

  1. Push more serialization concerns to the Utf8JsonWriter/Utf8JsonReader layer.
  2. Don't directly resolve the motivating use case: users still need to author and register a custom converter that calls into the new methods.

I'm wondering if instead it might make more sense to:

  1. Make the converter more lenient by default (accept both 'D' and 'N' formats and potentially others). Would need to measure potential perf regressions here.
  2. Add APIs to Utf8JsonWriter/Utf8JsonReader that make authoring the proposed Guid overloads practical (read: zero allocation) via user-defined extension methods. This is already possible on the writer side and we're planning on similar functionality on the reader via https://github.com/dotnet/runtime/issues/54410.
LeaFrock commented 1 year ago

As I read in #77020, this issue seems won't be solved in .NET 8?

davhdavh commented 1 year ago

It is quite silly considering that most other places, N variants of GUIDs work just fine. E.g. PageModel parameters, or [FromQuery] properties

JustArchi commented 8 months ago

Still an issue as of .NET 8, I'd love to see this getting more attention, as I don't see any real reason why only standard format of Guid is supported.

internal sealed class GuidJsonConverter : JsonConverter<Guid> {
    public override Guid Read(ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options) {
        try {
            return Guid.Parse(reader.GetString()!);
        } catch {
            // Throw JsonException instead, which will be converted into standard message by STJ
            throw new JsonException();
        }
    }

    public override void Write(Utf8JsonWriter writer, Guid value, JsonSerializerOptions options) {
        writer.WriteStringValue(value.ToString());
    }
}

This custom converter does the trick, even if naturally it could be done better with underlying implementation.

nicolashemery commented 8 months ago

Hello,

I've the same issue to use FromBodyAttributes on AzureFunction. For now i'm using @JustArchi solution (thanks). But i agree with the global spirit of that thread, that the serializer by default should support more format option, or to allow parametrization.