neuecc / Utf8Json

Definitely Fastest and Zero Allocation JSON Serializer for C# (.NET, .NET Core, Unity, Xamarin).
MIT License

Use `JsonUtf8Encoding : Encoding` #17

Open neuecc opened 6 years ago

neuecc commented 6 years ago

Escaping string characters hurts the performance of JSON serialization. It is possible to reduce the escaping cost by creating a custom UTF-8 Encoding that also performs the JSON escaping/unescaping. To take advantage of the internal FastAllocateString call, it is necessary to inherit from Encoding.

using System;
using System.Text;

public class JsonUtf8Encoding : Encoding
{
    #region decode(for reader)

    // (Encoding.GetString) -> GetCharCount -> (FastAllocateString) -> GetChars

    public override int GetCharCount(byte[] bytes, int index, int count)
    {
        // The returned char count is for \" (.+) \": the (.+) group, unescaped.
        if (bytes[index] != '\"') throw new InvalidOperationException();

        throw new NotImplementedException();
    }

    public override int GetChars(byte[] bytes, int byteIndex, int byteCount, char[] chars, int charIndex)
    {
        throw new NotImplementedException();
    }

    #endregion

    #region encode(for writer)

    // should use GetByteCount? too large?

    public override int GetMaxByteCount(int charCount)
    {
        return Encoding.UTF8.GetMaxByteCount(charCount) * 2; // worst case, escaped.
    }

    public override unsafe int GetBytes(string s, int charIndex, int charCount, byte[] bytes, int byteIndex)
    {
        // Remaining capacity of the destination buffer.
        int byteCount = bytes.Length - byteIndex;

        fixed (char* pChars = s)
        fixed (byte* pBytes = bytes)
        {
            return GetBytes(pChars + charIndex, charCount, pBytes + byteIndex, byteCount);
        }
    }

    public override unsafe int GetBytes(char* chars, int charCount, byte* bytes, int byteCount)
    {
        throw new NotImplementedException();
    }

    #endregion

    // The remaining abstract members are not needed for this use case.

    public override int GetBytes(char[] chars, int charIndex, int charCount, byte[] bytes, int byteIndex)
    {
        throw new NotSupportedException();
    }

    public override int GetByteCount(char[] chars, int index, int count)
    {
        throw new NotSupportedException();
    }

    public override int GetMaxCharCount(int byteCount)
    {
        throw new NotSupportedException();
    }
}
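For illustration only, the decode-side comment in the skeleton (`GetCharCount` returns the length of the unescaped `(.+)` group) could be sketched roughly as below. `JsonUnescapeSketch` and `CountUnescapedChars` are hypothetical names, not part of Utf8Json, and the sketch assumes the input is already well-formed UTF-8 and well-formed JSON, with no validation of escape sequences.

```csharp
using System;

// Hypothetical helper, not part of Utf8Json.
public static class JsonUnescapeSketch
{
    // Counts the UTF-16 chars that the quoted JSON string starting at
    // bytes[index] would occupy after unescaping. Assumes valid input.
    public static int CountUnescapedChars(byte[] bytes, int index, int count)
    {
        if (count < 2 || bytes[index] != (byte)'"') throw new InvalidOperationException();

        int end = index + count;
        int chars = 0;
        int i = index + 1; // skip the opening quote
        while (i < end)
        {
            byte b = bytes[i];
            if (b == (byte)'"') return chars; // closing quote
            if (b == (byte)'\\')
            {
                if (i + 1 >= end) break;
                // \uXXXX is 6 bytes, every other escape is 2 bytes; both yield 1 char.
                i += bytes[i + 1] == (byte)'u' ? 6 : 2;
                chars++;
            }
            else if (b < 0x80) { i += 1; chars++; }    // ASCII
            else if (b < 0xE0) { i += 2; chars++; }    // 2-byte sequence
            else if (b < 0xF0) { i += 3; chars++; }    // 3-byte sequence
            else { i += 4; chars += 2; }               // 4-byte sequence -> surrogate pair
        }
        throw new InvalidOperationException("Unterminated JSON string.");
    }
}
```

Knowing the exact count up front is what would let `Encoding.GetString` allocate the result string once and have `GetChars` write straight into it.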

Also, it is necessary to implement efficient UTF-8 encoding/decoding. I found this article. If there are any other good examples, please let me know.
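As a rough sketch of the encode side, escaping and UTF-8 encoding can be fused into a single pass over the chars, which is the idea behind overriding `GetBytes`. The names `JsonEscapeSketch` and `WriteQuoted` are hypothetical, the code uses plain array indexing rather than the pointer-based overload, and it is not tuned for speed. The caller is assumed to have sized the buffer for the worst case (6 bytes per char, plus the two quotes).

```csharp
// Hypothetical helper, not part of Utf8Json.
public static class JsonEscapeSketch
{
    private const string Hex = "0123456789abcdef";

    // Writes s as a quoted, escaped JSON string into dest at offset.
    // Returns the number of bytes written.
    public static int WriteQuoted(string s, byte[] dest, int offset)
    {
        int start = offset;
        dest[offset++] = (byte)'"';
        for (int i = 0; i < s.Length; i++)
        {
            char c = s[i];
            if (c == '"' || c == '\\')
            {
                dest[offset++] = (byte)'\\';
                dest[offset++] = (byte)c;
            }
            else if (c == '\n' || c == '\r' || c == '\t')
            {
                dest[offset++] = (byte)'\\';
                dest[offset++] = (byte)(c == '\n' ? 'n' : c == '\r' ? 'r' : 't');
            }
            else if (c < 0x20)
            {
                // Other control characters use the \u00XX form.
                dest[offset++] = (byte)'\\';
                dest[offset++] = (byte)'u';
                dest[offset++] = (byte)'0';
                dest[offset++] = (byte)'0';
                dest[offset++] = (byte)Hex[c >> 4];
                dest[offset++] = (byte)Hex[c & 0xF];
            }
            else if (c < 0x80)
            {
                dest[offset++] = (byte)c; // ASCII fast path
            }
            else if (c < 0x800)
            {
                dest[offset++] = (byte)(0xC0 | (c >> 6));
                dest[offset++] = (byte)(0x80 | (c & 0x3F));
            }
            else if (char.IsHighSurrogate(c) && i + 1 < s.Length && char.IsLowSurrogate(s[i + 1]))
            {
                // Surrogate pair -> one 4-byte sequence.
                int cp = char.ConvertToUtf32(c, s[++i]);
                dest[offset++] = (byte)(0xF0 | (cp >> 18));
                dest[offset++] = (byte)(0x80 | ((cp >> 12) & 0x3F));
                dest[offset++] = (byte)(0x80 | ((cp >> 6) & 0x3F));
                dest[offset++] = (byte)(0x80 | (cp & 0x3F));
            }
            else
            {
                // Lone surrogates are replaced with U+FFFD (an assumption of this sketch).
                if (char.IsSurrogate(c)) c = '\uFFFD';
                dest[offset++] = (byte)(0xE0 | (c >> 12));
                dest[offset++] = (byte)(0x80 | ((c >> 6) & 0x3F));
                dest[offset++] = (byte)(0x80 | (c & 0x3F));
            }
        }
        dest[offset++] = (byte)'"';
        return offset - start;
    }
}
```

Compared with escaping into a temporary string and then calling `Encoding.UTF8.GetBytes`, this touches each char once and needs no intermediate allocation.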

neuecc commented 6 years ago

@itn3000 is experimenting with fast UTF-8 <-> UTF-16 utilities.

@ufcpp is building a custom UTF-8 decoder.

NStack is a new Go-like encoding system.

System.Text.Utf8String is a new span-based primitive.

Tornhoof commented 6 years ago

Regarding utf-8: and related from

penguinawesome commented 4 years ago

Hi @neuecc, we badly need your help. Do you have an idea or a workaround for our issue?