ForeverZer0 / SharpNBT

A pure CLS-compliant C# implementation of the Named Binary Tag (NBT) format specification commonly used with Minecraft applications, allowing easy reading/writing streams and serialization to other formats.
MIT License
25 stars 9 forks source link

StringNbt.Parse😱😱😱 System.Data.SyntaxErrorException:“Unrecognized sequence at index 233: '''” #14

Closed NingLiu1998 closed 10 months ago

NingLiu1998 commented 11 months ago

image

why ?

coode:

string testNbtStr3 = "{Count:1b,id:\"minecraft:netherite_sword\",tag:{Damage:0,Enchantments:[{id:\"minecraft:looting\",lvl:3s},{id:\"minecraft:smite\",lvl:5s},{id:\"minecraft:sweeping\",lvl:3s}],RepairCost:7,display:{Name:'{\"extra\":[{\"text\":\"我是修改的名字\"}],\"text\":\"\"}'}}}";
CompoundTag? tags = StringNbt.Parse(testNbtStr3);
NingLiu1998 commented 11 months ago

image

The translation feature comes from ChatGPT3.5. I briefly looked at the source code, and it seems that there is no regular expression matching for handling such cases

ForeverZer0 commented 11 months ago

Do the strings only contain UTF-8 compatible characters?

The exception is at index 233, which is the string with non-Latin characters..

NingLiu1998 commented 11 months ago

字符串是否仅包含 UTF-8 兼容字符?

例外情况是索引 233,它是具有非拉丁字符的字符串。

Converted directly from Minecraft servers,Converted directly from Minecraft servers

I would like to know, how to construct such an NBT structure?

ForeverZer0 commented 11 months ago

I took some time to pick apart the input, and this appears to be a bug with the library. For some reason either the inner JSON or the empty string is messing up the lexer state.

I will prioritize this project, it has been lacking proper maintenance on my part, and is past due for some improvements and taking advantage of modern language features.

NingLiu1998 commented 11 months ago

I took some time to pick apart the input, and this appears to be a bug with the library. For some reason either the inner JSON or the empty string is messing up the lexer state.

I will prioritize this project, it has been lacking proper maintenance on my part, and is past due for some improvements and taking advantage of modern language features.

expectancy (*❦ω❦)

ForeverZer0 commented 11 months ago

@NingLiu1998 Which version of .NET are you using? Ideally I would like to break away from .netstandard, as it is less and less relevant, and really restricts what features are available to use. This library was originally written in the pre-.NET5 days.

I was otherwise considering a complete re-write, and simply try to keep the same API surface.

NingLiu1998 commented 11 months ago

您使用的是哪个版本的 .NET?理想情况下,我想摆脱 .netstandard,因为它越来越不相关,并且确实限制了可以使用的功能。这个库最初是在 pre-.净5天。

否则,我正在考虑完全重写,并简单地尝试保持相同的API表面。

dotNet7

NingLiu1998 commented 11 months ago

This is for my side project, so I'll just go with the highest version I can always use, maybe soon I'll be using dotNet 8

ForeverZer0 commented 11 months ago

I will likely be targeting .NET 7 with the latest major version of the language. I honestly don't feel this project is big enough to be worrying about breaking compatibility with Mono-based projects, or some enterprise company using a legacy code-base based on .NET Framework 4.7, etc.

This will allow me to make some significant improvements to the lexer, as well as correcting the current bug. The overwhelming majority should maintain the same API surface, I will ensure it will not require a huge refactor, a few minor adjustments at most.

This allows me to eliminate my custom ZLib and VarInt implementations, arguably the most tedious aspect of this project, as these are now included in the runtime.

NingLiu1998 commented 11 months ago

I will likely be targeting .NET 7 with the latest major version of the language. I honestly don't feel this project is big enough to be worrying about breaking compatibility with Mono-based projects, or some enterprise company using a legacy code-base based on .NET Framework 4.7, etc.

This will allow me to make some significant improvements to the lexer, as well as correcting the current bug. The overwhelming majority should maintain the same API surface, I will ensure it will not require a huge refactor, a few minor adjustments at most.

This allows me to eliminate my custom ZLib and VarInt implementations, arguably the most tedious aspect of this project, as these are now included in the runtime.

expectancy (*❦ω❦)

ForeverZer0 commented 10 months ago

I started a rewrite, but ended up reverting back to just fixing the current project.

I did implement an entirely new SNBT parser from scratch, which works far better, and is much faster. It It doesn't use Regex, and does a single-pass tokenization/parse in one step.

I even copy-pasted your example and made an additional test out of it, which was successful.

(my font doesn't support those Unicode characters, so they may have gotten messed up copying)

        [Fact]
        public void ParseBugged()
        {
            var testString = "{Count:1b,id:\"minecraft:netherite_sword\",tag:{Damage:0,Enchantments:[{id:\"minecraft:looting\",lvl:3s},{id:\"minecraft:smite\",lvl:5s},{id:\"minecraft:sweeping\",lvl:3s}],RepairCost:7,display:{Name:'{\"extra\":[{\"text\":\"我是修改的名字\"}],\"text\":\"\"}'}}}";
            var tag = StringNbt.Parse(testString);
            output.WriteLine(tag.PrettyPrinted());
        }
TAG_Compound(None): [3 entries]
{
    TAG_Byte("Count"): 1
    TAG_String("id"): "minecraft:netherite_sword"
    TAG_Compound("tag"): [4 entries]
    {
        TAG_Int("Damage"): 0
        TAG_List("Enchantments"): [3 entries]
        {
            TAG_Compound(None): [2 entries]
            {
                TAG_String("id"): "minecraft:looting"
                TAG_Short("lvl"): 3
            }
            TAG_Compound(None): [2 entries]
            {
                TAG_String("id"): "minecraft:smite"
                TAG_Short("lvl"): 5
            }
            TAG_Compound(None): [2 entries]
            {
                TAG_String("id"): "minecraft:sweeping"
                TAG_Short("lvl"): 3
            }
        }
        TAG_Int("RepairCost"): 7
        TAG_Compound("display"): [1 entry]
        {
            TAG_String("Name"): "{"extra":[{"text":"我是修改的名字"}],"text":""}"
        }
    }
}

I will be creating a new release within the next day that has the fixes in it.

ForeverZer0 commented 10 months ago

@NingLiu1998

This has been fixed in Version 1.3.0. The package has been uploaded to NuGet as well, though as of the time I write this, is still going through the validation process.

I will close the issue for now, but feel free to reopen or create a new issue if your problem persists.

NingLiu1998 commented 10 months ago

Has Stringify() of NBTTag been changed?

Before I was through the Stringify () to obtain NBT string, now I see an updated version after Stringify () returns a string, not the NBT I want to get NBT string what should I do now

@ForeverZer0

NingLiu1998 commented 10 months ago

image

NingLiu1998 commented 10 months ago

My scenario requirement was to be able to convert the JSON string back to the original NBT string after modifying part of the value


My scenario requirement is to be able to convert the json string back to the original NBT string after modifying some of the JSON string values

NingLiu1998 commented 10 months ago

❗❗❗❗❗❗

Or there's another way of thinking

Is there any way I can change the attribute value of the CompoundTag object and then go back to the NBT string?

Instead of worrying about the JSON conversion, I just need the front end to return my data, and my back end to read the original NBT string into the CompoundTag object (stringntbt.parse), then make the attribute value judgment changes, and then return to the original NBT string

ForeverZer0 commented 10 months ago

I just looked, and I introduced a small bug when redoing the Stringify logic for compound tags (probably ListTag also). I changed it from splitting into an array of strings and then re-joining after modification, to simply use a StringBuilder, to try and use only one heap allocation.

When adding child tags, I mistakenly forgot to call the "Stringify" on children tags, so it is appending the child itself to the StringBuilder, which then just calls ToString on it.

This:

        foreach (var value in dict.Values)
        {
            if (i++ > 0)
                sb.Append(',');
            sb.Append(value);
        }

...should be this:

        foreach (var value in dict.Values)
        {
            if (i++ > 0)
                sb.Append(',');
            sb.Append(value.Stringify());
        }

I will make a hot-fix and push it out tomorrow.

23

ForeverZer0 commented 10 months ago

@NingLiu1998

i actually had some spare time to fix and make a release immediately. It appears to all be correct and work as expected in the tests, and should be updated in NuGet by the time you read this.

NingLiu1998 commented 10 months ago

Thank you for responding so quickly ❕

❗❗❗ There's still a problem, though, with the ':' symbol being redundant? 🤔

image

ForeverZer0 commented 10 months ago

Ugh, that will teach me for rushing....

It appears to have something to do with the implicit "top-level" tag, which has no name. Should be a simple fix it with an additional check if the tag actually has a name before writing it.

There is an optional argument for manually specifying whether the tag name should be written. Does it do this for:

var snbt = compoundTag.Stringify(topLevel: true, named: false);

?

NingLiu1998 commented 10 months ago

thank you

string nSNBT = Ntag.Stringify(false,false);

this result is Ok!