ReClassNET / ReClass.NET

More than a ReClass port to the .NET platform.
MIT License
1.81k stars 372 forks source link

Invalid unicode/hexadecimal values during save. #8

Open elementofprgress opened 7 years ago

elementofprgress commented 7 years ago

Recently had an issue where a unicode char accidentally ended up in the title of a class pointer. When attempting to save the program would error: "System.ArgumentException: '', hexadecimal value 0x1F, is an invalid character."(http://pastebin.com/ALdGLN1m)

Reclass would remain open but with just knowing there was a invalid char somewhere and not where made it difficult to find. (I didn't realize at the time I could unzip the attempted save and see what class it cut off)

Reclass could still generate class files.

Tried filtering out unicode chars during the AddText function, which worked but the char remains in the stream that writes the XML. Leading me to let you know about the issue.

KN4CK3R commented 7 years ago

Where was the unicode char? In the title of the class or in a member name? Ok, I can reproduce this error if I add "\x1F" to a name and save the class. Don't really know why the .NET XML library can't handle this character but I will try to find a workaround.

elementofprgress commented 7 years ago

The generated class that contained the invalid char was class N000030E8 { public: char pad_0000[8]; //0x0000 class _AAAAAAA_* m_142A7D2F8; //0x0008 char pad_0010[568]; //0x0010 }; //Size: 0x0248

I know the XML language in general has issues with certain chars. - https://www.w3.org/TR/REC-xml/#charsets

I tried a few variations of VerifyXMLChars(https://msdn.microsoft.com/en-us/library/system.xml.xmlconvert.verifyxmlchars(v=vs.110).aspx) but nothing that worked completely.

Maybe because it was fixing strings it outside of System.Xml.Linq...which I'm not familiar with what so ever.

KN4CK3R commented 7 years ago

Why do you get these unicode chars in the names?

KN4CK3R commented 7 years ago

Possible solution:

XmlConvert.DecodeName(XmlConvert.EncodeName("\x1F")) == "\x1F"

But I don't like the idea to add this stuff the export class because every plugin developer would need to add these methods to his export/import classes for custom nodes. These invalid characters shouldn't get into the names at the first place. This is where this error should be fixed.

elementofprgress commented 7 years ago

Oh, it was a total fluke that the chars got into the name. Fumbled something near my keyboard and random keys got mashed and I didn't know anything was there until I went to save. Imagine this issue would be ...very rare. Far more then the impact of the issue it self, I think weirdness was worth noting. I've seen your reclass handle Cyrillic and Japaneses characters and didn't expect .net to react this way.

I think making a change that would force every plugin developer to update over a issue this minor might be over kill. Outside of random flukes like mine, or maybe not paying attention to copy pasting name? These chars are something you need to intentionally go out of you way to enter. Actually, even difficult to intentionally reproduce the error.

I haven't looked into how complicated it would be, but your error handling is normally spot on. Particularly handling importing other reclass files. Would handling the exception with a message that similar to the import diagnostic be an easy fix? or an error that included the name of the class its in or offset in a class. Then the user can fix their user error and just re-save once removed.

On a related side note, The reclass file I lost was....quite large. Large enough to where Reclass would shutter for a moment every time I named or changed the name of a class as it tried to updated the left side panel. Luckily, I got pretty much everything I needed out of it before losing it, but before closing I should have really taken a moment and looked at the partial save and/or how Reclass saves files. Only minutes after closing I realized it's just a zip and how easy it is to get to the xml. Where you see exactly where the stream ends/errors out and could have easily just gone to that class, fix it and saved. Oh well, lessons learned I guess. Though hopefully it helps someone at some point to know if they manage to break their own file, its easy to fix before closing Reclass.

KN4CK3R commented 7 years ago

I could filter the user input for name and comment fields so a user can't use invalid characters. This will not prevent plugin authors to use invalid characters but they should know what they are doing.

An other possible solution would be to use an other file format. JSON doesn't have problems with the character. But that would make the old ReClass.NET files incompatible.

{/*ReClass.NET by KN4CK3R*/
  "reclass": {
    "@version": "1",
    "@type": "x86",
    "classes": {
      "class": {
        "@uuid": "/7OHcbDt60uv1SLeUxpzEw==",
        "@name": "N0000000A\u001f", /* Here is your \x1F character */
        "@comment": "",
        "@address": "400000",
        "node": [
          {
            "@name": "N0000000B",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N0000000C",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N0000000D",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N0000000E",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N0000000F",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000010",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000011",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000012",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000013",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000014",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000015",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000016",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000017",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000018",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N00000019",
            "@comment": "",
            "@type": "Hex32Node"
          },
          {
            "@name": "N0000001A",
            "@comment": "",
            "@type": "Hex32Node"
          }
        ]
      }
    }
  }
}