ixmilia / dxf

MIT License
221 stars 67 forks source link

Invalid XDATA control string leads to partial read of input file #166

Closed stijnherreman closed 3 years ago

stijnherreman commented 3 years ago

I have a file containing a number of entities with invalid XDATA control strings:

...
1001
redacted
1002
{
1040
17.31
1000

1040
0.0
1070
     0
1070
     0
1002
{
  0
INSERT
  5
...

The second control string value should be } instead of {. The reader ends up nesting again and again, ultimately swallowing the remainder of the file and not throwing any exception. The loaded DxfFile is then missing a bunch of objects.

Some other applications, such as AutoCAD and QCAD, appear to ignore this incorrect data and load all objects. This library however is strict in expecting a }:

https://github.com/ixmilia/dxf/blob/7149f0b11384c2dc44a0de1f28e0e322dc11c115/src/IxMilia.Dxf/DxfXData.cs#L328

Do you think there is room for the reader to be less strict? Or is nesting XDATA a valid scenario? It's a tricky situation. An end user reported missing data, but it takes a developer to step through the reading of the file with a debugger.

brettfo commented 3 years ago

Good catch! This is certainly something I didn't see happening. The spec isn't terribly clear on whether XDATA lists can be nested or not. I've decided to air on the side of caution and allow it, but to fix this scenario I've added commit dbd84552bd787859d97cbba88833954c0cef7109 to main that should allow your file to be read. When the second 1002/{ pair is encountered, another XDATA list is started, but the very next 0/INSERT pair causes everything to finalize, so the result of parsing your sample would be (pseudo-json):

{
  "ENTITY": {
    "XDATA": {
      "redacted": [
        { "real": 17.31 },
        { "string": "" },
        { "real": 0.0 },
        { "short": 0 },
        { "short": 0 },
        { "list": [] } // <-- this is an empty list caused by the bad 1002/{ pair
      ]
    }
  }
}
stijnherreman commented 3 years ago

Thanks for the quick update! That seems to do the trick, the file loads correctly now.