neslib / Neslib.Xml

Ultra Light-Weight XML Library for Delphi

Problem reading a space value #7

Closed fschenckel closed 3 years ago

fschenckel commented 3 years ago

Hi,

I'm not sure whether to consider this a bug or a feature :-), but a text element that contains only a single space is never read (writing is OK).

Ex: `<?xml version="1.0" encoding="UTF-8"?><MyElement> </MyElement>`

The MyElement text value will always be read as an empty string. I've fixed this by changing line 1015 in Neslib.Xml.IO.pas like this:

` if (P^ > #$1F {#$20}) then`

I understand the check you wanted to do here, and in most cases it's OK, but not if there's a single space. I could not find anything in the XML specs that forbids such a case, so I think the single space should be read?

Regards,
Frédéric
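
To make the effect of that one-character change concrete, here is a minimal sketch of this kind of check. `HasContent` is a hypothetical helper written for illustration and is not code from Neslib.Xml.IO.pas. It assumes the comparison is used to decide whether a run of text contains anything that counts as content.

```pascal
program WhitespaceCheckSketch;

{$APPTYPE CONSOLE}

// Hypothetical helper, not taken from Neslib.Xml: returns True when S
// contains at least one character greater than Threshold.
function HasContent(const S: string; const Threshold: Char): Boolean;
var
  C: Char;
begin
  for C in S do
    if C > Threshold then
      Exit(True);
  Result := False;
end;

begin
  // Threshold #$20: a space does not count as content.
  Writeln(HasContent(' ', #$20));      // returns False, the text is dropped
  Writeln(HasContent(' test', #$20));  // returns True, the text is kept

  // Threshold #$1F: a space now counts as content.
  Writeln(HasContent(' ', #$1F));      // returns True
  // Side effect: a line break followed by indentation spaces also counts.
  Writeln(HasContent(#10'  ', #$1F));  // returns True
end.
```

In this sketch, lowering the threshold keeps the single space, but it also treats a line break followed by indentation spaces as content, while runs made only of tabs and line breaks are still dropped.
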
neslib commented 3 years ago

Hi Frédéric,

This is a design decision. As far as I know, there is no official rule for how whitespace should be handled. Some libraries (like mine) ignore whitespace, while others preserve all whitespace, and still others make it configurable through a property or settings.

The primary goal of my library is to keep the XML document in memory as small as possible. Preserving whitespace would add additional nodes to the tree which is undesirable in most cases (since this is not real content). It could make traversing the tree harder, especially if the single space wasn't originally intended to be inside the XML document.

Maybe I'll add a WhitespaceHandling property or something in the future that you can use to configure how whitespace should be handled. For now, I would like to keep things simple though.

Of course, you are free to modify the library for your own needs or fork it. You could also try to implement this WhitespaceHandling property yourself and create a pull request. In that case, you will have to update the unit tests as well so they work both with and without preserving whitespace.

Hope this helps, Erik

fschenckel commented 3 years ago

Hi Erik,

OK, this is more or less what I expected. The only thing that is not really consistent is that it only happens for a single space. For example, ' test', 'te st' and 'test ' are all retrieved correctly. This is why it's a little disturbing. But of course this is not a big deal!

I don't know the WhitespaceHandling property in detail, but it would probably also need to keep or remove the spaces in these examples... I will take a look at the modifications needed to implement this property.

Thanks!
Frédéric
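
As a rough illustration of what a configurable option along these lines could look like, here is a minimal sketch. Everything in it is an assumption: `TWhitespaceHandling`, its three values, and both helper functions are hypothetical names that do not exist in Neslib.Xml. The `whIgnore` mode models the behaviour described in this thread (whitespace-only text is dropped, other text is kept untouched), while `whTrim` and `whPreserve` show two ways the cases ' test' and 'test ' could be treated.

```pascal
program WhitespaceHandlingSketch;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

type
  // Hypothetical option; Neslib.Xml does not define this type.
  TWhitespaceHandling = (
    whIgnore,    // drop whitespace-only text, keep other text untouched
    whTrim,      // trim all text, so whitespace-only text becomes empty
    whPreserve); // keep every text run exactly as written

// Hypothetical helper: True when S contains a character above #$20.
function HasNonWhitespace(const S: string): Boolean;
var
  C: Char;
begin
  for C in S do
    if C > #$20 then
      Exit(True);
  Result := False;
end;

// Returns the text to store for an element, or '' when nothing is stored.
function ApplyWhitespaceHandling(const S: string;
  const Handling: TWhitespaceHandling): string;
begin
  Result := '';
  case Handling of
    whPreserve:
      Result := S;
    whTrim:
      Result := Trim(S);
    whIgnore:
      if HasNonWhitespace(S) then
        Result := S;
  end;
end;

begin
  Writeln('[', ApplyWhitespaceHandling(' ', whIgnore), ']');       // []
  Writeln('[', ApplyWhitespaceHandling(' test ', whIgnore), ']');  // [ test ]
  Writeln('[', ApplyWhitespaceHandling(' test ', whTrim), ']');    // [test]
  Writeln('[', ApplyWhitespaceHandling(' ', whPreserve), ']');     // [ ]
end.
```

A real implementation would also have to decide what happens to whitespace between elements, which this sketch does not cover.
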