Open ericdrobinson opened 1 year ago
I'm glad you find this repository helpful. I'll try to address your issue ASAP. You can watch the repo for new changes or star it.
I have just confirmed that this line:
const parseXml = function(xmlData) {
xmlData = xmlData.replace(/\r\n?/g, "\n"); //TODO: remove this line
is responsible for the bug. (Verified by stepping through with a debugger.)
I have also determined that the reason the online tool doesn't exhibit this issue is because the XML string returned by browser node access APIs (used internally by CodeMirror) returns carriage return characters as \\r
instead of \r
.
Depending upon the purpose of the problematic line of code that I identified, this may actually cause a bug in the online tool.
Regardless, this is clearly not operating as expected.
this seems to only happen when trimValues is turned on, at least in my use case, I get a lot more new lines from the text node when I turn it on.
can you check with v5 if it works?
is v5 not available as an npm module? How do I install it and where do I get it?
This issue is also affecting us as we need to parse XML files which do not encode carriage returns. We can't really move to v5 because we are also building XML files which isn't supported yet (I guess we could parse with v5 and build with v4 but that seems a bit cumbersome). I know that replacing all CRLFs or CRs with LFs is in the XML spec, but it would be nice to be able to turn this off (at least within values, between tags newlines are ignored anyway). In the meantime as a workaround we are manually encoding all CRs before calling parse and then decoding them in the tagValueProcessor which may lead to bugs in some strange edge cases.
@yelly , I can understand the trouble you're experiencing. But unfortunately, I'll not be able to do any changes in the code for next 2-3 months. I would take v5 ahead for new changes and build the other components like validator, builder etc for v5. I just wanted to take a feedback on v5 before I do any big change.
will it be fixed in v4?
Description
Carriage Return characters (
\r
) found within parsed XML are incorrectly converted to newline characters (\n
).I did a quick scan of the source code and found a likely culprit:
That pretty clearly replaces any
\r
character (and possibly\r\n
pair) with\n
.The online tool does not exhibit this issue because browser node access APIs encode Carriage Return "strings" (the character
\
followed byr
) as\\r
. The regular expression no longer matches. See:Input
The XML that exhibits this issue is of the form:
Code
The code is pretty straightforward.
After that code runs, the parsed text node content has
\n
instead of the expected\r
.Output
Running the above results in the following JSON:
Expected Data
I expect the following output:
Would you like to work on this issue?