ballsteve / xrust

XPath, XQuery, and XSLT for Rust
Apache License 2.0

Unrecoverable parser error while parsing (valid) XML #94

Open mickdekkers opened 1 month ago

mickdekkers commented 1 month ago

Hi,

I encountered an error when attempting to parse this XML file: https://github.com/mickdekkers/xrust-parse-issue-repro/blob/main/data/content.xml

As far as I can tell, the file is valid XML. I'm currently migrating my project over to the xrust crate for the XPath features (thanks, by the way!). My project previously used xot and roxmltree, and both of those crates parse this XML file without issue. The online XML validators I found also confirm the file is valid.

I prepared a repo with a minimal reproducible example: https://github.com/mickdekkers/xrust-parse-issue-repro

The error occurs on both the main and dev branches. The repro points to dev.

Would appreciate it if you could take a look! 😄

Thanks!

mickdekkers commented 1 month ago

Hmm, strange... it doesn't seem to be the specific XML file that's the issue here. If I replace the XML file contents in the repro with that of https://github.com/ballsteve/xrust/blob/main/examples/issue-30.xml, I still get the error.

At first glance, my repro code doesn't look meaningfully different from the relevant part of the example code either, so I'm a bit lost 😅 https://github.com/ballsteve/xrust/blob/313aafd523b1f9fbfdfc4b961e6d7fe6b61c5db3/examples/issue-30.rs#L64-L66

Devasta commented 1 month ago

Hi @mickdekkers,

Thank you for the bug report! I'm looking into this now.

ballsteve commented 1 month ago

Hi Mick,

I'll take a look at the problem, and will also tag Daniel Murphy on it.

Cheers, Steve

Devasta commented 1 month ago

The parser is tripping on a byte order mark in the file before the XML declaration.

I'll get a fix in for this in the next release.
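
Until that fix lands, one possible caller-side workaround is to strip the byte order mark before handing the string to the parser. This is only a minimal sketch, assuming the file has already been read into a UTF-8 `&str`; `strip_bom` is a hypothetical helper name and not part of xrust.

```rust
/// Hypothetical helper (not part of xrust): return `input` without a
/// leading byte order mark, or unchanged if there is none.
fn strip_bom(input: &str) -> &str {
    // In a Rust string the BOM is the single character U+FEFF
    // (bytes EF BB BF in UTF-8). Only a leading occurrence is removed.
    input.strip_prefix('\u{FEFF}').unwrap_or(input)
}
```

The result of `strip_bom(&contents)` would then be passed to the parser in place of the raw file contents. Because it borrows from the input, no extra allocation is needed.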