instant-labs / instant-xml


Add basic CDATA support #50

Closed djc closed 9 months ago

djc commented 9 months ago

Alternative to #49. This pushes decoding down from the impls into the deserializer context, which allows us to skip decoding for CDATA sections.
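For readers skimming the thread, here is a minimal sketch of the idea, not the code in this PR. It assumes a hypothetical `Text` enum that the tokenizer hands to the context, a hypothetical `text_value()` helper standing in for the context, and a simplified `decode()` that only knows the five predefined entities (no numeric character references). The point is just that the context, not each impl, picks whether to decode:

```rust
use std::borrow::Cow;

/// Hypothetical text node, as a tokenizer might hand it to the context.
enum Text<'a> {
    /// Ordinary character data; may contain entity references.
    Escaped(&'a str),
    /// Contents of a CDATA section; must be taken verbatim.
    CData(&'a str),
}

/// Simplified stand-in decoder: handles only the five predefined entities
/// and borrows the input when there is nothing to decode.
fn decode(input: &str) -> Result<Cow<'_, str>, String> {
    if !input.contains('&') {
        return Ok(Cow::Borrowed(input));
    }
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find('&') {
        out.push_str(&rest[..start]);
        let after = &rest[start + 1..];
        let end = after
            .find(';')
            .ok_or_else(|| "unterminated entity".to_string())?;
        let ch = match &after[..end] {
            "lt" => '<',
            "gt" => '>',
            "amp" => '&',
            "apos" => '\'',
            "quot" => '"',
            other => return Err(format!("unknown entity: {other}")),
        };
        out.push(ch);
        rest = &after[end + 1..];
    }
    out.push_str(rest);
    Ok(Cow::Owned(out))
}

/// The (hypothetical) context decides once whether decoding is needed,
/// so individual impls only ever see already-decoded text.
fn text_value<'a>(text: Text<'a>) -> Result<Cow<'a, str>, String> {
    match text {
        Text::Escaped(s) => decode(s),
        // CDATA content is never entity-decoded, so it can always be borrowed.
        Text::CData(s) => Ok(Cow::Borrowed(s)),
    }
}
```

With this shape, `text_value(Text::Escaped("a &lt; b"))` yields `a < b`, while `text_value(Text::CData("a &lt; b"))` returns the input unchanged and without allocating.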

These changes (probably the refactor of decode() itself) seem to regress performance on the benchmarks for escaped data while improving performance a little bit elsewhere, which I'd like to better understand. Otherwise this is in a decent state.

One other follow-up would be to properly handle concatenated text nodes like foo<![CDATA[bar]]>baz. I'm not sure how common these are in real-world RSS.
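To make the follow-up concrete, here is a rough sketch of what handling such mixed content could look like. It is an assumption-heavy illustration, not a proposal for the actual implementation: `Segment`, `segments()`, `decode_simple()` and `joined_text()` are all hypothetical names, the input is assumed to be pure character content (no child elements, comments or processing instructions), and the decoder is a simplified stand-in that only knows the five predefined entities.

```rust
/// Hypothetical segment of an element's character content.
#[derive(Debug, PartialEq)]
enum Segment<'a> {
    /// Ordinary character data that still needs entity decoding.
    Escaped(&'a str),
    /// Contents of a CDATA section, to be used verbatim.
    CData(&'a str),
}

const CDATA_START: &str = "<![CDATA[";
const CDATA_END: &str = "]]>";

/// Split raw character content into escaped and CDATA segments.
fn segments(mut raw: &str) -> Result<Vec<Segment<'_>>, String> {
    let mut out = Vec::new();
    while let Some(start) = raw.find(CDATA_START) {
        if start > 0 {
            out.push(Segment::Escaped(&raw[..start]));
        }
        let body = &raw[start + CDATA_START.len()..];
        let end = body
            .find(CDATA_END)
            .ok_or_else(|| "unterminated CDATA section".to_string())?;
        out.push(Segment::CData(&body[..end]));
        raw = &body[end + CDATA_END.len()..];
    }
    if !raw.is_empty() {
        out.push(Segment::Escaped(raw));
    }
    Ok(out)
}

/// Simplified stand-in decoder for the five predefined entities.
/// `&amp;` is replaced last so that `&amp;lt;` decodes to `&lt;`, not `<`.
fn decode_simple(s: &str) -> String {
    s.replace("&lt;", "<")
        .replace("&gt;", ">")
        .replace("&apos;", "'")
        .replace("&quot;", "\"")
        .replace("&amp;", "&")
}

/// Concatenate all segments, decoding only the escaped ones.
fn joined_text(raw: &str) -> Result<String, String> {
    let mut out = String::new();
    for segment in segments(raw)? {
        match segment {
            Segment::Escaped(s) => out.push_str(&decode_simple(s)),
            Segment::CData(s) => out.push_str(s),
        }
    }
    Ok(out)
}
```

Under these assumptions, `joined_text("foo<![CDATA[bar]]>baz")` returns `foobarbaz`. Note that the joined result has to be an owned `String` as soon as there is more than one segment, which is the main cost of supporting this case compared to borrowing a single text node.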

djc commented 9 months ago

Dropping the "Keep all decode state in the enum" commit restores performance:

That's a good idea. I thought I needed this commit to make the types work out, but that's not actually true. I've dropped it from this PR for now, and may investigate later whether I can make a version that improves performance.