KWARC / rust-libxml

Rust wrapper for libxml2
https://crates.io/crates/libxml
MIT License
76 stars, 38 forks

Improve Error Reporting for Schema Validation #116

Closed by JDSeiler 1 year ago

JDSeiler commented 1 year ago

I have some prior Rust experience, but I'd consider myself an intermediate Rust programmer at best, and I have very little experience with C and FFI. Point being, I'm very open to feedback on these changes.

Motivation

Closes #115

Previously, StructuredError was a thin wrapper around a raw pointer to libxml2's xmlError struct. This was problematic because libxml2 does not allocate a separate xmlError struct for each error. Instead, it uses an effectively global (or per-thread) xmlError that is overwritten in place (it does not move) every time a new error is generated. As a result, every error produced by this library pointed to the same memory and had the same contents: the last error libxml2 produced.
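To make the failure mode concrete, here is a minimal sketch that simulates it in plain Rust, without libxml2. The `RawError` struct and the `pointer_view_vs_owned_copies` function are hypothetical stand-ins invented for illustration; they are not the crate's types or libxml2's API.

```rust
// Hypothetical stand-in for libxml2's xmlError; the real struct has more fields.
struct RawError {
    code: i32,
    message: String,
}

/// Reports three errors into one reused slot. Returns what the retained
/// pointers show afterwards, next to what copying at report time preserved.
fn pointer_view_vs_owned_copies() -> (Vec<(i32, String)>, Vec<(i32, String)>) {
    // One heap slot that is overwritten in place, like the shared xmlError.
    let slot: *mut RawError = Box::into_raw(Box::new(RawError {
        code: 0,
        message: String::new(),
    }));

    let mut retained: Vec<*const RawError> = Vec::new();
    let mut owned: Vec<(i32, String)> = Vec::new();

    for (code, message) in [(1, "first"), (2, "second"), (3, "third")] {
        // SAFETY: `slot` came from Box::into_raw above and has not been freed.
        unsafe {
            (*slot).code = code;
            (*slot).message = message.to_string();
        }
        // Old approach: keep only the pointer, which is the same every time.
        retained.push(slot as *const RawError);
        // New approach: copy the contents while they are still current.
        owned.push((code, message.to_string()));
    }

    let seen: Vec<(i32, String)> = retained
        .iter()
        .map(|&p| {
            // SAFETY: every retained pointer refers to the still-live slot.
            let err: &RawError = unsafe { &*p };
            (err.code, err.message.clone())
        })
        .collect();

    // SAFETY: the allocation is reclaimed exactly once, after all reads.
    unsafe { drop(Box::from_raw(slot)) };

    (seen, owned)
}
```

Calling `pointer_view_vs_owned_copies()` returns two lists: the first holds three copies of the last error, while the second keeps all three errors distinct.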

The following code from libxml2 was consulted to confirm its behavior:

Description of Changes

I replaced the wrapper around the raw pointer with a more traditional Rust struct so that each individual error could be preserved. Because the struct no longer has any ties to the underlying C-managed data once constructed, the Drop implementation was removed.
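As a rough sketch of what such an owned struct can look like: the `RawXmlError` layout below is a hypothetical, trimmed-down mirror of xmlError with only a few fields, and `OwnedError` and its field names are assumptions made for illustration, not necessarily the definitions in this PR.

```rust
use std::ffi::CStr;
use std::os::raw::{c_char, c_int};

/// Hypothetical, trimmed-down mirror of libxml2's xmlError (a few fields only).
#[repr(C)]
struct RawXmlError {
    domain: c_int,
    code: c_int,
    message: *const c_char,
    line: c_int,
}

/// Owned snapshot of one error. It holds no pointers into C-managed memory,
/// so it needs no Drop implementation.
#[derive(Debug, Clone, PartialEq)]
struct OwnedError {
    domain: i32,
    code: i32,
    message: Option<String>,
    line: i32,
}

impl OwnedError {
    /// Copies every field out of `raw` immediately.
    ///
    /// # Safety
    /// `raw.message` must be null or point to a valid NUL-terminated string.
    unsafe fn from_raw(raw: &RawXmlError) -> OwnedError {
        let message = if raw.message.is_null() {
            None
        } else {
            // SAFETY: guaranteed by the caller, see the contract above.
            let c_str = unsafe { CStr::from_ptr(raw.message) };
            // Lossy conversion avoids failing on non-UTF-8 bytes.
            Some(c_str.to_string_lossy().into_owned())
        };
        OwnedError {
            domain: raw.domain,
            code: raw.code,
            message,
            line: raw.line,
        }
    }
}
```

Because the copy happens at construction time, a snapshot taken for one error keeps its values even after the underlying C struct is overwritten by the next error.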

Not all of the fields in the xmlError struct had obvious utility, or even a safe way of managing them. The omitted fields are:

Regarding enums, the xmlErrorLevel field was converted to a normal Rust enum because it's small. The code and domain fields were included in their raw forms because they might be useful to someone. However, the xmlErrorDomain and xmlParserError C enums are 30 and 700+ (!!) members respectively, so it didn't seem wise to write out Rust enum versions of them.
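A minimal sketch of the level conversion, for illustration. The numeric values follow libxml2's xmlErrorLevel (XML_ERR_NONE = 0, XML_ERR_WARNING = 1, XML_ERR_ERROR = 2, XML_ERR_FATAL = 3); the Rust names `ErrorLevel`, `NoError`, and `from_raw` are hypothetical and may differ from what the PR uses.

```rust
/// Rust-side counterpart of libxml2's small xmlErrorLevel enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ErrorLevel {
    NoError, // XML_ERR_NONE
    Warning, // XML_ERR_WARNING
    Error,   // XML_ERR_ERROR
    Fatal,   // XML_ERR_FATAL
}

impl ErrorLevel {
    /// Maps the raw C value to a variant; unknown values yield `None`
    /// instead of panicking or invoking undefined behavior.
    fn from_raw(level: i32) -> Option<ErrorLevel> {
        match level {
            0 => Some(ErrorLevel::NoError),
            1 => Some(ErrorLevel::Warning),
            2 => Some(ErrorLevel::Error),
            3 => Some(ErrorLevel::Fatal),
            _ => None,
        }
    }
}
```

The domain and code values stay plain integers in this sketch, matching the choice above to expose them in raw form rather than mirror the much larger C enums.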

Draft -> Ready for Review Checklist

dginev commented 1 year ago

@JDSeiler thank you for the thorough work in this PR, and thanks to @imcsk8 for the unexpected review. This is already more than the usual PR burden on this wrapper repo, I appreciate the contributions. Maybe it is a sign of the crate starting to mature.

As to the questions by @JDSeiler :


Aside: What actually stuck out to me is libxml's decision to add trailing \n newline chars to the error values. I am a bit uneasy about exposing those from the Rust struct, without first running a .trim() on them, especially if we are allocating a String anyway. But I could see this going either way.
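For illustration, the trimming idea could be as small as the sketch below. The `clean_message` helper is hypothetical, and the sample message in the usage note is made up rather than taken from libxml2 output.

```rust
/// Returns an owned copy of `raw` with surrounding whitespace removed,
/// including the trailing newline appended to error messages.
fn clean_message(raw: &str) -> String {
    raw.trim().to_string()
}
```

For example, `clean_message("Element 'note': not expected.\n")` yields the same text without the trailing newline. Using `trim_end()` instead would keep any leading whitespace, if that ever matters.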

JDSeiler commented 1 year ago

I left a small type suggestion, and will wait for a final nod from @imcsk8 , but we're pretty much good to merge and ship a v0.3.3 here.

Thanks again @JDSeiler

Awesome, thank you (and @imcsk8 !) for feedback on the changes. This was a great learning experience!

imcsk8 commented 1 year ago

... I am more than happy to recruit co-maintainers, and have others speak into the versioning roadmap.

I'll be happy to help with maintaining. What do you need me to do?

dginev commented 1 year ago

@imcsk8 basically what you just did in reviewing here, whenever you have time/interest.

I am currently available enough to ship minor releases, but I have been a little easy-going about spending time on developing the crate. Btw this is also how @triptec got added in here some time back, when he had a season of libxml work.

And since this is currently a care-free open source crate, you can also just sit on the rights without doing anything; that keeps the crate safer in case I go missing for whatever reason.

dginev commented 1 year ago

@imcsk8 and in the interest of speed, I just added you as an admin to the repo (you should get an invite).

Feel free to merge this PR in as a warm-up, if the general setup sounds reasonable enough.