pchampin / sophia_rs

Sophia: a Rust toolkit for RDF and Linked Data

Work-in-progress RDF/XML parser #9

Closed althonos closed 5 years ago

althonos commented 5 years ago

Hi!

This is a work-in-progress branch, so it's not ready to merge right now, but this is how far I am currently with an RDF/XML parser. It correctly parses almost all of the RDF/XML 1.1 examples, and I still need to add the tests from rdf-test.

I'll also feature-gate the parser behind an XML feature since it requires an additional dependency (quick-xml).
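As a rough illustration of what such feature-gating usually looks like in Rust (a sketch, not the code in this branch): the feature name `xml`, the module name, and the helper `has_xml_parser` are all assumptions made for the example. The crate's `Cargo.toml` would declare quick-xml as an optional dependency and list it under that feature.

```rust
/// Only compiled when the (assumed) `xml` Cargo feature is enabled,
/// so the quick-xml dependency is pulled in only on request.
#[cfg(feature = "xml")]
pub mod xml {
    // The quick-xml based RDF/XML parser would live here.
    // This constant is a hypothetical placeholder for the example.
    pub const PARSER_NAME: &str = "rdf-xml";
}

/// Hypothetical helper: reports whether this build includes the parser.
/// `cfg!` evaluates to a compile-time boolean for the given condition.
pub fn has_xml_parser() -> bool {
    cfg!(feature = "xml")
}
```

Users who do not need RDF/XML then pay nothing for it, and those who do would enable the feature in their own dependency declaration.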

Missing features

althonos commented 5 years ago

cc @phillord if you want to see where this is going

Tpt commented 5 years ago

Just to make sure you are aware: rudf provides a simple RDF/XML parser: https://github.com/Tpt/rudf/blob/master/lib/src/rio/xml.rs

phillord commented 5 years ago

@Tpt Also got this now https://github.com/phillord/raptor-rs

althonos commented 5 years ago

@Tpt: mine, however, supports parseType="Collection" :wink:

althonos commented 5 years ago

Finally, this is feature-complete! I still need to do a bit of refactoring, in particular to reduce code duplication and complexity, but this version behaves correctly against the RDF/XML test suite, including the negative tests (it fails where a failure is expected). The only exception is the parseType="Literal" feature, which is not supported by the underlying quick-xml library, but I opened an issue there to request it.

Streaming the Gene Ontology (go.owl), i.e. iterating over the produced triples and unwrapping each result, takes about 5 to 10 seconds on my machine, but there are probably still some optimisations to be made.
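For readers unfamiliar with the pattern, "iterating over the produced triples and unwrapping the result" can be sketched as follows. This is only an illustration under assumptions: `Triple`, `ParseError`, and `count_triples` are hypothetical stand-ins, not Sophia's actual types or API.

```rust
/// Hypothetical triple type standing in for whatever the parser yields.
type Triple = (String, String, String);

/// Hypothetical error type for a failed parse step.
#[allow(dead_code)]
#[derive(Debug)]
struct ParseError(String);

/// Drains a stream of parse results, unwrapping each one, and returns
/// the number of triples seen. Panics on the first parse error, which
/// is acceptable for a quick benchmark but not for production code.
fn count_triples<I>(results: I) -> usize
where
    I: Iterator<Item = Result<Triple, ParseError>>,
{
    let mut count = 0;
    for result in results {
        let _triple = result.unwrap();
        count += 1;
    }
    count
}
```

Because the triples are consumed one at a time and never collected, memory use stays flat however large the input file is.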

pchampin commented 5 years ago

I know that last commit cost you :wink:. Much appreciated. I'll take over to please Travis and do the merge. Thanks again!

althonos commented 5 years ago

@pchampin: my only regret is not being able to add better error reports. I should have used xmlparser instead of quick-xml, to get input spans and better error reporting. I may experiment to see how easy it is to replace it! :wink: