pchampin / sophia_rs

Sophia: a Rust toolkit for RDF and Linked Data
Other
214 stars 23 forks source link

changed in error handling and iterator semantics #27

Closed MattesWhite closed 4 years ago

MattesWhite commented 4 years ago

This PR changes the error-handling of sophia and some semantics of XSource and XSink traits. I know, I shouldn't do two different things in one PR but I couldn't resist ...

Change in error-handling

This PR canges the error-handling in sophia from error-chain and coercible-errors to thiserror and anyhow.

Specific errors

As mentioned in #8 is the error-chain-crate somewhat outdated. I choose the relatively new thiserror as it is very simple and easy to use. In addition the big sophia::error::Error is split into module-errors as suggested by snafu. This will make it easier to split up sophia (#26) and reduces dependencies between the modules. Furthermore, it is someaht more aligned with the philosophie of "each implementation can provide its own error".

Coliding error-types

This was handled by coercible-errors before and is changed to anyhow. It provides anyhow::Error which is "a better Box<dyn Error>". It can be created from anytype that implements std::error::Error, therefore, is suitable when different error-types colide.

As stated in anyhow's docs can anyhow::Error be downcasted to a concrete error-type, so users can still match against concrete types, e.g. done in the parser test.

I don't want to offend coercible-errors. I just think the handling of many different errors will become more and more complicated as the sophia-ecosystem grows as implementers probably provide their own error-types. The complexity has been presented at graph::Graph::retain() which had a bizzar and complicated trait bound and now just returns anyhow::Error.

The merge-everything anyhow::Error is only used when two or more errors are used within one function. Everything else returns a specific error-type. This highlights better where the user has to be aware of error-mixing.

Semantics of Iterators

When "Triples" are mentioned in the following section they can always be substituted with "Quads"

The semantics of TripleSource and TripleSink always felt a bit cumbersome to me. As a result, I changed the semantics of them a bit.

Consume sources

Often Iterators where past in by &mut but Iterators are designed to be consumed by iteration, therefore, it is more idomatic to move them into a function. Accordingly, I added TripleSink::feed_all() method that takes a TripleSource.

The semantic of a TripleSink is also not totaly clear to me. On the one hand there is the 'serializer'-sink that consumes and finishes just like TripleSource's API suggests. On the other hand there are 'graph'-sinks that consume triples as-well but finishing them seems a bit unnatural. As a consequence, I removed the Graph::inserter and Graph::remover methods which are equally substituted by Graph::insert_all and Graph::remove_all.

Create from TripleSource

The builder-pattern for graphs seemed a bit unidomatic for me. Consequently, I removed in_sink and in_graph methods and created the new CollectToGraph-trait which extends the standard Iterator. I think this is more accesible for Rust programmers as its more close to the standard API (cf. getting started example in lib.rs). I tried to implement something like impl FromIterator for Result<Graph> so user could just call collect()? on a TripleSource but this is vorbidden by Rust's orphan rule...

Conclusion

I'm totally aware that these are huge changes and that you might think that some of them are to much. Most of the changes I did are for sake of better ergonomics and a more idomatic Rust. I am open to disucussion and l will gladly change the PR if required.

MattesWhite commented 4 years ago

I see now that this is to much change at once. As a result, I will close this PR.

However, it is still a PoC for an alternative error-handling.

pchampin commented 4 years ago

Hi @MattesWhite, thanks very much for the work you put in this. And thanks for pushing me forward ;-)

This was indeed too big a change. Please make smaller PRs so that we can discuss each change individually.

Also, I have a few commits stalling on my machine (my bad, I should not do that either), which will make it very hard to merge such big changes. I'll push them now. In particular, have a look at the new triple::stream module, which should make it clearer why I am not using iterators.