pchampin / sophia_rs

Sophia: a Rust toolkit for RDF and Linked Data
Other
210 stars 23 forks source link

Implement `std::iter::Extend` for provided `Graph`/`Dataset` implementations #138

Closed damooo closed 9 months ago

damooo commented 11 months ago

std::iter::Extend allows to extend collections with iterators / streams that produce same type of items.

Though there are CollectableGraph and CollectableDataset, they won't work with methods provided by futures::stream::TryStreamExt etc to consume from non blocking streams.

Currently DatasetAsGraph, UnionGraph, FastGraph etc implementations are painful to construct from async streams. while HashSet/Vec are trivial as they implement extend.

pchampin commented 11 months ago

std::iter::Extend allows to extend collections with iterators / streams that produce same type of items.

There is an insert_all method in MutableGraph and MutableDataset that plays a similar role.

Some implementations of Graph and Dataset already implement Extend (namely those based on Vec and HashMap). It would be possible to make other implementations (e.g. inmem::FastGraph) implement Extend, but generally, insert_all is more flexible (it works with any impl of Triple).

Do you have a use-case where using Extend::extend would be superior to using Graph::insert_all?

Though there are CollectableGraph and CollectableDataset, they won't work with methods provided by futures::stream::TryStreamExt etc to consume from non blocking streams.

As discussed in #134, Sophia is currently focused on "sync" Rust. I suggest we continue discussing this on #134.

Currently DatasetAsGraph, UnionGraph, FastGraph etc implementations are painful to construct from async streams. while HashSet/Vec are trivial as they implement extend.

I don't get the link between async streams and the Extend trait (which is about sync iterators!).

Note that you should not have to create DatasetAsGraphs nor UnionGraphs "manually":

damooo commented 11 months ago

Do you have a use-case where using Extend::extend would be superior to using Graph::insert_all?

I don't get the link between async streams and the Extend trait (which is about sync iterators!).

TryStreamExt::try_collect

StreamExt::collect

damooo commented 11 months ago

DatasetAsGraphs are constructed via the Dataset::graph and Dataset::graph_mut methods

Yes, i tried them first. But the method signature require bounds of T: for<'x> Term<BorrowTerm<'x> = DTerm<'x, Self>> + 'static, for T in GraphName, which was not possible in my case with few generic interactions.

It gets sometimes complicated due to GATs and lack of functionality to constrain a GAT value, as in this post, and this rfc

damooo commented 11 months ago

Sorry for spamming this issue, it would also be great to have DatasetGraph to implement SetGraph when underlying Dataset implements SetDataset.

pchampin commented 10 months ago

Sorry for spamming this issue, it would also be great to have DatasetGraph to implement SetGraph when underlying Dataset implements SetDataset.

Very good point. Could you please open a dedicated issue for this? And/or even better, a PR ? :wink:

pchampin commented 10 months ago

DatasetAsGraphs are constructed via the Dataset::graph and Dataset::graph_mut methods

Yes, i tried them first. But the method signature require bounds of T: for<'x> Term<BorrowTerm<'x> = DTerm<'x, Self>> + 'static, for T in GraphName, which was not possible in my case with few generic interactions.

If you need a graph with a different type of Terms than the underlying dataset, then you need to copy the corresponding data in a separate graph (e.g. a Vec<[Term; 3]> or a FastGraph), you should not try using DatasetAsGraph for this.

But granted, this is not clear from the documentation of DatasetAsGraph, which should be improved in that line.

pchampin commented 10 months ago

Are you ok to close this issue (opening new ones for the most specific questions above, if needed)?

damooo commented 9 months ago

Sorry for late response. I am closing the issue.

And my laptop is of 4GB ram, and developing rust projects is not practical. I have to use free credits from github codespaces. Currently i am using them for Manas Soid Server. Hence cannot send a PR.