KonradHoeffner / hdt

Library for the Header Dictionary Triples (HDT) compression file format for RDF data.
https://crates.io/crates/hdt
MIT License

JOSS Review #29

Closed lazear closed 1 year ago

lazear commented 1 year ago

Hi Konrad,

How do you feel about adding some examples to the paper? I think even just the code snippet (with some examples of triples that would be matched?) from the README would be useful to readers to quickly see how to use the library. This would also beef up the text section of the paper a little bit.

If possible, it would be nice to include this (or similar) as a cargo example (with dev dependencies) printing out some matched triples.

use hdt::{Hdt,HdtGraph};
use hdt::sophia::api::graph::Graph;
use hdt::sophia::api::term::{IriRef, SimpleTerm, matcher::Any};

let file = std::fs::File::open("dbpedia.hdt").expect("error opening file");
let hdt = Hdt::new(std::io::BufReader::new(file)).expect("error loading HDT");
let graph = HdtGraph::new(hdt);
let s = SimpleTerm::Iri(IriRef::new_unchecked("http://dbpedia.org/resource/Leipzig".into()));
let p = SimpleTerm::Iri(IriRef::new_unchecked("http://dbpedia.org/ontology/major".into()));
let majors = graph.triples_matching(Some(s),Some(p),Any);

// Maybe add a couple lines of expected output triples?
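
As a rough illustration of what such expected output could look like, here is a minimal, self-contained sketch of triple pattern matching. It does not use the hdt or sophia APIs: the `Triple` alias, the free function `triples_matching`, the `sample_triples` helper, and all `example.org` IRIs are hypothetical stand-ins, and `None` plays the role that `Any` plays in the snippet above.

```rust
// Hypothetical in-memory stand-in for an HDT graph; NOT the hdt crate's API.
type Triple = (&'static str, &'static str, &'static str);

/// Returns every triple that matches the pattern.
/// `None` is a wildcard, playing the role of `Any` in the README snippet.
fn triples_matching(
    triples: &[Triple],
    s: Option<&str>,
    p: Option<&str>,
    o: Option<&str>,
) -> Vec<Triple> {
    triples
        .iter()
        .copied()
        .filter(|&(ts, tp, to)| {
            s.map_or(true, |x| x == ts)
                && p.map_or(true, |x| x == tp)
                && o.map_or(true, |x| x == to)
        })
        .collect()
}

/// Made-up sample data (placeholder IRIs, not taken from DBpedia).
fn sample_triples() -> Vec<Triple> {
    vec![
        ("http://example.org/CityA", "http://example.org/mayor", "http://example.org/PersonX"),
        ("http://example.org/CityA", "http://example.org/country", "http://example.org/CountryY"),
        ("http://example.org/CityB", "http://example.org/mayor", "http://example.org/PersonZ"),
    ]
}

fn main() {
    let triples = sample_triples();
    // Fixed subject and predicate, wildcard object.
    let matched = triples_matching(
        &triples,
        Some("http://example.org/CityA"),
        Some("http://example.org/mayor"),
        None,
    );
    for (s, p, o) in &matched {
        // prints: <http://example.org/CityA> <http://example.org/mayor> <http://example.org/PersonX> .
        println!("<{s}> <{p}> <{o}> .");
    }
}
```

Printing the matches in an N-Triples-like form, as above, would give readers a concrete idea of what a query returns.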

I think it might also be useful to summarize a few key benchmarking results/statistics in a table - it can be hard to interpret the graphs given the number of items being reproduced.

Last thing - I tried to clone and run your fork of the benchmarking repo, but ran into some compilation errors:

error[E0599]: no method named `triples_matching` found for struct `HdtGraph` in the current scope
   --> src/main.rs:154:16
    |
154 |         1 => g.triples_matching(Any, Some(rdf::type_), Some(dbo_person)),
    |                ^^^^^^^^^^^^^^^^ method not found in `HdtGraph`
    |
   ::: /home/ec2-user/.cargo/registry/src/github.com-1ecc6299db9ec823/sophia_api-0.8.0-alpha.0/src/graph.rs:153:8
    |
153 |     fn triples_matching<'s, S, P, O>(&'s self, sm: S, pm: P, om: O) -> GTripleSource<'s, Self>
    |        ---------------- the method is available for `HdtGraph` here
    |
    = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
    |
1   | use hdt::sophia::sophia_api::graph::Graph;
    |

error[E0599]: no method named `triples_matching` found for struct `HdtGraph` in the current scope
   --> src/main.rs:155:16
    |
155 |         2 => g.triples_matching(Some(dbr_vincent), Any, Any),
    |                ^^^^^^^^^^^^^^^^ method not found in `HdtGraph`
    |
   ::: /home/ec2-user/.cargo/registry/src/github.com-1ecc6299db9ec823/sophia_api-0.8.0-alpha.0/src/graph.rs:153:8
    |
153 |     fn triples_matching<'s, S, P, O>(&'s self, sm: S, pm: P, om: O) -> GTripleSource<'s, Self>
    |        ---------------- the method is available for `HdtGraph` here
    |
    = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
    |
1   | use hdt::sophia::sophia_api::graph::Graph;
    |

error[E0599]: no method named `triples_matching` found for struct `HdtGraph` in the current scope
   --> src/main.rs:156:16
    |
156 |         3 => g.triples_matching(Some(dbr_vincent), Some(rdf::type_), Any),
    |                ^^^^^^^^^^^^^^^^ method not found in `HdtGraph`
    |
   ::: /home/ec2-user/.cargo/registry/src/github.com-1ecc6299db9ec823/sophia_api-0.8.0-alpha.0/src/graph.rs:153:8
    |
153 |     fn triples_matching<'s, S, P, O>(&'s self, sm: S, pm: P, om: O) -> GTripleSource<'s, Self>
    |        ---------------- the method is available for `HdtGraph` here
    |
    = help: items from traits can only be used if the trait is in scope
help: the following trait is implemented but not in scope; perhaps add a `use` for it:
    |
1   | use hdt::sophia::sophia_api::graph::Graph;
    |

osorensen commented 1 year ago

Thanks for posting an issue here @lazear. I'll just mention the JOSS review repository to make it easier to keep track.

https://github.com/openjournals/joss-reviews/issues/5114

KonradHoeffner commented 1 year ago

Hi Michael, thank you for all the suggestions!

To keep the size on crates.io as small as possible, the library only contains a very small test HDT file, so the DBpedia example data is not included, but I will create an example that queries the available data.

KonradHoeffner commented 1 year ago

All issues are addressed now so I will close this issue as completed. Feel free to reopen if I overlooked something.