The HSN contains duplicate triples across multiple graphs. For example, the triple person1 a hsn:OP has been created from multiple tables, so it exists in multiple graphs (2-4, in this case). This causes problems when querying: either the graphs have to be specified explicitly, or the duplicates have to be removed afterwards with DISTINCT, which makes the query inefficient.
At a minimum, all time-invariant observations that can be linked directly to a person should be unique. The issue may also exist in time-varying observations.
The easiest way to catch all of them is probably to deduplicate all the N-Quads on subject-predicate-object (s-p-o), ignoring the graph label.
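A minimal sketch of what s-p-o deduplication could look like, using only the Python standard library. It is not the project's actual pipeline. It assumes one statement per line in plain N-Quads syntax, and the function name `dedup_spo` and the regular expressions are illustrative choices. The first occurrence of each s-p-o is kept together with its graph label, so which graph "wins" depends on input order.

```python
import re

# Simplified N-Quads term patterns (assumption: well-formed, one statement per line).
IRI = r"<[^>]*>"
BNODE = r"_:\S+"
LITERAL = r'"(?:[^"\\]|\\.)*"(?:\^\^<[^>]*>|@[A-Za-z0-9-]+)?'

# subject predicate object [graph] .
QUAD = re.compile(
    rf"^\s*({IRI}|{BNODE})\s+({IRI})\s+({IRI}|{BNODE}|{LITERAL})"
    rf"(?:\s+({IRI}|{BNODE}))?\s*\.\s*$"
)


def dedup_spo(lines):
    """Yield each statement once per (s, p, o), ignoring the graph label.

    The first statement seen for a given s-p-o is kept unchanged,
    including its graph label; later copies in other graphs are dropped.
    """
    seen = set()
    for line in lines:
        stripped = line.strip()
        # Skip blank lines and comments.
        if not stripped or stripped.startswith("#"):
            continue
        match = QUAD.match(stripped)
        if match is None:
            raise ValueError(f"Cannot parse N-Quads line: {stripped!r}")
        spo = match.group(1, 2, 3)
        if spo in seen:
            continue
        seen.add(spo)
        yield stripped
```

This could be applied by opening the N-Quads dump, passing the file object to `dedup_spo`, and writing the yielded lines to a new file. The `seen` set holds every distinct s-p-o in memory, so for a very large dump an external sort on the first three terms may be more practical. If the graph label of the surviving statement matters (for provenance, say), the choice of which copy to keep would need an explicit rule rather than input order.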