semsol / arc2

ARC RDF Classes for PHP

DROP graph not supported? #62

Closed coreation closed 10 years ago

coreation commented 10 years ago

I currently insert triples into named graphs, which works just fine. However, I run into problems when I perform a "DROP GRAPH" query. Is this type of query supported out of the box (I don't see a DROP statement in the store documentation)? If it is not supported, is there another way to manage these named graphs? If you can't delete them, it looks like a small memory leak to me.

Thanks in advance.

bnowack commented 10 years ago

ARC doesn't support SPARQL 1.1, just a dialect called SPARQL+. The equivalent to DROP GRAPH is "DELETE FROM http://example.com/graph"

coreation commented 10 years ago

Hi @bnowack, thanks for the quick reply. GitHub probably swallowed it when rendering, but the URI is meant to be between < and > symbols, correct?

bnowack commented 10 years ago

Yes, correct. I'll fix the comment escaping.
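
For reference, a minimal sketch of sending that query from PHP. The helper names `buildDeleteGraphQuery` and `deleteGraph` are hypothetical and not part of ARC2; `$store` is assumed to be an ARC2 store object (for example from `ARC2::getStore($config)`), and only its `query()` and `getErrors()` methods are used.

```php
<?php
// Hypothetical helper: builds the SPARQL+ equivalent of DROP GRAPH
// described above, with the graph URI wrapped in angle brackets.
function buildDeleteGraphQuery(string $graphUri): string
{
    // Reject characters that would break out of the <...> delimiters.
    if ($graphUri === '' || preg_match('/[<>\s"]/', $graphUri)) {
        throw new InvalidArgumentException('Invalid graph URI: ' . $graphUri);
    }
    return 'DELETE FROM <' . $graphUri . '>';
}

// Hypothetical helper: runs the query and returns the store's error list.
// $store is assumed to be an ARC2 store (anything with query()/getErrors()).
function deleteGraph($store, string $graphUri): array
{
    $store->query(buildDeleteGraphQuery($graphUri));
    // ARC2 collects problems in an error list instead of throwing.
    return $store->getErrors();
}
```

Calling `deleteGraph($store, 'http://example.com/graph')` would then send `DELETE FROM <http://example.com/graph>` to the store; an empty return value means the store reported no errors.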

coreation commented 10 years ago

Thanks! That answers all of my questions then.

coreation commented 10 years ago

Hm, apparently that doesn't answer all of my questions. I expected the drop or delete to remove my graph and all of its triples as well, but it doesn't seem to do that. Yes, _g2t is cleared out, but the triples remain. How can I make sure all of the triples in the graph I want to delete are also deleted?

Edit: The _triples entries are also gone, so my understanding is that you keep all of the predicates, subjects and objects for optimization purposes (perhaps other triples still need them or are using them). My question thus changes to: is this a memory leak or a contained optimization feature? In other words, if no triple uses a certain stored predicate, subject or object any more, is it eventually deleted from the back-end?

k00ni commented 7 years ago

I ran into the same problem as @coreation: removing the graph via

DELETE FROM <http://foobar/>

left the triples untouched, and all tables except g2t kept their triple data. Is this intentional or a bug?

I tried removing the triples using:

DELETE FROM <http://foobar/> { ?s ?p ?o . } WHERE { ?s ?p ?o . }

or

DELETE FROM <http://foobar/> { ?s ?p ?o . }

Neither of these works either.

coreation commented 7 years ago

I think this repository is not maintained anymore. I'm afraid only PRs are accepted from time to time.

k00ni commented 7 years ago

I implemented a solution in our RDF framework called Saft, which uses ARC2 as a backend. If this is a bug, I could try to fix it in ARC2 directly.

bnowack commented 7 years ago

Hi, keeping the term hashes was intentional, so that ARC re-uses internal IDs after their first creation. IIRC, this had a speed benefit when a store frequently gets cleared and re-populated.

For clean DB tables, I used to do a SPOG export + import in a fresh store.

k00ni commented 7 years ago

"For clean DB tables, I used to do a SPOG export + import in a fresh store."

Would you explain that further, please? How can I clean the tables when dropping a graph?

bnowack commented 7 years ago

Basically, you can't clean the tables through SPARQL+.

There is a $store->createBackup method, which lets you write a backup file to disk (or alternatively the SPARQL+ query DUMP which returns the backup data).

Then you can reset the store ($store->reset()) and re-import the backup file via a LOAD <path/to/backup-file.spog> query. (If I remember correctly, I haven't looked at the code for quite some time...)

The 2nd step ($store->reset()) cannot be done via the endpoint, and it is not graph-specific but clears all tables. In all other cases, keeping the term hashes is/was intentional, although I don't fully remember the reasons ;-)
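
The three steps above could be sketched roughly as follows. This is an untested outline under the same caveat ("if I remember correctly"): the function name `rebuildStoreFromBackup` is hypothetical, `$store` is assumed to be an ARC2 store object, and the check that the backup file exists before resetting is an added safety assumption, not something ARC2 does itself.

```php
<?php
// Hypothetical helper: backup, reset, re-import, as described above.
// $store is assumed to be an ARC2 store object.
function rebuildStoreFromBackup($store, string $backupPath): array
{
    // 1. Write the store contents to a backup file on disk.
    $store->createBackup($backupPath);

    // Added safety assumption: never reset unless the backup was written
    // and the store reported no errors.
    clearstatcache();
    if ($store->getErrors() || !is_file($backupPath)) {
        return $store->getErrors() ?: ['Backup file was not created: ' . $backupPath];
    }

    // 2. Clear all tables. This is not graph-specific and is not
    //    available through the SPARQL endpoint.
    $store->reset();

    // 3. Re-import the backup with a SPARQL+ LOAD query.
    $store->query('LOAD <' . $backupPath . '>');

    return $store->getErrors();
}
```

Because step 2 wipes every graph, a graph that should stay deleted has to be removed with DELETE FROM before the backup is taken, so that it is not part of the re-imported data.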