antoniogarrote / rdfstore-js

JS RDF store with SPARQL support
MIT License

Question about deleting #96

Open johlrogge opened 9 years ago

johlrogge commented 9 years ago

I have been trying out my very limited SPARQL skills on a persistent store, with no luck.

I have a graph of owl:thing resources, each of which can be myschema:containedIn another owl:thing. Most things have a foaf:depiction pointing to a foaf:image.

My problem now is that I want to delete thing A and all triples related to A in any direction.

I am not sure what is implemented and how I would express such a thing.

I have tried the following:

'PREFIX h: <http://dishoarder.com/hoard/1.0> \
DELETE {<'+context.uri+'> h:containedIn ?s} \
WHERE { \
  <'+context.uri+'> h:containedIn ?s . \
}'

And all I manage to do with that is delete everything in my graph :)

I'm sure it is my DELETE WHERE that is messed up somehow but it would be nice to be pointed in the right direction. Can I use the high level API delete somehow here? What is the best way to approach this?
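
For reference, a minimal sketch of how "delete A in any direction" could be written in standard SPARQL 1.1 Update: one statement for triples where A is the subject and one for triples where A is the object. The helper name buildDeleteQueries is hypothetical, and whether the persistent backend handles these updates correctly is exactly what this thread is about, so treat it as an illustration rather than a confirmed fix.

```javascript
// Hypothetical helper (not part of rdfstore-js): builds two SPARQL 1.1 Update
// strings that together remove every triple mentioning `uri`,
// either as subject or as object.
function buildDeleteQueries(uri) {
  // Reject characters that are not allowed inside an IRI reference <...>
  if (/[<>"{}|^`\\\s]/.test(uri)) {
    throw new Error('Invalid IRI: ' + uri);
  }
  var node = '<' + uri + '>';
  return [
    // Outgoing edges: the resource is the subject.
    'DELETE { ' + node + ' ?p ?o } WHERE { ' + node + ' ?p ?o }',
    // Incoming edges: the resource is the object.
    'DELETE { ?s ?p ' + node + ' } WHERE { ?s ?p ' + node + ' }'
  ];
}
```

Each returned string would be passed to store.execute in turn, starting the second call from the callback of the first so the updates do not overlap.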

johlrogge commented 9 years ago

I tried a CONSTRUCT query, passing the resulting graph to the store.delete method. I inspected the graph and it contained the triples I expected. However, when I passed the graph to delete, all triples were removed.

I then tried to create the storage with persistent:false and then the delete works. It seems like delete does not work with the persistent store right now.

johlrogge commented 9 years ago

Looking in Chrome, I can see that the store is not emptied, but it seems that no nodes can be reached from my root node. The IndexedDB store does not behave the same way as the inMemoryStore for me...

johlrogge commented 9 years ago

I have made a small example that shows one of the problems I've been having:

window.onload = function main() {
    rdfstore.create({persistent:true, name:'test',clear:true}, function(err, store) {
        store.registerDefaultProfileNamespaces();
        store.setPrefix('h', 'http://example.com/hoard/1.0/');

        var insert = 'INSERT DATA { \
<http://example.com/things/root> a owl:thing; \
foaf:depiction <http://example.com/images/someimage.jpg> . \
<http://example.com/images/someimage.jpg> foaf:thumbnail <http://example.com/images/thumbs/someimage.jpg> . \
}';

        var insert2 = 'INSERT DATA { \
<http://example.com/things/child1> a owl:thing; \
foaf:depiction <http://example.com/images/someimage2.jpg>; \
h:containedIn <http://example.com/root> \
. \
<http://example.com/images/someimage2.jpg> foaf:thumbnail <http://example.com/images/thumbs/someimage2.jpg> . \
}';

        var construct = 'CONSTRUCT {<http://example.com/things/child1> ?p ?o} \
WHERE {<http://example.com/things/child1> ?p ?o}';

        console.log("insert ", insert);
        store.execute(insert, function(err, result){
            store.execute(insert2, function(e, r){
                // Report a failure from either insert, not just the first one
                if(err || e) {
                    console.log("error! ", err || e);
                }
                console.log("Seems to work...");
                store.execute(construct, function(e, graph){
                    console.log("graph ", graph);
                    store.delete(graph, function(err){
                        console.log("err? ", err);

                        store.execute('SELECT ?s WHERE  {?s ?p ?o}',
                                      function(err, results){
                                          results.forEach(function(triple){
                                              console.log("triple ", triple);
                                          });
                                      });
                    });
                });
            });
        });
    });
};

It gives me

Uncaught TypeError: Cannot read property 'token' of null
    (anonymous function) @ rdf_store.js:24534
    (anonymous function) @ rdf_store.js:23294
    (anonymous function) @ rdf_store.js:29571
    (anonymous function) @ rdf_store.js:29553
    (anonymous function) @ rdf_store.js:29534
    (anonymous function) @ rdf_store.js:29550
    (anonymous function) @ rdf_store.js:29567
    45.Lexicon.retrieve.async.seq.request.onsuccess @ rdf_store.js:23282

The error is thrown by the last SELECT when store.delete(graph, ...) has been called before it.

The constructed graph contains 3 triples as expected [child1 contained in root], [child1 depiction someimage2], [child1 a owl:thing]

I'll be digging some more. I just wanted to post an update with a reproducible error. Perhaps I'm doing something obviously wrong, in which case I would appreciate a pointer.

/J

johlrogge commented 9 years ago

I'm not sure about this, but it seems like shouldIndex should be false when calling normalizeQuad from QueryEngine.prototype._executeQuadDelete. It also seems like the IndexedDB store unregisters URLs prematurely from the Lexicon, but I'm not sure about that yet.

johlrogge commented 9 years ago

A stab in the dark is that the counter attribute on the URL is ignored in (or before) unregister.

johlrogge commented 9 years ago

I have now commented out var request = transaction.objectStore("uris").delete(oid); in Lexicon.prototype._unregisterTerm and my code works (of course, no URIs are deleted any more, which is a flaw).

I guess what should happen is that the count attribute is checked: if count > 1, the URI should be updated with count = count - 1. Otherwise the URI should be deleted.

Does that make sense?
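
Roughly, the idea could be sketched like this. A plain Map stands in for the IndexedDB "uris" object store, and both the entry shape { uri, counter } and the function signature are assumptions made for illustration; this is not the actual code of Lexicon.prototype._unregisterTerm.

```javascript
// Assumption: `uris` is a Map from oid to { uri: string, counter: number },
// an in-memory stand-in for the IndexedDB "uris" object store.
// Returns true only when the URI entry was actually removed.
function unregisterTerm(uris, oid) {
  var entry = uris.get(oid);
  if (!entry) {
    return false; // nothing registered under this id
  }
  if (entry.counter > 1) {
    entry.counter -= 1; // still referenced elsewhere: only decrement
    return false;
  }
  uris.delete(oid); // last reference is gone: drop the URI
  return true;
}
```

This only helps if the counter is incremented once per reference when quads are inserted; if registration over-counts, URIs would never be removed.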

johlrogge commented 9 years ago

ping @antoniogarrote

antoniogarrote commented 9 years ago

Hi @johlrogge, I've been away for some days. I'll look at this asap. Sorry for my late reply.

johlrogge commented 9 years ago

No worries, I have experimented with my idea in my last comment. It seems to do the trick but I'm not totally sure that the counter-field is always correctly incremented. I'm trying to verify this. I could push my changes to my fork tonight so that you can have a look at them and see if they look correct.

johlrogge commented 9 years ago

I pushed my fixes (work in progress) https://github.com/johlrogge/rdfstore-js/commit/7e423fa2970cfdaa570d0517b797d12c71661d04

johlrogge commented 9 years ago

@antoniogarrote I wonder if I have misunderstood the purpose of the counter attribute in persistent_lexicon.js. I thought it was intended for reference counting, to make it possible to determine whether a URI is still referenced in a graph. To debug this, I wrote a function that reads all the counter values and also counts how many references there are to each ID in the persistent quad backend, and the counter is always higher.

I have an idea to do the counting empirically instead: let IndexedDB count the occurrences remaining after removing a quad and, if the count is zero for a URI id, remove that URI too. The downside is that this would require 4 new indices, S, P, O and G (I think), but it would be a more robust way to clean the lexicon (I think).

Any obvious flaws in my thinking?
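
A rough in-memory sketch of that empirical approach, to make the idea concrete. Quads are assumed to be plain objects of term ids { s, p, o, g } and the lexicon a Map keyed by id; the linear scan in countOccurrences stands in for the lookups that the four proposed indices would serve. All names here are hypothetical.

```javascript
// Assumption: quads are plain objects of term ids: { s, p, o, g }.
var POSITIONS = ['s', 'p', 'o', 'g'];

// Counts how often `id` appears in any position of any remaining quad.
function countOccurrences(quads, id) {
  var n = 0;
  quads.forEach(function (quad) {
    POSITIONS.forEach(function (pos) {
      if (quad[pos] === id) { n += 1; }
    });
  });
  return n;
}

// Removes one quad, then drops every term id of that quad that no
// remaining quad uses. Returns the list of ids removed from the lexicon.
function removeQuadAndClean(quads, lexicon, target) {
  var index = quads.findIndex(function (q) {
    return POSITIONS.every(function (pos) { return q[pos] === target[pos]; });
  });
  if (index === -1) { return []; } // quad not stored: nothing to clean
  quads.splice(index, 1);
  var removed = [];
  POSITIONS.forEach(function (pos) {
    var id = target[pos];
    if (lexicon.has(id) && countOccurrences(quads, id) === 0) {
      lexicon.delete(id);
      removed.push(id);
    }
  });
  return removed;
}
```

Because the count is derived from the quads themselves, it cannot drift the way a stored counter can, at the price of up to four extra lookups per term on every delete.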