dbpedia / dbpedia-live-mirror

Keeps a mirror of DBpedia live in sync
GNU General Public License v3.0

Difference between REINSERTED and ADD #13

Closed rubberviscous closed 6 years ago

rubberviscous commented 6 years ago

Hi,

What is the difference between the "reinsert" and "add" operations in the syncing process? I looked at the codebase, in particular the ChangesetExecutor.java file, and the two seem to perform the same SPARQL 'INSERT DATA INTO' operation. My assumption is that 'reinserted' updates a triple statement: if the fact :hasAge "12" exists, a reinserted file with the statement :hasAge "15" will update the literal value of property :hasAge to "15". The reason I'm asking is that I'm trying to repurpose this code to support SPARQL updates for my own personal Virtuoso server.

if (changeset.triplesReinserted() > 0) {
    boolean status_a = executeAction(changeset.getReinserted(), Action.ADD);
    logger.info("Patch " + changeset.getId() + " REINSERTED " + changeset.triplesReinserted() + " resources");
    status = status && status_a;
}
if (changeset.triplesAdded() > 0) {
    boolean status_a = executeAction(changeset.getAdditions(), Action.ADD);
    logger.info("Patch " + changeset.getId() + " ADDED " + changeset.triplesAdded() + " triples");
    status = status && status_a;
}

Can someone clarify whether the two are indeed different operations?

jimkont commented 6 years ago

In the past we experienced some bugs (code, VOS, network, etc.) that, when the application was restarted, left the database in an inconsistent state. For this reason, besides the normal diff (deletions/additions), we have an optional clean update operation that is performed when a resource has been updated more than X (e.g. 5) times. In that case, besides the normal diff, there are two additional steps:

The update frequency and data are handled by the Live framework, and this is where we execute these updates. Other applications that do not need this logic can ignore these triples.
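To make the distinction concrete, here is a minimal in-memory sketch rather than the mirror's actual code. It relies on one fact and one assumption. The fact: an RDF graph is a set of triples, so a plain add (SPARQL INSERT DATA) never overwrites an existing value. The assumption: the "clear" step removes every triple of the affected resource before the "reinserted" triples are added, which this thread does not spell out. The names CleanUpdateSketch, triple, add and clear are hypothetical.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class CleanUpdateSketch {

    /** A triple as a (subject, predicate, object) list; hypothetical stand-in for a real triple type. */
    static List<String> triple(String s, String p, String o) {
        return Arrays.asList(s, p, o);
    }

    /** ADD: the store is a set, so adding triples never replaces existing ones. */
    static void add(Set<List<String>> store, List<List<String>> triples) {
        store.addAll(triples);
    }

    /** CLEAR (assumed semantics): drop every triple whose subject is the given resource. */
    static void clear(Set<List<String>> store, String resource) {
        store.removeIf(t -> t.get(0).equals(resource));
    }

    public static void main(String[] args) {
        Set<List<String>> store = new LinkedHashSet<>();
        add(store, Arrays.asList(triple(":alice", ":hasAge", "12")));

        // A plain add leaves the old literal in place next to the new one.
        add(store, Arrays.asList(triple(":alice", ":hasAge", "15")));
        System.out.println(store.size()); // 2

        // Clean update: clear the resource, then reinsert its current triples.
        clear(store, ":alice");
        add(store, Arrays.asList(triple(":alice", ":hasAge", "15")));
        System.out.println(store.size()); // 1
    }
}
```

Under these assumptions "reinserted" is an ordinary add, and any "update" effect comes from the clear that precedes it. That would match the snippet in the question, where both branches call executeAction with Action.ADD.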

jimkont commented 6 years ago

Is this explanation sufficient? If so, can we close the issue?

rubberviscous commented 6 years ago

Thank you for the explanation, Dimitris! Can you point me to the file(s) where the optional clean update operation is performed when a resource is updated more than X times?

jimkont commented 6 years ago

If you look, for example, at this changeset list: http://live.dbpedia.org/changesets/2018/03/11/17/ the files with the suffixes clear and reinserted are the ones that the code you refer to handles.
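For anyone repurposing the mirror code, a small sketch of telling the changeset files apart by name may help. The file name pattern used here, "<counter>.<kind>.nt.gz" with the kinds added, removed, clear and reinserted, is an assumption based on the suffixes mentioned above, so check it against the actual directory listing. The class ChangesetFileKind and its method kindOf are hypothetical names.

```java
public class ChangesetFileKind {

    enum Kind { ADDED, REMOVED, CLEAR, REINSERTED, UNKNOWN }

    /** Classifies a changeset file by the second dot-separated part of its name (assumed layout). */
    static Kind kindOf(String fileName) {
        String[] parts = fileName.split("\\.");
        if (parts.length < 2) {
            return Kind.UNKNOWN;
        }
        switch (parts[1]) {
            case "added":      return Kind.ADDED;
            case "removed":    return Kind.REMOVED;
            case "clear":      return Kind.CLEAR;
            case "reinserted": return Kind.REINSERTED;
            default:           return Kind.UNKNOWN;
        }
    }

    public static void main(String[] args) {
        // An application that does not need the clean update logic could
        // simply skip the CLEAR and REINSERTED files.
        System.out.println(kindOf("000001.reinserted.nt.gz")); // REINSERTED
        System.out.println(kindOf("000001.added.nt.gz"));      // ADDED
    }
}
```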


rubberviscous commented 6 years ago

Thanks!