bbcarchdev / spindle

RES Linked Open Data aggregation engine
https://bbcarchdev.github.io/spindle/
Apache License 2.0

Will proxy state changes cause infinite processing loops? #74

Closed simeonvandersteen closed 7 years ago

simeonvandersteen commented 8 years ago

Splitting the workflow as described in #73 seems to work. At this point I'm running into more of a design question. If processing proxies can put other proxies into a "dirty" state because they are related to each other, how can we make sure twine doesn't end up processing in a never-ending loop? If there's no built-in limit of some sort, then theoretically you would only need two triples in a published dataset to get twine stuck in a loop. If there is a limit, how do you determine its value?

nevali commented 8 years ago

It's impossible to determine in advance without building a complete in-memory graph of the entities, but a mitigating factor is that one proxy update cannot trigger a complete update of another, which is why there are different trigger types.
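To illustrate what checking a complete in-memory graph would involve, here is a minimal sketch in Python. It is not how spindle or twine work internally: the edge model ("updating one proxy marks another dirty") and the name `find_cycle` are assumptions made purely for illustration, and the sketch ignores the distinction between trigger types.

```python
def find_cycle(edges):
    """Return one cycle as a list of nodes, or None if the graph is acyclic.

    `edges` is an iterable of (source, target) pairs, read here as
    "updating source marks target dirty" (a hypothetical model).
    """
    graph = {}
    for source, target in edges:
        graph.setdefault(source, []).append(target)

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path, so we found a cycle.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                found = visit(nxt)
                if found:
                    return found
        path.pop()
        colour[node] = BLACK
        return None

    for node in graph:
        if colour.get(node, WHITE) == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Two mutually referencing proxies are enough to form a cycle.
print(find_cycle([("A", "B"), ("B", "A")]))  # ['A', 'B', 'A']
print(find_cycle([("A", "B"), ("B", "C")]))  # None
```

The recursive depth-first search is fine for a sketch, but a graph covering every entity in a large dataset would need an iterative version and a lot of memory, which is the cost referred to above.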

In theory, it may be possible to construct data which causes this to happen (e.g., two items which claim to be licence documents for each other, or something of that ilk), but it'd take some effort. It's worth investigating further to determine whether there's a way of detecting that it's happened (e.g., tracking that a proxy is being updated especially frequently).
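As a rough sketch of the detection idea (tracking proxies that are updated especially frequently), something like the following could work. This is Python for illustration only, not spindle code: the class name `UpdateFrequencyMonitor`, the threshold, and the window length are all hypothetical, and choosing sensible values would need measurement against real data.

```python
from collections import defaultdict, deque


class UpdateFrequencyMonitor:
    """Flag proxies updated more than `max_updates` times within a sliding window."""

    def __init__(self, max_updates, window_seconds):
        self.max_updates = max_updates
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # proxy id -> recent update times

    def record(self, proxy_id, now):
        """Record an update at time `now`; return True if the proxy looks suspicious."""
        times = self._history[proxy_id]
        times.append(now)
        # Drop timestamps that have fallen out of the window.
        while times and now - times[0] > self.window_seconds:
            times.popleft()
        return len(times) > self.max_updates


# Hypothetical limits: more than 3 updates within 60 seconds is suspicious.
monitor = UpdateFrequencyMonitor(max_updates=3, window_seconds=60)
flags = [monitor.record("proxy-1", t) for t in (0, 1, 2, 3)]
print(flags)  # [False, False, False, True]
```

Timestamps are passed in explicitly so the behaviour is easy to test; a caller would typically supply a monotonic clock reading. A flagged proxy is only a hint that a loop may exist, since a legitimately busy proxy would trip the same check.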