bbcarchdev / spindle

RES Linked Open Data aggregation engine
https://bbcarchdev.github.io/spindle/
Apache License 2.0

source cache is sometimes used when it shouldn't #86

Closed simeonvandersteen closed 8 years ago

simeonvandersteen commented 8 years ago

When re-ingesting data, zero flags are sometimes updated to non-zero flags (via triggers that were already present for the data), so the cache is used instead of the updated data (see spindle_source_fetch_entry in generate/source.c).

The cache should instead be invalidated when necessary (by spindle-correlate, for example) and then used regardless of which flags are set. If the cache turns out to be unavailable, it can always be refreshed from the triple store.
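A minimal sketch of the behaviour proposed above, in C. It assumes a per-entry validity marker that the data-changing component clears. The struct, its fields, and both function names are hypothetical and are not taken from Spindle's source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-entry state; these are not Spindle's real structures. */
struct entry_state {
    bool cache_valid;   /* cleared by whichever component changes the source data */
    bool cache_present; /* whether a cached copy actually exists */
    unsigned flags;     /* update-scope bits; deliberately ignored below */
};

enum source_origin { SOURCE_FROM_CACHE, SOURCE_FROM_TRIPLE_STORE };

/* Called by the component that changes the underlying data
 * (spindle-correlate in the proposal above). */
static void invalidate_cache(struct entry_state *e)
{
    assert(e != NULL);
    e->cache_valid = false;
}

/* Decide where the source data is read from. The flags play no part. */
static enum source_origin choose_source(struct entry_state *e)
{
    assert(e != NULL);
    if (e->cache_valid && e->cache_present) {
        return SOURCE_FROM_CACHE;
    }
    /* Cache is stale or missing: stand-in for re-reading the triple store,
     * after which the cache is usable again. */
    e->cache_present = true;
    e->cache_valid = true;
    return SOURCE_FROM_TRIPLE_STORE;
}
```

With this shape, changing `flags` never affects which source is read; only an explicit invalidation (or a missing cache) forces a refresh.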

nevali commented 8 years ago

zero-flags should never ever be updated to non-zero flags — that's a bug AFAIK.

simeonvandersteen commented 8 years ago

Everything is added with zero flags when the proxies are created (in the state table, that is, not the triggers table), so if they shouldn't be updated there to non-zero flags, that would mean partial updates can't exist.

nevali commented 8 years ago

well, yes, and no — sorry, 'never' wasn't accurate on my part.

it's added with zero flags when correlate adds or changes it (because it's nearly impossible to determine what a partial update should be), but it's added with a status of DIRTY. it's then processed and marked as COMPLETE. a partial trigger would then set the flag bits and mark it back as DIRTY.

however, if the status is already DIRTY and flags is zero, the update shouldn't ever morph into a partial update via setting it to a nonzero flags value.
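The intended transitions could be sketched roughly as follows. This is illustrative C, not Spindle's code: the struct and function names are hypothetical, and clearing the flags when an entry is marked COMPLETE is an assumption that the thread does not confirm.

```c
#include <assert.h>
#include <stddef.h>

enum status { STATUS_DIRTY, STATUS_COMPLETE };

/* Hypothetical row of the state table. */
struct state_row {
    enum status status;
    unsigned flags; /* 0 = full update; non-zero = bits scoping a partial update */
};

/* correlate adds or changes an entry: always a full update. */
static void mark_full_update(struct state_row *r)
{
    assert(r != NULL);
    r->flags = 0;
    r->status = STATUS_DIRTY;
}

/* The entry has been processed. Resetting flags here is an assumption. */
static void mark_complete(struct state_row *r)
{
    assert(r != NULL);
    r->flags = 0;
    r->status = STATUS_COMPLETE;
}

/* A partial trigger fires with a non-zero set of scope bits. */
static void apply_partial_trigger(struct state_row *r, unsigned bits)
{
    assert(r != NULL && bits != 0);
    if (r->status == STATUS_DIRTY && r->flags == 0) {
        /* A full update is already pending: never narrow it to a partial one. */
        return;
    }
    r->flags |= bits;
    r->status = STATUS_DIRTY;
}
```

The guard in `apply_partial_trigger` is the rule described above: a row that is DIRTY with zero flags stays a full update, while a COMPLETE row, or a row already DIRTY with partial bits, accumulates the new bits.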

simeonvandersteen commented 8 years ago

Why do we use the zero flag in the db in the first place? If it means a full update, it should be -1; whatever you then do to the flag bits would never scope down the update.
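To illustrate the suggestion (a sketch of the idea only, not what Spindle does): if "full update" is stored as all bits set, which reads as -1 in a signed column, then OR-ing trigger bits into it cannot narrow the scope. The constant and helper names below are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* All bits set; the same bit pattern as -1 in a signed 32-bit column. */
#define FLAGS_FULL_UPDATE UINT32_MAX

/* Merge trigger bits into the stored flags. */
static uint32_t add_trigger_bits(uint32_t flags, uint32_t bits)
{
    return flags | bits;
}

/* True when the stored flags request a full update. */
static bool is_full_update(uint32_t flags)
{
    return flags == FLAGS_FULL_UPDATE;
}

/* True when the update scope includes every bit in `bits`. */
static bool scope_covers(uint32_t flags, uint32_t bits)
{
    return (flags & bits) == bits;
}
```

Under this encoding, `add_trigger_bits(FLAGS_FULL_UPDATE, x)` is still `FLAGS_FULL_UPDATE` for any `x`, so no special case is needed to protect a pending full update, whereas with 0 as the sentinel the same operation yields a partial scope.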

cgueret commented 8 years ago

Fixed with https://github.com/bbcarchdev/spindle/commit/b306f3fbabf6471eece417086e690e9ae7f50376

cgueret commented 8 years ago

Final fix https://github.com/bbcarchdev/spindle/commit/529c0bed604984991ad954fbfa56fc65bc6337ad

cgueret commented 8 years ago

That fix introduced a regression that causes all updates to be full. The test only updates the flag when it is already non-zero (<> 0), and because entries default to 0, the column never gets updated.