Closed 107dipan closed 7 months ago
From your description I'm assuming this is the behavior you would like to observe (please correct me if I'm wrong):
Unfortunately, this is not the behavior you will observe in practice, as Vespa is not a multi-version store. Visits with timestamp ranges return documents that were last modified within the given timestamp range. Deletion is considered a modification.
Some details as to why this is the case:
Vespa's internal data model is logically[^1] a mapping of any stored document $d$ from document ID ${\cal I}_d \mapsto {\langle T, {\cal S} \rangle}_d$ where $T_d$ is the wall-clock timestamp of the most recent mutation to that document and ${\cal S}_d$ is the current document state, which is either a set of populated document fields or a tombstone sentinel $\cal T$.
Since only the most recent mapping is retained, this means that deleting document $\cal D$ in this case transitions the internal state from ${\cal I_D} \mapsto \langle 100, {\cal S_D} \rangle$ to ${\cal I_D} \mapsto \langle 200, {\cal T} \rangle$. The knowledge of any prior version(s) is immediately garbage-collected from the system and can therefore not be returned by a timestamp range visit.
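To make the consequence concrete, here is a minimal Python sketch of the logical model described above. It is a toy illustration only, not Vespa's implementation or API: the class `SingleVersionStore`, its methods, the inclusive range bounds, and the document ID string are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Stands in for the tombstone sentinel T from the text (illustrative choice).
TOMBSTONE = None


@dataclass
class Entry:
    timestamp: int          # T_d: timestamp of the most recent mutation
    state: Optional[dict]   # S_d: populated fields, or TOMBSTONE


class SingleVersionStore:
    """Toy model: each document ID maps to exactly one <T, S> pair."""

    def __init__(self) -> None:
        self._entries: Dict[str, Entry] = {}

    def put(self, doc_id: str, timestamp: int, fields: dict) -> None:
        # Overwrites any earlier mapping; no prior version is kept.
        self._entries[doc_id] = Entry(timestamp, dict(fields))

    def remove(self, doc_id: str, timestamp: int) -> None:
        # A delete is a modification: it replaces the entry with a tombstone.
        self._entries[doc_id] = Entry(timestamp, TOMBSTONE)

    def visit(self, from_ts: int, to_ts: int) -> List[Tuple[str, Entry]]:
        # Returns entries whose *latest* timestamp is in the range.
        # Inclusive bounds are an assumption of this sketch.
        return [
            (doc_id, entry)
            for doc_id, entry in self._entries.items()
            if from_ts <= entry.timestamp <= to_ts
        ]


store = SingleVersionStore()
store.put("id:ns:type::D", 100, {"title": "hello"})  # state <100, S_D>
store.remove("id:ns:type::D", 200)                   # state <200, T>

early = store.visit(50, 150)   # range covering the original feed time
late = store.visit(150, 250)   # range covering the delete time
```

In this model, `early` is empty because the entry written at timestamp 100 no longer exists, while `late` contains only the tombstone for the document ID, with no field values.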
[^1]: In the real world, where data may be inconsistent across replicas, Vespa performs on-demand write-repair and read-repair to maintain the illusion of such a logical mapping.
Hi @vekterli, we are using a selection criterion that is a range query on a timestamp added by our web server. For now, as a workaround to fetch the deleted documents, we are using the selection criterion "not schemaName.uniqueId", where uniqueId is an ID stamped on all documents by our web server. From our understanding, since Vespa retains tombstones for 2 weeks by default, we should get all the documents that were deleted in that period.
When you refer to "deleted documents" do you mean the actual document contents (i.e. complete with field values), or simply the tombstones?
Documents that have been deleted cannot be retrieved (other than their ID in the form of a tombstone) even when visiting using a timestamp range that covers the original feed time.
Describe the bug
Vespa visit does not return deleted documents when a selection criterion is added.
To Reproduce Steps to reproduce the behavior:
Expected behavior Docs deleted are not returned.
Environment (please complete the following information):
Vespa version 8.294.50