Search response returns incorrect number of documents

akolhun commented 2 weeks ago

Describe the bug Search response reports root.fields.totalCount = X, but fewer documents are actually returned

To Reproduce Steps to reproduce the behavior:

1. Launch a Vespa cluster with 12 pods (helm chart attached): helm create vespa -n vespa-privatemedia --create-namespace .
2. Load the Vespa application package (attached)
3. Load the attached data.json via the vespa-feeder CLI tool: vespa-feeder data.json
4. Execute a query as:

curl 'http://localhost:8080/search/' \
--header 'Content-Type: application/json' \
--data '{
"yql": "select * from mp_private_media where site_id contains '\''c5402062-bedf-4e3e-80ad-d668993ed9b2'\'' and state contains '\''trash'\''",
"hits": 100,
"offset": 0
}'

The response contains root.fields.totalCount=54, but only 38 docs are actually returned

Expected behavior Response should contain 54 docs, as root.fields.totalCount claims

Environment (please complete the following information):

OS: Amazon Linux
Infrastructure: Kubernetes
Versions: v1.24.17-eks

Vespa version 8.408.12

Additional context Note: the problem varies with the number of nodes defined in the content cluster. It looks like a distribution-key-related issue

vap_privatemedia.zip vespa_privatemedia_helm.zip data.json.zip

hmusum commented 1 week ago

This could be due to a timeout; see the doc on timeout and the further documentation it points to, e.g. soft timeout

See also documentation about summaries, especially the section on performance

An actual query response could also be helpful. In that case please include "trace.level": 4 in the query
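
For reference, a minimal sketch of what such a traced request could look like. It reuses the query from the report unchanged and only adds "trace.level": 4 to the body. The endpoint (localhost:8080), the QUERY_BODY variable, the run_traced_query helper and the response.json file name are assumptions made for illustration, not something taken from the attached package.

```shell
# Request body from the original report, with "trace.level": 4 added.
# The '\'' sequences embed literal single quotes inside the single-quoted string.
QUERY_BODY='{
  "yql": "select * from mp_private_media where site_id contains '\''c5402062-bedf-4e3e-80ad-d668993ed9b2'\'' and state contains '\''trash'\''",
  "hits": 100,
  "offset": 0,
  "trace.level": 4
}'

# Hypothetical helper: sends the body to the same endpoint used in the report.
# Assumes the Vespa container is reachable on localhost:8080.
run_traced_query() {
  curl -s 'http://localhost:8080/search/' \
    --header 'Content-Type: application/json' \
    --data "$QUERY_BODY"
}

# Uncomment to run the query and keep the full response for attaching to the issue:
# run_traced_query > response.json
```

The saved response would then include the trace alongside root.fields.totalCount and the returned hits, so the two counts can be compared in one place.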

hmusum commented 5 days ago

No feedback, closing

vespa-engine / vespa

Search response returns incorrect number of documents #32424