vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

Streaming queries always return degraded true if you only have a single document in your index #26490

Closed: jobergum closed this issue 1 year ago

jobergum commented 1 year ago

Describe the bug: When using Vespa in streaming mode, every response reports degraded coverage, with "non-ideal-state": true. The coverage metadata also looks incomplete: coverage, nodes, results, and resultsFull are all 0, even though documents is 1 and full is true.

curl "$endpoint/search/?yql=select%20*%20from%20sources%20*%20where%20text%20contains%20%22april%22&streaming.userid=1&" -q -s |python3 -m json.tool
{
    "root": {
        "id": "toplevel",
        "relevance": 1.0,
        "fields": {
            "totalCount": 1
        },
        "coverage": {
            "coverage": 0,
            "documents": 1,
            "degraded": {
                "match-phase": false,
                "timeout": false,
                "adaptive-timeout": false,
                "non-ideal-state": true
            },
            "full": true,
            "nodes": 0,
            "results": 0,
            "resultsFull": 0
        },
        "children": [
            {
                "id": "id:doc:doc:n=1:1",
                "relevance": 0.22853769543241514,
                "source": "text.doc",
                "fields": {
                    "sddocname": "doc",
                    "documentid": "id:doc:doc:n=1:1",
                    "subject": "Instant Rebates on Leupold BX-4 Pro Guide & SX-4 Pro Guide Spotters!",
                    "text": "Through April 2nd, 2023, save $100 on select Leupold BX-4 Pro Guide HD Binoculars or save $200 on SX-4 Pro Guide HD Spotters!",
                    "embedding_vector": [
                        1.0,
                        2.0,
                        3.0,
                        4.0
                    ]
                }
            }
        ]
    }
}
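
The contradiction in the response above can be checked mechanically. The following is a minimal sketch, not part of Vespa: the function name coverage_inconsistencies and the specific checks are assumptions made for illustration, and the sample is a trimmed copy of the coverage object shown above.

```python
import json


def coverage_inconsistencies(response):
    """Return notes on coverage fields that contradict each other.

    Hypothetical helper: the checks mirror the symptoms in this report only.
    """
    coverage = response.get("root", {}).get("coverage", {})
    degraded = coverage.get("degraded", {})
    notes = []

    # "full": true should not be accompanied by any degraded reason.
    if coverage.get("full") and any(degraded.values()):
        reasons = sorted(key for key, flag in degraded.items() if flag)
        notes.append("full is true but degraded flags are set: " + ", ".join(reasons))

    # If documents were searched, the node and result counters should not be 0.
    documents = coverage.get("documents", 0)
    if documents > 0:
        for key in ("coverage", "nodes", "results", "resultsFull"):
            if coverage.get(key) == 0:
                notes.append(f"{key} is 0 although documents is {documents}")

    return notes


# Trimmed copy of the coverage object from the response above.
sample = json.loads("""
{
  "root": {
    "coverage": {
      "coverage": 0,
      "documents": 1,
      "degraded": {
        "match-phase": false,
        "timeout": false,
        "adaptive-timeout": false,
        "non-ideal-state": true
      },
      "full": true,
      "nodes": 0,
      "results": 0,
      "resultsFull": 0
    }
  }
}
""")

for note in coverage_inconsistencies(sample):
    print(note)
```

For the response in this report the helper produces five notes: one for the non-ideal-state flag and one each for coverage, nodes, results, and resultsFull.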

baldersheim commented 1 year ago

Running a system test with streaming does not reproduce this issue; coverage is reported as expected:
"coverage"=>{"coverage"=>100, "documents"=>777, "full"=>true, "nodes"=>1, "results"=>1, "resultsFull"=>1}

baldersheim commented 1 year ago

Reproduced when there is only a single document in the index:
{"root"=>{"id"=>"toplevel", "relevance"=>1.0, "fields"=>{"totalCount"=>1}, "coverage"=>{"coverage"=>0, "documents"=>1, "degraded"=>{"match-phase"=>false, "timeout"=>false, "adaptive-timeout"=>false, "non-ideal-state"=>true}, "full"=>true, "nodes"=>0, "results"=>0, "resultsFull"=>0}, "children"=>[{"id"=>"id:test:test:n=1234:01", "relevance"=>0.0008714596949891067, "source"=>"search.test", "fields"=>{"sddocname"=>"test", "b"=>"23.5", "d"=>"23.92", "documentid"=>"id:test:test:n=1234:01", "num"=>5.7, "str"=>"Here you can read the number 5.8 in a text field.", "arr"=>["five", "5.9", "dot nine"]}}]}}
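
To see which coverage fields differ between the 777-document run and the single-document run, the two coverage objects from the system test can be diffed. This is only an illustrative sketch: diff_coverage is a hypothetical helper, and the two dictionaries are transcribed by hand from the outputs quoted in this thread.

```python
def diff_coverage(expected, observed):
    """Map each differing key to an (expected, observed) pair.

    Hypothetical helper; a key missing on one side is reported as None.
    """
    keys = sorted(set(expected) | set(observed))
    return {
        key: (expected.get(key), observed.get(key))
        for key in keys
        if expected.get(key) != observed.get(key)
    }


# Coverage from the system test with 777 documents (reported as expected).
healthy = {
    "coverage": 100,
    "documents": 777,
    "full": True,
    "nodes": 1,
    "results": 1,
    "resultsFull": 1,
}

# Coverage from the run with a single document in the index.
single_doc = {
    "coverage": 0,
    "documents": 1,
    "degraded": {
        "match-phase": False,
        "timeout": False,
        "adaptive-timeout": False,
        "non-ideal-state": True,
    },
    "full": True,
    "nodes": 0,
    "results": 0,
    "resultsFull": 0,
}

for key, (want, got) in diff_coverage(healthy, single_doc).items():
    print(f"{key}: expected {want!r}, observed {got!r}")
```

Only "full" is identical in both runs. Every other field differs, and the degraded object is present only in the single-document case.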