vespa-engine / vespa

AI + Data, online. https://vespa.ai
Apache License 2.0

Namespace/doctype-ignorant visit in /document/v1 #24370

Closed: kkraune closed this issue 1 year ago

kkraune commented 1 year ago

Clients can use any namespace when feeding documents. To access those documents through /document/v1, that namespace must be known.

This is most often a problem for new users, or for users working on a new installation with little prior knowledge.

It is possible to dump documents, or just their IDs, without knowing the namespace or doctype, like this:

$ docker exec vespa /opt/vespa/bin/vespa-visit -i
id:mynamespace:music::love-id-here-to-stay (Last modified at 0)
id:mynamespace:music::a-head-full-of-dreams (Last modified at 0)
id:mynamespace:music::hardwired-to-self-destruct (Last modified at 0)

With this, I quickly learn both the namespace and the doctype.

When querying, one can query any doctype in YQL, and the namespace is not used in the query API.

I suggest adding a feature to /document/v1, such as a wildcard for the two or a new path, so that one can explore the data with no prior knowledge, as with vespa-visit.
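
As a rough sketch of that exploration step, the ID lines printed by vespa-visit -i can be reduced to the distinct namespace/doctype pairs. The helper name `list_ns_doctypes` is hypothetical and not part of Vespa; it only assumes the `id:<namespace>:<doctype>:...` layout visible in the output above.

```shell
# Hypothetical helper (not a Vespa tool): reads document ID lines on stdin
# and prints each distinct "<namespace> <doctype>" pair once.
list_ns_doctypes() {
  # Split on ':' and keep only lines whose first field is "id".
  awk -F: '$1 == "id" { print $2, $3 }' | sort -u
}

# Sample input copied from the vespa-visit output above.
printf '%s\n' \
  'id:mynamespace:music::love-id-here-to-stay (Last modified at 0)' \
  'id:mynamespace:music::a-head-full-of-dreams (Last modified at 0)' \
  | list_ns_doctypes
```

Against a running container, the same helper could presumably be fed directly: `docker exec vespa /opt/vespa/bin/vespa-visit -i | list_ns_doctypes`.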

vekterli commented 1 year ago

It's possible to visit all documents across all document types in a given cluster through the /document/v1/ path directly (i.e. no namespace/doc type), though this requires specifying the target content cluster via the cluster request parameter. Further constraints can be added via the selection parameter, which takes any valid document selection string (such as simply a document type).

Today's ordering /document/v1/<namespace>/<doctype>/ should ideally be the inverse, i.e. /document/v1/<doctype>/<namespace> since that allows for namespace-agnostic visiting for a given document type, which is far more common than the other way around. But this cannot readily be changed in V1. It's high up on the TODO list for Document V2.

kkraune commented 1 year ago

Right! It is even documented:

curl http://hostname:8080/document/v1/?cluster=mycluster
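
Building on that request, here is a minimal sketch of how the cluster and selection request parameters mentioned earlier in the thread might be combined. The helper name `visit_url` is hypothetical; hostname, port and mycluster are the placeholders from the example above, and music is the doctype seen in the vespa-visit output. The sketch only builds the URL and does not URL-encode the selection.

```shell
# Hypothetical helper: prints a /document/v1 visit URL for a content cluster,
# optionally restricted by a document selection (assumed already URL-safe).
visit_url() {
  base="$1"
  cluster="$2"
  selection="${3:-}"
  url="${base}/document/v1/?cluster=${cluster}"
  if [ -n "$selection" ]; then
    url="${url}&selection=${selection}"
  fi
  printf '%s\n' "$url"
}

# Visit all documents of type "music" in cluster "mycluster", any namespace.
visit_url "http://hostname:8080" "mycluster" "music"
```

The result can be passed to curl, for example `curl "$(visit_url http://hostname:8080 mycluster music)"`. Selections containing spaces or operators would need URL-encoding first, which this sketch leaves out.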

I will de-clutter the documentation at https://docs.vespa.ai/en/document-v1-api-guide.html and add some troubleshooting and better examples.

kkraune commented 1 year ago

Doc improved in https://github.com/vespa-engine/documentation/pull/2325