vespa-engine / pyvespa

Python API for https://vespa.ai, the open big data serving engine
https://pyvespa.readthedocs.io/
Apache License 2.0

Generate application packages from existing Vespa applications #754

Open Alexander-Mark opened 2 months ago

Alexander-Mark commented 2 months ago

Hi there

As far as I know, an application package is only available to a user if they created it. Is there a way to reflect a package object from an existing Vespa app?

This would be useful to integrate with:

  1. Vespa apps that were not created with pyvespa
  2. Vespa apps that were created by someone else with pyvespa

For example:

from vespa.application import Vespa

# Establish a connection with an existing Vespa app that I didn't create
my_app = Vespa(
    url="https://token-endpoint..z.vespa-app.cloud",
    vespa_cloud_secret_token="vespa_cloud_secret_token"
)

# Dynamically read schemas, docs, types, rank profiles, etc. from an auto-mapped package
my_package = my_app.application_package

Otherwise if there's a way to do this already, please let me know.

bratseth commented 2 months ago

For self-hosted instances? If so you can dump the (single) application package deployed using the deploy/v2 API: https://docs.vespa.ai/en/reference/deploy-rest-api-v2.html#content-get
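A minimal sketch of what dumping files through the deploy/v2 content endpoint might look like from Python, using only the standard library. The config server address, the `session_id` parameter, and the helper names are assumptions for illustration; the URL layout follows the session content path described in the linked reference, so check it against your Vespa version before relying on it.

```python
import json
from urllib.request import urlopen


def content_url(config_server: str, session_id: int, path: str = "") -> str:
    """Build a deploy/v2 content URL for a file or directory in a session.

    Assumption: the session content path from the deploy REST API v2
    reference; `session_id` is whatever session your deployment reports.
    """
    base = config_server.rstrip("/")
    return (
        f"{base}/application/v2/tenant/default/session/"
        f"{session_id}/content/{path.lstrip('/')}"
    )


def fetch(url: str) -> bytes:
    """Download the raw bytes behind a content URL."""
    with urlopen(url) as response:
        return response.read()


def list_directory(url: str) -> list[str]:
    """Assumption: a directory URL (ending in '/') returns a JSON array of URLs."""
    return json.loads(fetch(url))
```

For example, `fetch(content_url("http://localhost:19071", 3, "services.xml"))` would download `services.xml` from session 3 of a config server on its usual port, assuming such a session exists and you have control plane access.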

Alexander-Mark commented 2 months ago

> For self-hosted instances? If so you can dump the (single) application package deployed using the deploy/v2 API: https://docs.vespa.ai/en/reference/deploy-rest-api-v2.html#content-get

OK, yes, that would work for some of our cases. But we would also like to integrate with Vespa apps (Vespa Cloud and remotely hosted) where we have limited data privileges and no control plane access. We can of course look at the schema and services files and manually map them to custom Python objects. But being able to generate an app package would be helpful as a kind of Python ORM for Vespa.

If that's not a planned feature, then you can probably just close this issue. Thanks
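To illustrate the manual mapping mentioned above, here is a deliberately naive sketch that pulls `field <name> type <type>` declarations out of schema text into Python objects. `FieldInfo`, `extract_fields`, and the sample schema are hypothetical and not part of pyvespa. The regex ignores comments, structs, rank profiles, and types that contain spaces, so it is nowhere near a real schema parser.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldInfo:
    """Hypothetical container for one field declaration."""
    name: str
    type: str


# Matches "field <name> type <type>", stopping the type at whitespace or "{".
FIELD_DECLARATION = re.compile(r"\bfield\s+(\w+)\s+type\s+([^\s{]+)")


def extract_fields(schema_text: str) -> list[FieldInfo]:
    """Return every simple field declaration found in the schema text."""
    return [FieldInfo(n, t) for n, t in FIELD_DECLARATION.findall(schema_text)]


SAMPLE_SCHEMA = """
schema music {
    document music {
        field title type string {
            indexing: summary | index
        }
        field year type int {
            indexing: summary | attribute
        }
    }
}
"""

for field in extract_fields(SAMPLE_SCHEMA):
    print(field.name, field.type)
```

This only recovers field names and types; everything else in an application package (services, rank profiles, query profiles, and so on) would need its own handling.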

bratseth commented 2 months ago

You'll need control plane access to get this information; anything else would be a security breach.

Given that, I think the deploy API is a complete solution for self-hosted. For cloud we don't have a public API for this yet, but it's on our roadmap.

kkraune commented 2 months ago

Hi @Alexander-Mark: what you describe is a great idea! It would be awesome to be able to (de)serialize an application package to/from Python objects.

It is, however, a lot of work, as pyvespa is not feature complete. It does not cover all of Vespa's features, so we cannot deserialize an app package into Python objects.

We are keeping this on the backlog for later - thanks for submitting!

Alexander-Mark commented 1 week ago

Hi @kkraune, thanks for the update, that makes sense. I'll watch out for any further updates. Thanks!