Open RihanArfan opened 3 months ago
Thank you for creating the issue, @RihanArfan.
I think hubVectorize() makes the most sense. Feel free to open a PR to start the work 😊
While implementing it, I've run into things that require big decisions to be made. I'm also moving this here instead of the PR description so it's clearer.
--remote is necessary to use Vectorize via the deployed application. See https://github.com/cloudflare/workers-sdk/issues/4360.
hubAI().
nuxthub vectorize dev <binding> <create/list/delete/reset>
nuxthub vectorize dev products list
(shows all development indexes for the products binding, including who created each one and an optional description)
nuxthub vectorize dev products use asdfg
Alternatively we just wait for Cloudflare to support local Vectorize bindings in wrangler to save massive complexity and overengineering 😄
nuxthub vectorize reset-dev products
(if one local-dev index exists per binding)
nuxthub vectorize dev products reset
(if multiple local dev indexes are possible)
Wrangler would probably have a way to do this once local development is supported.
The open question is how Vectorize indexes should be managed with NuxtHub across different environments. Unlike other features such as databases, a Vectorize index needs options provided at creation time, based on the text embeddings model the user plans to use. Vectorize indexes also require specifying metadata indexes upfront if you want to use metadata filtering.
Any of these options would be used via hubVectorize(<binding>):
const vectorize = hubVectorize("products")
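Since the dimensions are fixed when the index is created, a mismatch with the embeddings model would only surface when vectors are written. Here is a minimal sketch of a guard for that, assuming a hypothetical `VectorizeIndexOptions` shape that mirrors the `--dimensions` and `--metric` flags used in this thread. Neither the type nor `assertDimensions` is part of NuxtHub or wrangler.

```typescript
// Hypothetical shape mirroring the wrangler flags; not a NuxtHub API.
type Metric = "cosine" | "euclidean" | "dot-product";

interface VectorizeIndexOptions {
  dimensions: number;
  metric: Metric;
}

// Throws if an embedding does not have the number of dimensions
// the index was created with.
function assertDimensions(vector: number[], options: VectorizeIndexOptions): void {
  if (vector.length !== options.dimensions) {
    throw new Error(
      `Expected ${options.dimensions} dimensions, got ${vector.length}`
    );
  }
}

// Options matching the 768-dimension cosine indexes used in this thread.
const productsOptions: VectorizeIndexOptions = { dimensions: 768, metric: "cosine" };

// A 768-element vector passes the check without throwing.
assertDimensions(new Array<number>(768).fill(0), productsOptions);
```

A check like this could run before any upsert, so that switching embeddings models fails early rather than at the Vectorize API.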
Here are some different options I've thought of to handle it.
pnpx wrangler vectorize create foo-ecommerce-products --dimensions=768 --metric=cosine
pnpx wrangler vectorize create foo-ecommerce-products-preview --dimensions=768 --metric=cosine
pnpx wrangler vectorize create foo-ecommerce-products-development --dimensions=768 --metric=cosine
export default defineNuxtConfig({
  hub: {
    // user needs to create metadata indexes via cli
    vectorize: {
      products: {
        production: 'foo-ecommerce-products',
        preview: 'foo-ecommerce-products-preview',
        development: 'foo-ecommerce-products-development'
      },
      reviews: {
        production: 'your-vectorize-id',
        preview: 'your-vectorize-id',
        development: 'your-vectorize-id'
      }
    },
  }
})
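With this option, the binding name passed to hubVectorize() would have to be resolved to an index name for the current environment. A rough sketch of that lookup follows, assuming the config shape shown above. `resolveIndexName` and the types are hypothetical names used for illustration, not existing NuxtHub APIs.

```typescript
type Environment = "production" | "preview" | "development";

// Hypothetical shape of hub.vectorize: binding name -> environment -> index name.
type VectorizeConfig = Record<string, Record<Environment, string>>;

// Looks up the index a binding should use in the given environment.
function resolveIndexName(
  config: VectorizeConfig,
  binding: string,
  env: Environment
): string {
  const entry = config[binding];
  if (!entry) {
    throw new Error(`No Vectorize binding "${binding}" found in hub.vectorize`);
  }
  return entry[env];
}

const vectorizeConfig: VectorizeConfig = {
  products: {
    production: "foo-ecommerce-products",
    preview: "foo-ecommerce-products-preview",
    development: "foo-ecommerce-products-development",
  },
};

const previewIndex = resolveIndexName(vectorizeConfig, "products", "preview");
```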
Specifying index details (dimensions, metric) via nuxt.config.ts. This approach needs extending to add metadata indexes, which are necessary to filter vectors via metadata.
vectorize: {
  // NuxtHub handles creation of the index across environments
  products: {
    metric: 'cosine',
    dimensions: 768,
  },
  // use own vectorize indexes
  reviews: {
    production: 'your-vectorize-id',
    preview: 'your-vectorize-id',
    development: 'your-vectorize-id'
  }
}
The DX might be confusing: changing the config probably shouldn't automatically recreate the production index, since that risks accidental data loss. A potential fix is to keep the old Vectorize index and simply point the binding to a new index.
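The "keep the old index, repoint the binding" idea could be expressed as a planning step that never deletes anything. The sketch below is only an illustration: the `-v<N>` suffix, the `BindingState` shape and `planIndexChange` are all assumptions made for the example, not part of NuxtHub or wrangler.

```typescript
interface IndexOptions {
  dimensions: number;
  metric: string;
}

// Hypothetical record of what a binding currently points to.
interface BindingState {
  indexName: string;
  options: IndexOptions;
  version: number;
}

type Plan =
  | { action: "none" }
  | { action: "create-and-repoint"; newIndexName: string; keepIndex: string };

// Compares the desired config with the current index. When they differ,
// it plans a new index and leaves the old one in place.
function planIndexChange(
  baseName: string,
  current: BindingState,
  desired: IndexOptions
): Plan {
  const unchanged =
    current.options.dimensions === desired.dimensions &&
    current.options.metric === desired.metric;
  if (unchanged) {
    return { action: "none" };
  }
  return {
    action: "create-and-repoint",
    // Assumed naming scheme: append an incrementing version suffix.
    newIndexName: `${baseName}-v${current.version + 1}`,
    keepIndex: current.indexName,
  };
}

const currentState: BindingState = {
  indexName: "foo-ecommerce-products",
  options: { dimensions: 768, metric: "cosine" },
  version: 1,
};

const plan = planIndexChange("foo-ecommerce-products", currentState, {
  dimensions: 1536,
  metric: "cosine",
});
```

Deleting the old index would then remain an explicit, separate action by the user.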
Create, reset and delete indexes via a NuxtHub CLI and/or dashboard. All handled via CLI and backend rather than nuxt.config.ts. On start of the dev server, Nuxt checks what indexes are available. This approach allows manually managing what indexes exist on each environment, including using existing indexes.
export default defineNuxtConfig({
  hub: {
    database: true,
    vectorize: true,
  },
});
$ # vectorize specific - future multi-bindings could have individual things
$ nuxthub vectorize create products --dimensions=768 --metric=cosine
# Done! Binding: PRODUCTS Index: foo-ecommerce-products, foo-ecommerce-products-preview, foo-ecommerce-products-local
# Use via `hubVectorize("products")`
$ nuxthub vectorize list
# Vectorize indexes associated with "foo-ecommerce":
# [PRODUCTS]: # hubVectorize('products')
# foo-ecommerce-products | dimensions: 768 | metric: cosine
# foo-ecommerce-products-preview | dimensions: 768 | metric: cosine
# foo-ecommerce-products-local | dimensions: 768 | metric: cosine
#
# [REVIEWS]:
# foo-ecommerce-products-local | dimensions: 768 | metric: cosine
#
# [KNOWLEDGEBASE]:
# support-system-articles | dimensions: 768 | metric: cosine
#
# ----------
# Create new index:
# nuxthub vectorize create <name> [--dimensions=<int>] [--metric=<string>] [--environments=<string=all>]
#
# Link an existing index:
# nuxthub vectorize link-existing-index support --index=support-system-articles-preview --environments=preview
#
# Create a metadata index for an index:
# nuxthub vectorize create-metadata-index products --environments=all --property-name=streaming_platform --type=string
#
# Recreate local-development index (other environments would need deleting and recreating explicitly)
# nuxthub vectorize reset-dev products
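The index names in the mock output above follow a pattern: project slug, then binding name, then an environment suffix for everything except production. A small sketch of that naming rule, inferred only from those example names, is below. `indexNameFor` and `indexNamesFor` are hypothetical helpers, and the real CLI might name things differently.

```typescript
type HubEnvironment = "production" | "preview" | "local";

// Derives an index name from the project slug, binding and environment,
// following the pattern seen in the mock CLI output
// (e.g. foo-ecommerce-products-preview).
function indexNameFor(
  project: string,
  binding: string,
  env: HubEnvironment
): string {
  const base = `${project}-${binding.toLowerCase()}`;
  return env === "production" ? base : `${base}-${env}`;
}

// Names for every environment a `create` command would touch by default.
function indexNamesFor(
  project: string,
  binding: string,
  envs: HubEnvironment[] = ["production", "preview", "local"]
): string[] {
  return envs.map((env) => indexNameFor(project, binding, env));
}

const productIndexes = indexNamesFor("foo-ecommerce", "products");
```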
Is your feature request related to a problem? Please describe.
I'd like to use Cloudflare Vectorize (a database for storing vectors) alongside Workers AI (#173).
Describe the solution you'd like
Vectorize is similar to D1 in that it's a database. Its implementation would look similar to D1 in NuxtHub.
Describe alternatives you've considered
Manually adding a binding and directly using the API (#113).
Additional context
Happy to contribute a simple PR into @nuxt-hub/core to add hubVectorize() now, but it wouldn't include proxying to remote or a devtools viewer.
Also, should it be called hubVectorize(), or should it have a different name like hubVectorDatabase()?