oramasearch / orama

🌌 A complete search engine and RAG pipeline in your browser, server or edge network with support for full-text, vector, and hybrid search in less than 2kb.
https://docs.orama.com

Feat: adds plugin for local embeddings generation at runtime #795

Closed micheleriva closed 3 weeks ago

micheleriva commented 3 weeks ago

To get started with Orama Plugin Embeddings, just install it with npm:

npm i @orama/plugin-embeddings

Important note: to use this plugin, you'll also need to install a TensorFlow.js backend.

For example, if you're running Orama in the browser, we highly recommend using @tensorflow/tfjs-backend-webgl:

npm i @tensorflow/tfjs-backend-webgl

If you're using Orama in Node.js, we recommend using @tensorflow/tfjs-node:

npm i @tensorflow/tfjs-node
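If the same code runs in both environments, the choice between the two packages above can be made at runtime. The following is only a minimal sketch: `pickTfjsBackend` is a hypothetical helper, not part of Orama or TensorFlow.js, and its environment check (presence of `window` and `document`) is an assumption that may not suit every runtime or bundler.

```javascript
// Hypothetical helper (not part of Orama or TensorFlow.js): returns the name of
// the backend package recommended above for the detected environment.
function pickTfjsBackend(env = globalThis) {
  // Assumption: a global `window` and `document` means we are in a browser.
  const isBrowser =
    typeof env.window !== 'undefined' && typeof env.document !== 'undefined'

  return isBrowser ? '@tensorflow/tfjs-backend-webgl' : '@tensorflow/tfjs-node'
}

// The backend is loaded for its side effects before the plugin is created, e.g.:
//   await import(pickTfjsBackend())
```

Some bundlers cannot resolve a dynamic import whose specifier is computed, so a static import of the chosen backend may be the safer option in browser builds.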

Usage

import { create, insert, search } from '@orama/orama'
import { pluginEmbeddings } from '@orama/plugin-embeddings'
import '@tensorflow/tfjs-node' // Or any other appropriate TensorflowJS backend

const plugin = await pluginEmbeddings({
  embeddings: {
    defaultProperty: 'embeddings', // Property used to store generated embeddings
    onInsert: {
      generate: true, // Generate embeddings at insert time
      properties: ['description'], // Properties used to generate embeddings at insert time
      verbose: true,
    }
  }
})

const db = await create({
  schema: {
    description: 'string',
    embeddings: 'vector[512]' // Orama generates 512-dimensional vectors
  },
  plugins: [plugin]
})
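The size declared in the schema has to match the size of the vectors stored in that property. As an illustration only, the sketch below parses a `vector[N]` type string and checks an embedding against it. Both `vectorSize` and `assertEmbeddingFits` are hypothetical helpers written for this example; they are not Orama APIs, and Orama's own validation may behave differently.

```javascript
// Hypothetical helper: extracts N from an Orama-style 'vector[N]' schema type.
function vectorSize(schemaType) {
  const match = /^vector\[(\d+)\]$/.exec(schemaType)
  if (!match) {
    throw new Error(`Not a vector schema type: ${schemaType}`)
  }
  return Number(match[1])
}

// Hypothetical helper: throws if the embedding length differs from the schema size.
function assertEmbeddingFits(schemaType, embedding) {
  const expected = vectorSize(schemaType)
  if (embedding.length !== expected) {
    throw new Error(`Expected ${expected} dimensions, got ${embedding.length}`)
  }
}

// A 512-element array passes the check for the schema used above.
assertEmbeddingFits('vector[512]', new Array(512).fill(0))
```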

Example usage at insert time:

await insert(db, {
  description: 'Classroom Headphones Bulk 5 Pack, Student On Ear Color Varieties'
})

await insert(db, {
  description: 'Kids Wired Headphones for School Students K-12'
})

await insert(db, {
  description: 'Kids Headphones Bulk 5-Pack for K-12 School'
})

await insert(db, {
  description: 'Bose QuietComfort Bluetooth Headphones'
})

Orama will automatically generate text embeddings and store them in the embeddings property.

Then, set mode to 'vector' or 'hybrid' to perform vector or hybrid search at runtime:

await search(db, {
  term: 'Headphones for 12th grade students',
  mode: 'vector'
})
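To make the idea behind vector mode more concrete, the sketch below ranks documents by cosine similarity between a query vector and each stored embedding. It is a conceptual illustration, not Orama's implementation: `cosineSimilarity`, `rankBySimilarity`, and the `minSimilarity` cutoff are names invented for this example, and the 3-dimensional toy vectors stand in for real 512-dimensional embeddings.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction, so treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Hypothetical ranking step: score every document, drop weak matches,
// and sort the rest from most to least similar.
function rankBySimilarity(queryVector, docs, minSimilarity = 0) {
  return docs
    .map((doc) => ({ id: doc.id, score: cosineSimilarity(queryVector, doc.embeddings) }))
    .filter((hit) => hit.score >= minSimilarity)
    .sort((x, y) => y.score - x.score)
}

// Toy data: document 'b' is orthogonal to the query and falls below the cutoff.
const hits = rankBySimilarity(
  [1, 0, 0],
  [
    { id: 'a', embeddings: [0.9, 0.1, 0] },
    { id: 'b', embeddings: [0, 1, 0] },
    { id: 'c', embeddings: [0.5, 0.5, 0] }
  ],
  0.5
)
```

In the plugin's case, the query vector would come from embedding the search term with the same model used at insert time, which is why the term and the stored descriptions can be compared directly.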