Open steveruizok opened 11 months ago
Good catch. I may break out the CLI piece to a separate package. I've started thinking about a number of changes for the next version of Vectra and will start a discussion next week. I'm probably going to put a wrap on the current bits and call it 1.0 and then start working on version 2.0. I want to change the file format to something more compact than the current JSON format for index.json and I'll probably move away from folders to a new <index>.vectra file format. I've worked out how to make managing larger indexes really efficient.
The cool thing about Vectra 2's file format is that you'll be able to merge document indexes by simply concatenating .vectra files. That means you can create per-document indexes and then merge all of the .vectra files for a given set of documents into a single .vectra file that lets you search over the whole set.
First off, great library! We're using it to do our "AI powered" docs search on tldraw.dev.
Some of the code in this library depends on `turndown`, which in turn depends on `sloppy.js`. Certain "strict mode only" contexts, such as server-side routes on our Next.js app, will throw when this code and its sloppy `with` usage are present. To fix this, I've had to patch out the exports of the WebFetcher and other items in the module's `index.js` file.

I'm not sure what the right fix is here, however you could consider separate export entry points, i.e. `import { WebFetcher } from "vectra/WebFetcher"`, so that the extra exports are not pulled in when importing from `vectra` alone?
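
To make the suggestion concrete, here is a rough sketch of what separate entry points might look like using the `exports` field in `package.json`. The `lib/` paths are my guess and not Vectra's actual build layout, so treat them as placeholders:

```json
{
  "//": "Sketch only: the lib/ paths are hypothetical, not Vectra's actual layout",
  "name": "vectra",
  "exports": {
    ".": "./lib/index.js",
    "./WebFetcher": "./lib/WebFetcher.js"
  }
}
```

For this to help, the root `index.js` would also need to stop re-exporting `WebFetcher` (and anything else that pulls in `turndown`), otherwise importing from `vectra` alone would still load the sloppy-mode code.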