run-llama / LlamaIndexTS

LlamaIndex in TypeScript
https://ts.llamaindex.ai
MIT License

Bundling issues with tiktoken (Error: Missing tiktoken_bg.wasm) #1127

Open marcusschiesser opened 1 month ago

marcusschiesser commented 1 month ago

I am opening this ticket to gather all issues related to bundling the WASM from https://github.com/dqbd/tiktoken:

  1. Using an AWS Node.js serverless project, see https://github.com/run-llama/LlamaIndexTS/issues/1110#issuecomment-2282274361
  2. Using Next.js deployed on Vercel, see https://github.com/run-llama/create-llama/issues/164 (was fixed by copying the WASM file; see https://github.com/run-llama/create-llama/pull/201/files)
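The common thread in the fixes so far is copying `tiktoken_bg.wasm` into the deployed output so the runtime lookup can find it next to the bundled code. A minimal sketch of that idea as a reusable post-build step (the paths and the helper name are assumptions, not part of any of the linked fixes; adjust to your project layout):

```javascript
// Hedged sketch: copy tiktoken's WASM asset next to the bundle output so
// the bundled code can locate "tiktoken_bg.wasm" at runtime.
import { copyFile, mkdir } from "node:fs/promises";
import path from "node:path";

// Copies `src` into `outDir` (creating it if needed) and returns the
// destination path. Call this after your bundler finishes.
export async function copyWasm(src, outDir) {
  await mkdir(outDir, { recursive: true });
  const dest = path.join(outDir, path.basename(src));
  await copyFile(src, dest);
  return dest;
}

// Example usage (assumed paths):
// await copyWasm("node_modules/tiktoken/tiktoken_bg.wasm", ".next/server");
```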

If you encounter this issue, please post your setup and configuration here.

LeonhardZehetgruber commented 2 weeks ago

I am encountering this issue when trying to integrate llamaindex into my Obsidian plugin. The build output for the plugin is a bundled main.js file.

package.json (the relevant part):

```json
{
    "type": "module",
    "scripts": {
        "dev": "node esbuild.config.mjs"
    },
    "dependencies": {
        "llamaindex": "0.5.20"
    }
}
```

esbuild.config.mjs:

```js
import esbuild from "esbuild";
import process from "node:process";
import builtins from "builtin-modules";

const context = await esbuild.context({
    entryPoints: { main: "src/main.ts" },
    bundle: true,
    platform: "node",
    external: [
        "obsidian",
        "electron",
        "sharp",
        "onnxruntime-node",
        "./xhr-sync-worker.js",
        ...builtins],
    mainFields: ["browser", "module", "main"],
    conditions: ["browser"],
    format: "cjs",
    target: "es2022",
    logLevel: "info",
    treeShaking: true,
    outdir: "."
});

await context.rebuild();
process.exit(0);
```

tsconfig.json:

```json
{
    "compilerOptions": {
        "baseUrl": "./src",
        "target": "es2022",
        "module": "ESNext",
        "moduleResolution": "bundler",
        "esModuleInterop": true,
        "skipLibCheck": true,
        "types": [
            "node",
            "jest"
        ],
        "lib": [
            "DOM",
            "ES5",
            "ES6",
            "ES7",
            "ES2021",
            "ES2022"
        ]
    },
    "include": [
        "**/*.ts"
    ]
}
```

If I now use the following in my main.ts:

```ts
import { HuggingFaceEmbedding, Settings } from 'llamaindex';

Settings.embedModel = new HuggingFaceEmbedding({
    modelType: 'nomic-ai/nomic-embed-text-v1.5',
    quantized: false
});
```

I get the error `Error: Missing tiktoken_bg.wasm` (thrown from `node_modules/tiktoken/tiktoken.cjs`) in the developer console.
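Based on the copy-the-WASM workaround referenced at the top of this issue, one possible (untested) adjustment to the esbuild setup above is to copy the WASM file next to the bundled `main.js` after the rebuild. This is a sketch of the end of `esbuild.config.mjs`, assuming tiktoken is installed at the usual `node_modules/tiktoken` path:

```js
// Hedged sketch (build-config fragment): after bundling, copy the WASM
// file into the plugin's output directory (outdir is "." above) so the
// runtime lookup for "tiktoken_bg.wasm" succeeds inside Obsidian.
import { copyFileSync } from "node:fs";

await context.rebuild();
copyFileSync(
    "node_modules/tiktoken/tiktoken_bg.wasm", // assumed install path
    "tiktoken_bg.wasm"                        // next to the bundled main.js
);
process.exit(0);
```

Whether Obsidian's plugin loader resolves the WASM relative to `main.js` is an assumption on my part; if it does not, marking `tiktoken` as external in the `external` array and shipping it unbundled may be an alternative.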