langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License

Error [ERR_REQUIRE_ESM] when using HuggingFace Transformers #2992

Closed Facuiguacel closed 4 months ago

Facuiguacel commented 1 year ago

Hi, I'm getting this compilation error when trying to use HuggingFaceTransformersEmbeddings.

const transformers_1 = require("@xenova/transformers");
                       ^
Error [ERR_REQUIRE_ESM]: require() of ES Module .\node_modules\@xenova\transformers\src\transformers.js from .\node_modules\langchain\dist\embeddings\hf_transformers.cjs not supported.
Instead change the require of transformers.js in .\node_modules\langchain\dist\embeddings\hf_transformers.cjs to a dynamic import() which is available in all CommonJS modules.

@xenova/transformers is installed.

Any way around it?

This is my code that's using the module in a NestJs application:

import { Injectable } from '@nestjs/common'
import { HuggingFaceTransformersEmbeddings } from 'langchain/embeddings/hf_transformers'
import { TypeORMVectorStore } from 'langchain/vectorstores/typeorm'
import { DataSourceOptions } from 'typeorm'

@Injectable()
export class LangchainService {
  private async vectorStore(): Promise<TypeORMVectorStore> {
    const store = await TypeORMVectorStore.fromDataSource(
      new HuggingFaceTransformersEmbeddings({ modelName: 'Xenova/paraphrase-multilingual-mpnet-base-v2' }),
      {
        postgresConnectionOptions: {
          type: 'postgres',
          host: process.env.DB_HOST,
          port: parseInt(process.env.DB_PORT),
          username: process.env.DB_USERNAME,
          password: process.env.DB_PASSWORD,
          database: process.env.DB_NAME,
        } as DataSourceOptions,
        tableName: 'langchain_pg_embedding',
      },
    )
    await store.ensureTableInDatabase()
    return store
  }

  public async textEmbedding(text: string, chatbotId: string, source: 'question' | 'answer') {
    const pgvectorStore = await this.vectorStore()
    await pgvectorStore.addDocuments([{ pageContent: text, metadata: { chatbotId, source } }])
  }

  public async searchSimilarity(text: string, chatbotId: string, source: 'question' | 'answer') {
    const pgvectorStore = await this.vectorStore()
    const results = await pgvectorStore.similaritySearch(text, 1, { chatbotId, source })
    return results
  }
}

dosubot[bot] commented 1 year ago

🤖

Hi,

This error occurs because CommonJS code is calling require() on an ES module. As the message shows, langchain's hf_transformers.cjs is a CommonJS file, while @xenova/transformers is published as an ES module, and the Node.js version in use here does not allow require() to load an ES module.
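
As a rough illustration of how Node.js chooses between the two formats, here is a simplified sketch of its documented rules. The function name `moduleFormat` is hypothetical, and the sketch ignores loaders, flags, and other file extensions. It also assumes that the imported package declares "type": "module" in its package.json.

```typescript
type ModuleFormat = "commonjs" | "module";

// Simplified sketch: classify a file the way Node.js does for the common cases.
function moduleFormat(
  filename: string,
  packageType?: "commonjs" | "module",
): ModuleFormat {
  // .cjs and .mjs always win, regardless of package.json.
  if (filename.endsWith(".cjs")) return "commonjs";
  if (filename.endsWith(".mjs")) return "module";
  // Plain .js files follow the nearest package.json "type" field,
  // which defaults to CommonJS when absent.
  return packageType === "module" ? "module" : "commonjs";
}

// The requiring file from the error message is CommonJS by its extension.
console.log(moduleFormat("hf_transformers.cjs")); // "commonjs"
// Assumption: the required package sets "type": "module".
console.log(moduleFormat("transformers.js", "module")); // "module"
```

A "commonjs" file that calls require() on a "module" file is the combination that produces ERR_REQUIRE_ESM.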

A similar issue was addressed in the langchainjs repository here. The solution is to use dynamic imports, which are supported in both CommonJS and ECMAScript modules. Here's how you can modify your code:

export const getExamples = async (query: string, apiKey: string) => {
  const { MemoryVectorStore } = await import("langchain/vectorstores/memory");
  const { OpenAIEmbeddings } = await import("langchain/embeddings/openai");

  // Rest of your code
};

This approach uses the import() function which returns a promise. This function can be used in an async function to import the module and use it in your code. Please note that this method is asynchronous and the imported modules will not be available immediately. If you need to use the imported modules in a synchronous way, you might need to refactor your code to use promises or async/await.

Also, ensure that you're using a version of Node.js that supports ECMAScript modules. The current version of Node.js that langchainjs supports is 20 and above, and you're currently using Node.js version 20 as indicated by the .nvmrc file.

As for the version of @xenova/transformers you're using, it wasn't specified in your question. Could you please provide that information? It might be helpful in diagnosing the issue.

Let me know if this helps or if you have any other questions.

This response is meant to be useful and save you time. It is not meant to be a precise solution, but rather a starting point for your own research.

jacoblee93 commented 1 year ago

Unfortunately no - it's to do with the way the dependent package is set up. Could you open an issue on them?

codenameakshay commented 11 months ago

Can you please take a look @hwchase17 ?

dosubot[bot] commented 8 months ago

Hi, @Facuiguacel,

I'm helping the langchainjs team manage our backlog and am marking this issue as stale. From what I understand, you encountered an [ERR_REQUIRE_ESM] error when using HuggingFace Transformers in a NestJs application. I provided a detailed response explaining the error and suggesting the use of dynamic imports, with a reference to a similar issue in another repository. jacoblee93 suggested opening an issue on the dependent package, and codenameakshay asked hwchase17 to take a look at the issue.

Could you please confirm if this issue is still relevant to the latest version of the langchainjs repository? If it is, please let the langchainjs team know by commenting on the issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days. Thank you!

Candoyeya commented 7 months ago

For anyone who still needs to use import { HuggingFaceTransformersEmbeddings } from 'langchain/embeddings/hf_transformers' or import {HuggingFaceTransformersEmbeddings} from '@langchain/community/embeddings/hf_transformers';

You can apply the following patch until the underlying problem is resolved.

You just need to replace line 82 of the hf_transformers.cjs file with this line: const pipe = await (this.pipelinePromise ??= (await import("@xenova/transformers")).pipeline("feature-extraction", this.modelName));

Patch:

diff --git a/node_modules/@langchain/community/dist/embeddings/hf_transformers.cjs b/node_modules/@langchain/community/dist/embeddings/hf_transformers.cjs
index 0e3cf11..2edfd8a 100644
--- a/node_modules/@langchain/community/dist/embeddings/hf_transformers.cjs
+++ b/node_modules/@langchain/community/dist/embeddings/hf_transformers.cjs
@@ -79,7 +79,7 @@ class HuggingFaceTransformersEmbeddings extends embeddings_1.Embeddings {
         return data[0];
     }
     async runEmbedding(texts) {
-        const pipe = await (this.pipelinePromise ??= (0, transformers_1.pipeline)("feature-extraction", this.modelName));
+        const pipe = await (this.pipelinePromise ??= (await import("@xenova/transformers")).pipeline("feature-extraction", this.modelName));
         return this.caller.call(async () => {
             const output = await pipe(texts, { pooling: "mean", normalize: true });
             return output.tolist();

I hope it is useful to you 👍
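
One subtlety in the patched line: the `await import(...)` finishes before `??=` assigns anything, so if several `runEmbedding` calls start concurrently (as `embedDocuments` does when there is more than one batch), each call may build its own pipeline. Below is a minimal sketch of one way to avoid that, by storing the promise synchronously. `Pipe`, `TransformersLike`, and `LazyPipeline` are hypothetical names for illustration, not the library's actual types or its eventual fix.

```typescript
// Hypothetical stand-in for the feature-extraction pipeline function.
type Pipe = (texts: string[]) => Promise<number[][]>;

// Hypothetical, reduced shape of the dynamically imported module.
interface TransformersLike {
  pipeline(task: string, model: string): Promise<Pipe>;
}

class LazyPipeline {
  private pipePromise?: Promise<Pipe>;

  constructor(
    private readonly load: () => Promise<TransformersLike>,
    private readonly modelName: string,
  ) {}

  get(): Promise<Pipe> {
    // No await before the assignment: the first caller stores the promise
    // immediately, and every later caller reuses it.
    this.pipePromise ??= this.load().then((mod) =>
      mod.pipeline("feature-extraction", this.modelName),
    );
    return this.pipePromise;
  }
}
```

In the setting of this thread, `load` could be `() => import("@xenova/transformers")`, though the real module's types are richer than the reduced `TransformersLike` shape assumed here.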

alxpereira commented 7 months ago

You can defer the import until runtime, as explained by the transformers.js author in an obscure Stack Overflow post:

    const TransformersApi = Function('return import("@xenova/transformers")')();
    const { pipeline } = await TransformersApi;

This is the only solution I found for full ESM support in TypeScript.
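
Some background on why the wrapper helps: when TypeScript compiles with "module": "commonjs", a common setup for NestJS projects, it rewrites a plain `import()` expression into a `require()` call, which brings the original error back. Building the call through `Function` keeps it out of the compiler's view, so a real dynamic import runs. Here is a small sketch of the same idea as a reusable helper. `importEsm` is a hypothetical name and not part of any library.

```typescript
// Construct the import() call at runtime so the TypeScript compiler
// cannot rewrite it into require() when emitting CommonJS.
const dynamicImport = new Function(
  "specifier",
  "return import(specifier)",
) as (specifier: string) => Promise<unknown>;

// Hypothetical helper: the caller supplies the expected module type T.
async function importEsm<T>(specifier: string): Promise<T> {
  return (await dynamicImport(specifier)) as T;
}
```

A caller in this thread's situation might write `const { pipeline } = await importEsm<typeof import("@xenova/transformers")>("@xenova/transformers")`. This approach relies on runtime code evaluation, so it may not work in environments that restrict `Function` or run code in a sandboxed context.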

l4b4r4b4b4 commented 7 months ago

How is this still an open issue? 🤔

l4b4r4b4b4 commented 7 months ago

Why not make a PR? 🙌

JackBlair87 commented 7 months ago

If anyone wanted the full modified file that worked for me:

"use strict";
Object.defineProperty(exports, "__esModule", { value: true });
exports.HuggingFaceTransformersEmbeddings = void 0;
const embeddings_1 = require("@langchain/core/embeddings");
const chunk_array_1 = require("@langchain/core/utils/chunk_array");
/**
 * @example
 * ```typescript
 * const model = new HuggingFaceTransformersEmbeddings({
 *   modelName: "Xenova/all-MiniLM-L6-v2",
 * });
 *
 * // Embed a single query
 * const res = await model.embedQuery(
 *   "What would be a good company name for a company that makes colorful socks?"
 * );
 * console.log({ res });
 *
 * // Embed multiple documents
 * const documentRes = await model.embedDocuments(["Hello world", "Bye bye"]);
 * console.log({ documentRes });
 * ```
 */

class HuggingFaceTransformersEmbeddings extends embeddings_1.Embeddings {
    constructor(fields) {
        super(fields ?? {});

        Object.defineProperty(this, "modelName", {
            enumerable: true,
            configurable: true,
            writable: true,
            value: "Xenova/all-MiniLM-L6-v2"
        });
        Object.defineProperty(this, "batchSize", {
            enumerable: true,
            configurable: true,
            writable: true,
            value: 512
        });
        Object.defineProperty(this, "stripNewLines", {
            enumerable: true,
            configurable: true,
            writable: true,
            value: true
        });
        Object.defineProperty(this, "timeout", {
            enumerable: true,
            configurable: true,
            writable: true,
            value: void 0
        });
        Object.defineProperty(this, "pipelinePromise", {
            enumerable: true,
            configurable: true,
            writable: true,
            value: void 0
        });
        this.modelName = fields?.modelName ?? this.modelName;
        this.stripNewLines = fields?.stripNewLines ?? this.stripNewLines;
        this.timeout = fields?.timeout;
    }
    async embedDocuments(texts) {
        const batches = (0, chunk_array_1.chunkArray)(this.stripNewLines ? texts.map((t) => t.replace(/\n/g, " ")) : texts, this.batchSize);
        const batchRequests = batches.map((batch) => this.runEmbedding(batch));
        const batchResponses = await Promise.all(batchRequests);
        const embeddings = [];
        for (let i = 0; i < batchResponses.length; i += 1) {
            const batchResponse = batchResponses[i];
            for (let j = 0; j < batchResponse.length; j += 1) {
                embeddings.push(batchResponse[j]);
            }
        }
        return embeddings;
    }
    async embedQuery(text) {
        const data = await this.runEmbedding([
            this.stripNewLines ? text.replace(/\n/g, " ") : text,
        ]);
        return data[0];
    }

    async runEmbedding(texts) {
        const pipe = await (this.pipelinePromise ??= (await import("@xenova/transformers")).pipeline("feature-extraction", this.modelName));
        return this.caller.call(async () => {
            const output = await pipe(texts, { pooling: "mean", normalize: true });
            return output.tolist();
        });
    }
}
exports.HuggingFaceTransformersEmbeddings = HuggingFaceTransformersEmbeddings;

jacoblee93 commented 4 months ago

Hi folks,

Sorry for missing this issue for so long - have updated @JonaMX's PR to use @Candoyeya's fix, have added it to the CJS export tests, and will aim to get this out today!

jacoblee93 commented 4 months ago

Live in community 0.2.11!

Bazoumana-Ouattara commented 1 week ago

[Screenshot: Capture d'écran 2024-10-14 190254] When I run my code with "npm start", my browser opens with this error message. Could you help me understand it? I thought the problem was in the "node_modules" folder, so I deleted and reinstalled it, but the problem persists.