langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License

getEnvironmentVariable utility always return undefined on Cloudflare Workers #5303

Closed davkorss closed 1 month ago

davkorss commented 5 months ago


Example Code

https://github.com/langchain-ai/langchainjs/blob/main/langchain-core/src/utils/env.ts#L77

export function getEnvironmentVariable(name: string): string | undefined {
  // Certain Deno setups will throw an error if you try to access environment variables
  // https://github.com/langchain-ai/langchainjs/issues/1412
  try {
    return typeof process !== "undefined"
      ? // eslint-disable-next-line no-process-env
        process.env?.[name]
      : undefined;
  } catch (e) {
    console.error(e);
    return undefined;
  }
}

Error Message and Stack Trace (if applicable)

No response

Description

Greetings!

In Cloudflare Workers, using the `process` API requires importing it explicitly, and the `nodejs_compat` compatibility flag must be enabled.

https://developers.cloudflare.com/workers/runtime-apis/nodejs/process/

However, the utilities never import the `node:process` module, so referencing `process` throws, and `undefined` is returned from the catch block. This prevents LangSmith from being used from Cloudflare Workers. I've been trying several solutions, and I don't think importing `node:process` is a good fix, since that would force everyone to enable the `nodejs_compat` compatibility flag in their Workers. Maybe the model constructor could accept tracing props instead, in the same way as the `verbose` property (which can also be read from environment variables), something like this:

const llm = new ChatOpenAI({
  tracing: true,
  tracingProjectName: this.env.LANGCHAIN_PROJECT,
  tracingApiKey: this.env.LANGCHAIN_API_KEY,
  tracingApiUrl: this.env.LANGCHAIN_ENDPOINT,
  streaming: false,
  openAIApiKey: this.env.OPENAI_API_KEY,
  modelName: model,
  callbacks: [callbackHandlers],
  verbose: true,
  temperature: 0,
  configuration: {
    baseURL: this.env.CLOUDFLARE_AI_GATEWAY_OPENAI_ENDPOINT,
  },
});

If you have any idea how we can solve this, I can create the feature!

System Info

platform Cloudflare Workers

dosubot[bot] commented 5 months ago

To address the issue with getEnvironmentVariable in Cloudflare Workers without requiring the nodejs_compat flag, you can adapt the utility to leverage the globalThis object for environment variable access. This approach avoids the need for Node.js compatibility and directly utilizes Cloudflare Workers' global scope. Here's how you can modify the function:

export function getEnvironmentVariable(name: string): string | undefined {
  try {
    // Attempt to read the variable from the global scope first
    // (the cast is needed because globalThis has no string index signature)
    const globalValue = (globalThis as Record<string, unknown>)[name];
    if (typeof globalValue === "string") {
      return globalValue;
    }
    // Fall back to process.env in Node.js environments
    return typeof process !== "undefined" ? process.env?.[name] : undefined;
  } catch {
    return undefined;
  }
}

This code first attempts to retrieve the environment variable from globalThis, suitable for Cloudflare Workers. If the variable isn't found and the environment is Node.js (where process is defined), it falls back to using process.env. This ensures compatibility across different environments without special flags.
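To make the lookup order concrete, here is a compact, self-contained restatement of that function with a runnable demo. The variable name `MY_FAKE_VAR` is invented purely for this demo; it is not a real LangChain setting.

```typescript
// Re-stating the proposed lookup as a compact, self-contained demo:
// check globalThis first, then fall back to process.env.
function getEnvironmentVariable(name: string): string | undefined {
  try {
    const globalValue = (globalThis as Record<string, unknown>)[name];
    if (typeof globalValue === "string") {
      return globalValue;
    }
    return typeof process !== "undefined" ? process.env?.[name] : undefined;
  } catch {
    return undefined;
  }
}

// A value on globalThis wins over process.env.
(globalThis as Record<string, unknown>)["MY_FAKE_VAR"] = "from-global";
process.env.MY_FAKE_VAR = "from-process-env";
console.log(getEnvironmentVariable("MY_FAKE_VAR")); // "from-global"

// Remove the global; the Node.js fallback now applies.
delete (globalThis as Record<string, unknown>)["MY_FAKE_VAR"];
console.log(getEnvironmentVariable("MY_FAKE_VAR")); // "from-process-env"
```

Note that in modules-format Workers, bindings arrive via the `env` argument of the handler rather than as globals, so this globalThis lookup only helps if the values are placed there explicitly.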

Regarding your suggestion to modify the model constructor for tracing properties, it's indeed a viable approach for passing configuration directly, enhancing flexibility and security by decoupling API keys and other sensitive information from your codebase. Implementing such a feature would involve extending the constructor's parameters to accept these additional properties, ensuring they are properly handled within the library.

Before deploying these changes, ensure thorough testing in both Cloudflare Workers and Node.js environments to confirm that the modified getEnvironmentVariable function works as expected and that any changes to the constructor do not introduce regressions.

#### Details

**Sources**

- [langchain-core/src/utils/env.ts](https://github.com/langchain-ai/langchainjs/blob/main/langchain-core/src/utils/env.ts)
- [environment_tests/test-exports-tsc/main.ts](https://github.com/langchain-ai/langchainjs/blob/main/environment_tests/test-exports-tsc/main.ts)
- [libs/langchain-cloudflare/package.json](https://github.com/langchain-ai/langchainjs/blob/main/libs/langchain-cloudflare/package.json)
- [libs/langchain-cloudflare/src/llms.ts](https://github.com/langchain-ai/langchainjs/blob/main/libs/langchain-cloudflare/src/llms.ts)
- [libs/langchain-community/src/llms/cloudflare_workersai.ts](https://github.com/langchain-ai/langchainjs/blob/main/libs/langchain-community/src/llms/cloudflare_workersai.ts)
- [libs/langchain-cloudflare/src/chat_models.ts](https://github.com/langchain-ai/langchainjs/blob/main/libs/langchain-cloudflare/src/chat_models.ts)
- [examples/src/agents/chat_convo_with_tracing.ts](https://github.com/langchain-ai/langchainjs/blob/main/examples/src/agents/chat_convo_with_tracing.ts)
- [examples/src/models/chat/integration_openai_custom_base.ts](https://github.com/langchain-ai/langchainjs/blob/main/examples/src/models/chat/integration_openai_custom_base.ts)
- [examples/src/models/chat/integration_openai.ts](https://github.com/langchain-ai/langchainjs/blob/main/examples/src/models/chat/integration_openai.ts)
- [CONTRIBUTING.md](https://github.com/langchain-ai/langchainjs/blob/main/CONTRIBUTING.md)
- [.github/workflows/unit-tests-langchain.yml](https://github.com/langchain-ai/langchainjs/blob/main/.github/workflows/unit-tests-langchain.yml)

jacoblee93 commented 4 months ago

Hmm yeah that's not great - as a stopgap you can manually pass in a tracer like this:

https://docs.smith.langchain.com/how_to_guides/tracing/trace_with_langchain#trace-selectively

Will have a think to see how to do this more seamlessly CC @craigsdennis
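For reference, a minimal sketch of that stopgap, based on the linked LangSmith guide: build the client and tracer explicitly from the Worker's `env` bindings instead of relying on `process.env`, and pass the tracer per call. This is an untested illustration (it needs real credentials and a Workers runtime); the `LANGCHAIN_*` binding names mirror the standard environment variables, but any names work.

```typescript
import { Client } from "langsmith";
import { LangChainTracer } from "@langchain/core/tracers/tracer_langchain";
import { ChatOpenAI } from "@langchain/openai";

export default {
  async fetch(request: Request, env: Record<string, string>): Promise<Response> {
    // Build a LangSmith client explicitly from Worker bindings,
    // since getEnvironmentVariable can't see them here.
    const client = new Client({
      apiKey: env.LANGCHAIN_API_KEY,
      apiUrl: env.LANGCHAIN_ENDPOINT,
    });
    const tracer = new LangChainTracer({
      client,
      projectName: env.LANGCHAIN_PROJECT,
    });

    const llm = new ChatOpenAI({ openAIApiKey: env.OPENAI_API_KEY });
    // Passing the tracer in the call options sends this run to LangSmith.
    const result = await llm.invoke("Hello!", { callbacks: [tracer] });
    return new Response(result.content as string);
  },
};
```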

jacoblee93 commented 4 months ago

The other thing you could try (suggested by @efriis) is to manually set process.env from the passed in environment:

https://developers.cloudflare.com/workers/runtime-apis/nodejs/process/#processenv
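The idea is to copy the Worker's `env` bindings onto `process.env` early in the handler, so later `process.env` lookups succeed. A minimal sketch of just the copying step, runnable in Node for illustration (the binding names and values are invented; in a real Worker, `env` is the handler's second argument):

```typescript
import * as process from "node:process";

// Stand-in for the `env` bindings a Worker's fetch handler receives.
const env: Record<string, string> = {
  LANGCHAIN_API_KEY: "ls-demo-key",
  LANGCHAIN_PROJECT: "demo-project",
};

// Copy the bindings onto process.env so process.env-based lookups work.
// (Merging with Object.assign, rather than overwriting process.env wholesale,
// preserves any entries that already exist.)
Object.assign(process.env, env);

console.log(process.env.LANGCHAIN_PROJECT); // "demo-project"
```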

davkorss commented 4 months ago

I understand that manually defining process.env won't change the fact that the getEnvironmentVariable utility keeps returning undefined, since in Cloudflare Workers the process API must be explicitly imported (import * as process from 'node:process';), as shown in the docs. I will try passing the tracer manually. Thank you very much for your time!

jacoblee93 commented 4 months ago

Would this work?

import * as process from 'node:process';
// `env` is the bindings object passed to the Worker's handler
process.env = env;
// ...tracing code...

davkorss commented 4 months ago

Implementing that snippet inside the utility itself (https://github.com/langchain-ai/langchainjs/blob/main/langchain-core/src/utils/env.ts) would solve the problem. If I only do it in my own project, however, it doesn't help: LangChain calls getEnvironmentVariable internally, and that function would still return undefined. In short, defining process.env in my project doesn't matter, since LangChain internally won't be able to retrieve its values when running in Cloudflare Workers.

jeasonnow commented 4 months ago

To address this issue, we cannot simply adjust getEnvironmentVariable. Instead, we need to follow these steps:

(The externalEnvs parameter is one that I randomly named, but if possible, it could be replaced with a better name.)
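The concrete steps are not shown above, but one hedged guess at such a mechanism is a module-level store that callers fill from their Worker bindings, which getEnvironmentVariable consults before process.env. Everything here (the `setExternalEnvs` name, the store shape) is hypothetical; only `externalEnvs` is taken from the comment above.

```typescript
// Hypothetical module-level store for externally supplied env vars.
let externalEnvs: Record<string, string> = {};

// Hypothetical setter a Worker would call once, e.g. at the top of fetch(),
// before invoking any LangChain code.
function setExternalEnvs(envs: Record<string, string>): void {
  externalEnvs = { ...envs };
}

function getEnvironmentVariable(name: string): string | undefined {
  // Externally supplied values take precedence.
  if (name in externalEnvs) {
    return externalEnvs[name];
  }
  try {
    // Existing behavior: fall back to process.env where it exists.
    return typeof process !== "undefined" ? process.env?.[name] : undefined;
  } catch {
    return undefined;
  }
}

// Usage sketch: setExternalEnvs(env) inside the Worker's fetch handler.
setExternalEnvs({ LANGCHAIN_API_KEY: "demo-key" });
console.log(getEnvironmentVariable("LANGCHAIN_API_KEY")); // "demo-key"
```

This keeps the existing Node.js behavior untouched while giving Workers (and other non-Node runtimes) an explicit injection point.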

davkorss commented 4 months ago

I like it, it looks much cleaner than what I suggested before, thank you very much!