langchain-ai / langchainjs

πŸ¦œπŸ”— Build context-aware reasoning applications πŸ¦œπŸ”—
https://js.langchain.com/docs/
MIT License

`maxRetries` not properly passed to `ChatOpenAI` in tracing #5061

Closed lukasugar closed 2 months ago

lukasugar commented 5 months ago

Checked other resources

Example Code

I have this code for creating a chain with ChatOpenAI client:

const chain = RunnableSequence.from([
  chatPrompt, // prompt defined above
  new ChatOpenAI({
    modelName: 'gpt-4-0125-preview',
    temperature: 0.1,
    maxRetries: 3,
    modelKwargs: { response_format: { type: 'json_object' } },
  }),
  suggestionsParser, // some parser defined above
]);

I invoke the chain like this:

const response = await chain.invoke({ document_to_review: docContent }, { tags: ['suggestions'] });

Error Message and Stack Trace (if applicable)

No response

Description

The openai call returned an error but no retries were made. OpenAI error:

400 You requested a model that is not compatible with this engine. Please contact us through our help center at help.openai.com for further questions.

Error: 400 You requested a model that is not compatible with this engine. Please contact us through our help center at help.openai.com for further questions.
    at APIError.generate (/app/node_modules/.pnpm/openai@4.32.2/node_modules/openai/error.js:44:20)
    at OpenAI.makeStatusError (/app/node_modules/.pnpm/openai@4.32.2/node_modules/openai/core.js:263:33)
    at OpenAI.makeRequest (/app/node_modules/.pnpm/openai@4.32.2/node_modules/openai/core.js:306:30)
    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
    at async /app/node_modules/.pnpm/@langchain+openai@0.0.26/node_modules/@langchain/openai/dist/chat_models.cjs:684:29
    at async RetryOperation._fn (/app/node_modules/.pnpm/p-retry@4.6.2/node_modules/p-retry/index.js:50:12)

The LangSmith logs show that the retries parameter is present in the serialized constructor metadata:

{
  "id": [
    "langchain",
    "chat_models",
    "openai",
    "ChatOpenAI"
  ],
  "kwargs": {
    "max_retries": 3,
    "model": "gpt-4-0125-preview",
    "model_kwargs": {
      "response_format": {
        "type": "json_object"
      }
    },
    "openai_api_key": {
      "id": [
        "OPENAI_API_KEY"
      ],
      "lc": 1,
      "type": "secret"
    },
    "temperature": 0.1
  },
  "lc": 1,
  "type": "constructor"
}

But when looking at invocation metadata in LangSmith, the retry parameter isn't present:

[screenshot: LangSmith invocation metadata, with no retry parameter shown]

How do I properly pass the maxRetries parameter to ChatOpenAI? I want to set a specific number, but does omitting it entirely really default to 6 retries with exponential backoff?

System Info

Running on an EC2 instance with Linux.

Langchain versions:

dosubot[bot] commented 5 months ago

To address the issue with the maxRetries parameter not being properly passed to ChatOpenAI, it's important to note that the parameter needs to be dynamically included in the client configuration. The current setup might have maxRetries hardcoded to 0 in the client configuration, which would prevent any retry attempts from being made.

To ensure maxRetries is utilized as intended, you should:

  1. Verify that maxRetries is part of the ChatOpenAI configuration passed to the OpenAI client. If it's not, you'll need to add it.
  2. Adjust the client configuration logic to use the maxRetries value from the ChatOpenAI configuration instead of a hardcoded value.

This approach will enable the retry mechanism to function according to the specified maxRetries value. The default behavior without specifying maxRetries does not include a retry mechanism, contrary to a 6-time retry with exponential backoff.

Implementing or adjusting the retry logic to respect the maxRetries parameter will ensure that your requests to OpenAI have the desired level of resilience against failures.
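As a rough sketch of the second step above (the names here are illustrative assumptions, not the actual langchainjs internals), the client-params construction could read the configured value instead of a hardcoded one:

```typescript
// Hypothetical sketch: thread a configured maxRetries into the client params
// instead of hardcoding it. The interface and function names are invented
// for illustration and do not match langchainjs source.
interface ClientConfig {
  maxRetries?: number;
  timeout?: number;
}

function buildClientParams(config: ClientConfig, baseURL: string) {
  return {
    ...config,
    baseURL,
    timeout: config.timeout,
    // Use the caller's value when set; otherwise fall back to a default.
    maxRetries: config.maxRetries ?? 6,
  };
}
```

With this shape, `buildClientParams({ maxRetries: 3 }, url)` would yield `maxRetries: 3`, while omitting the option would fall back to the default instead of silently disabling retries.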

Sources

lukasugar commented 5 months ago

I'm passing maxRetries exactly as in the docs: https://js.langchain.com/docs/modules/model_io/chat/dealing_with_api_errors

lukasugar commented 5 months ago

Should this line be changed from 0 to 6 to match the documentation: https://github.com/langchain-ai/langchainjs/blob/main/libs/langchain-openai/src/chat_models.ts#L916

const endpoint = getEndpoint(openAIEndpointConfig);
const params = {
  ...this.clientConfig,
  baseURL: endpoint,
  timeout: this.timeout,
  maxRetries: 0, // change to 6
};
rahilvora commented 5 months ago

@lukasugar Thanks for reporting this.

@jacoblee93 can you help me with the review?

lukasugar commented 5 months ago

@rahilvora thank you for picking it up. I hope this fix gets shipped soon :) I hope @jacoblee93 will have time to review the PR soon; in the meantime I added a small comment there :)

jacoblee93 commented 5 months ago

Hey @lukasugar! We only retry certain types of errors - things like 400s will not be retried (it wouldn't make sense to retry something like "You requested a model that is not compatible with this engine..." because it would never succeed).
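The policy described here can be sketched in a few lines (this is an illustrative, self-contained example, not the actual langchainjs/p-retry implementation): retry transient failures with exponential backoff, but give up immediately on permanent client errors like 400, since resending the same invalid request can never succeed.

```typescript
// Illustrative sketch of a retry policy that skips non-retryable errors.
// HttpError, isRetryable, and withRetries are invented names for this example.
class HttpError extends Error {
  constructor(public status: number, message: string) {
    super(message);
  }
}

function isRetryable(err: unknown): boolean {
  if (err instanceof HttpError) {
    // 429 (rate limit) and 5xx are transient; other 4xx are permanent.
    return err.status === 429 || err.status >= 500;
  }
  return true; // network-level errors are worth retrying
}

async function withRetries<T>(fn: () => Promise<T>, maxRetries = 6): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries || !isRetryable(err)) throw err;
      // Exponential backoff: 100ms, 200ms, 400ms, ...
      await new Promise((resolve) => setTimeout(resolve, 100 * 2 ** attempt));
    }
  }
}
```

Under such a policy, a 400 "model not compatible" response is thrown straight through on the first attempt, while a transient 500 is retried up to `maxRetries` times.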

We could look into adding maxRetries to the traced invocationParams in LangSmith though.