langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/

langchain/langsmith tracing is not compatible with langsmith JS SDK traceable #5083

Open · mattiaseyram opened 2 months ago

mattiaseyram commented 2 months ago

Example Code

import { traceable } from 'langsmith/traceable';
import { RunnableLambda } from '@langchain/core/runnables';

const exampleRunnable = RunnableLambda.from(async (foo: string) =>
  foo.toUpperCase(),
);

exampleRunnable.name = 'exampleRunnable';

const exampleTraceable = traceable(
  async (foo: string) => {
    const result = await exampleRunnable.invoke(foo);

    console.log(result);

    return result;
  },
  {
    name: 'exampleTraceable',
  },
);

await exampleTraceable('bar');

Error Message and Stack Trace (if applicable)

No response

Description

Any chain invoked inside a function wrapped with traceable gets a separate trace id (and vice versa: traceable functions called from inside chains start new traces).

Ideally, the trace id would be propagated automatically.

[screenshot: the traceable run and the chain run appear as two separate traces in LangSmith]

Related to https://github.com/langchain-ai/langsmith-sdk/issues/598
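
For the vice-versa direction, a possible bridge is to expose the traceable function to the chain as a runnable. A minimal sketch, assuming a langsmith version whose langsmith/langchain entrypoint ships a RunnableTraceable wrapper:

import { traceable } from 'langsmith/traceable';
import { RunnableTraceable } from 'langsmith/langchain';
import { RunnableLambda } from '@langchain/core/runnables';

const upper = traceable(async (s: string) => s.toUpperCase(), {
  name: 'upper',
});

// exposing the traceable as a Runnable lets the chain's callbacks flow
// into it, so the inner run nests under the chain's trace
const chain = RunnableLambda.from(async (s: string) => `${s}!`).pipe(
  RunnableTraceable.from(upper),
);

await chain.invoke('bar');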

System Info

langchain: latest
platform: mac
node: v20.11.1
pnpm: 8.15.5

dosubot[bot] commented 2 months ago

To keep related operations under a single trace in LangChain.js, the trace ID has to be carried explicitly through every layer of the application, especially across asynchronous operations and callbacks.

For functions wrapped with traceable, any chain invoked inside them should receive the current trace context, and chains that call traceable functions internally should likewise forward the context they are running in.

In practical terms, this means either threading a trace-context parameter through your function signatures or leveraging a context-propagation library that handles this automatically, depending on the execution environment of your JavaScript/TypeScript application.
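
In LangChain terms the trace context travels in the RunnableConfig, so explicit forwarding can be as simple as passing the config a lambda receives into any nested invoke call. A minimal sketch (the runnables here are illustrative):

import { RunnableLambda, RunnableConfig } from '@langchain/core/runnables';

const inner = RunnableLambda.from(async (s: string) => s.toUpperCase());

const outer = RunnableLambda.from(
  // the second argument is the current RunnableConfig, which carries the
  // callbacks and therefore the trace context
  async (s: string, config?: RunnableConfig) => inner.invoke(s, config),
);

await outer.invoke('bar');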

For RunnableLambda and similar constructs, propagate the trace context through their invocation mechanism so that any logging or tracing they perform is attributed to the correct trace. This has to be applied consistently across every part of the system that participates in tracing; wherever the context is dropped, the trace splits.

If you're implementing custom logging or tracing, configure it to include the trace ID in every log entry or trace event based on the current execution context. With these pieces in place, all related operations are grouped under a single trace ID for easier monitoring and debugging.
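
Alternatively, newer versions of the langsmith SDK appear to ship interop helpers for exactly this situation. A minimal sketch of the traceable-to-chain direction, assuming langsmith/langchain exports getLangchainCallbacks (which turns the current traceable run into LangChain callbacks):

import { traceable } from 'langsmith/traceable';
import { getLangchainCallbacks } from 'langsmith/langchain';
import { RunnableLambda } from '@langchain/core/runnables';

const exampleRunnable = RunnableLambda.from(async (foo: string) =>
  foo.toUpperCase(),
);

const exampleTraceable = traceable(
  async (foo: string) => {
    // bridge the current traceable run into LangChain callbacks so the
    // nested chain run nests under the same trace instead of starting a new one
    const callbacks = await getLangchainCallbacks();
    return exampleRunnable.invoke(foo, { callbacks });
  },
  { name: 'exampleTraceable' },
);

await exampleTraceable('bar');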

mattiaseyram commented 2 months ago

Update: I'm now using a custom RunnableLambda subclass to meet my needs:

/* eslint-disable no-unused-vars */
import {
  RunnableConfig,
  RunnableLambda,
  patchConfig,
} from '@langchain/core/runnables';
import { CallbackManagerForChainRun } from '@langchain/core/callbacks/manager';

export const DEFAULT_RECURSION_LIMIT = 25;

export type MyRunnableFuncOptions = {
  config?: RunnableConfig;
  internalInvokeConfig?: Partial<RunnableConfig>;
} & RunnableConfig;

export type MyRunnableFunc<RunInput, RunOutput> = (
  input: RunInput,
  options?: MyRunnableFuncOptions,
) => Promise<RunOutput>;

export class MyRunnableLambda<RunInput, RunOutput> extends RunnableLambda<
  RunInput,
  RunOutput
> {
  static myFrom<RunInput, RunOutput>(
    func: MyRunnableFunc<RunInput, RunOutput>,
    name?: string,
  ): RunnableLambda<RunInput, RunOutput> {
    const myRunnableLambda = new MyRunnableLambda({ func });
    myRunnableLambda.name = name || func.name;
    return myRunnableLambda;
  }

  // builds a child-callback config so that nested invocations made with
  // `internalInvokeConfig` attach to this run's trace instead of a new one
  async __invoke(
    input: RunInput,
    config?: Partial<RunnableConfig>,
    runManager?: CallbackManagerForChainRun,
  ): Promise<RunOutput> {
    const internalInvokeConfig = patchConfig(config, {
      callbacks: runManager?.getChild(),
      recursionLimit: (config?.recursionLimit ?? DEFAULT_RECURSION_LIMIT) - 2,
    });

    return this._invoke(
      input,
      {
        ...config,
        // @ts-ignore
        internalInvokeConfig,
      },
      runManager,
    );
  }

  // standard invoke, routed through __invoke so callers receive the
  // patched internalInvokeConfig in their options
  async invoke(
    input: RunInput,
    options?: Partial<RunnableConfig>,
  ): Promise<RunOutput> {
    return this._callWithConfig(this.__invoke, input, options);
  }
}

// const myInternalRunnableLambda = MyRunnableLambda.myFrom<string, string>(
//   async (input, options) => {
//     console.log('myInternalRunnableLambda input:', input);
//     return input.toUpperCase();
//   },
//   'myInternalRunnableLambda',
// );

// const myRunnableLambda = MyRunnableLambda.myFrom<[string, string], string>(
//   async (input: [string, string], options) => {
//     console.log('myRunnableLambda input:', input);
//     const internalResult = await myInternalRunnableLambda.invoke(
//       input[0],
//       options?.internalInvokeConfig,
//     );
//     return `${internalResult} ${input[1]}!`;
//   },
//   'myRunnableLambda',
// );

// const result = await myRunnableLambda.invoke(['hello', 'world']);
jacoblee93 commented 2 months ago

Thanks for flagging - will have a look!

loui7 commented 2 months ago

Same issue with RunTrees: calling chain.invoke inside one creates a new trace. Interestingly, they seem to reuse the traceId when inside a LangGraph node.
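
A minimal sketch of that RunTree variant (the chain here is a stand-in):

import { RunTree } from 'langsmith';
import { RunnableLambda } from '@langchain/core/runnables';

const chain = RunnableLambda.from(async (s: string) => s.toUpperCase());

const parentRun = new RunTree({ name: 'parent', run_type: 'chain' });
await parentRun.postRun();

// this shows up in LangSmith as a separate trace instead of nesting
// under parentRun
const result = await chain.invoke('bar');

await parentRun.end({ result });
await parentRun.patchRun();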

loui7 commented 2 months ago

Also interesting to note: if you set metadata at the end of a runnable chain, it forces a new trace to be created when running inside a LangGraph node, e.g.:

import { ChatPromptTemplate } from '@langchain/core/prompts';
import { StringOutputParser } from '@langchain/core/output_parsers';
import { ChatOpenAI } from '@langchain/openai';

// llm and parser were not defined in the original snippet; these are stand-ins
const llm = new ChatOpenAI();
const parser = new StringOutputParser();

const response = await ChatPromptTemplate.fromMessages([
  ['human', `When was {name} born?`],
])
  .pipe(llm)
  .pipe(parser)
  .invoke(
    { name: 'Albert Einstein' },
    {
      // setting metadata here is what triggers the extra trace
      metadata: {
        example: 'metadata',
      },
    },
  );