langchain-ai / langchainjs

🦜🔗 Build context-aware reasoning applications 🦜🔗
https://js.langchain.com/docs/
MIT License

Issue with RemoteRunnable and custom 'LangServe' endpoint... The issue seems to be in reviving the results of my BaseMessages #5791

Open mcgear opened 2 weeks ago

mcgear commented 2 weeks ago


Example Code

The following code:

console.log(JSON.parse(JSON.stringify(new HumanMessage('This is a test'))));

Error Message and Stack Trace (if applicable)

Not so much an error message; I don't think I am getting the right output from serialization:

C:\ProgramData\chocolatey\lib\deno\deno.exe run --config ./deno.jsonc --inspect-wait --inspect-brk=127.0.0.1:53819 --allow-all .\main.ts
Initialized esbuild with standard build.
Unable to connect to the EaC service, falling back to local config.
Initializing DenoKV at ./denoKv/thinky.db
Loading DenoKV instance for ./denoKv/thinky.db
Initializing DenoKV at ./denoKv/thinky.db
Loading DenoKV instance for ./denoKv/thinky.db
Inititialized DenoKV database: ./denoKv/thinky.db
Inititialized DenoKV database: ./denoKv/thinky.db
projectGraph: 0 ms
Timer 'azureContainerCheck' already exists
azureContainerCheck: 0 ms
Timer 'azureContainerCheck' does not exist
circuits: 0 ms
thinky: 0 ms
appGraph: 0 ms
🐙 EaC Runtime Ready http://localhost:6132
{"lc":1,"type":"constructor","id":["langchain_core","messages","HumanMessage"],"kwargs":{"content":"This is a test","additional_kwargs":{},"response_metadata":{}}}
{lc: 1, type: 'constructor', id: Array(3), kwargs: {…}}

As you can see, the type is set to 'constructor': "type":"constructor"

The issue is that when I look at the logic for RemoteRunnable: https://github.com/langchain-ai/langchainjs/blob/9822a5c6d4a5e01b7f4746114696d81670c9331e/langchain-core/src/runnables/remote.ts#L259

It requires the 'type' to be the class name of the instance: https://github.com/langchain-ai/langchainjs/blob/9822a5c6d4a5e01b7f4746114696d81670c9331e/langchain-core/src/runnables/remote.ts#L62

As you can see, the type is always 'constructor', so when the remote tries to unpack it, it sees the string 'constructor' instead of something like 'HumanMessage' or 'ToolMessageChunk'. So I feel like I am missing something in our attempt to replicate LangServe, in the attempt to provide our own hosts that can meet the RemoteRunnable interface.
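A minimal sketch of the mismatch being described (the serialized shape mirrors the logged toJSON() output above; `reviveType` is a simplified, illustrative stand-in for the real check in remote.ts, not the actual code):

```typescript
// What Serializable.toJSON() emits, per the log above:
const serialized = {
  lc: 1,
  type: "constructor",
  id: ["langchain_core", "messages", "HumanMessage"],
  kwargs: { content: "This is a test", additional_kwargs: {} },
};

// What the revive logic keys off of: a `type` field matching a message
// class name (hypothetical simplification of the remote.ts check):
function reviveType(obj: { type: string; id?: string[] }): string | undefined {
  const known = new Set(["HumanMessage", "AIMessage", "ToolMessageChunk"]);
  return known.has(obj.type) ? obj.type : undefined;
}

console.log(reviveType(serialized)); // undefined — "constructor" matches nothing
console.log(reviveType({ type: "HumanMessage" })); // "HumanMessage"
```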

Description

Maybe I put too much in the error message section...

System Info

deno https://github.com/fathym-deno/thinky-runtime

dosubot[bot] commented 2 weeks ago

To resolve the issue with the RemoteRunnable interface where the type is set to 'constructor' instead of the class name, you need to ensure that the _getType method in your message classes returns the correct class name. This method is crucial for the serialization process to correctly identify the type of the message.

Here is how you can implement the _getType method in your message classes:

class HumanMessage extends BaseMessage {
  _getType() {
    return "HumanMessage";
  }
}

class ToolMessageChunk extends BaseMessage {
  _getType() {
    return "ToolMessageChunk";
  }
}

// Repeat for other message types...

Make sure that each subclass of BaseMessage overrides the _getType method to return its specific class name. This will ensure that the type property is correctly set during serialization.

Additionally, verify that the serialize function in your code uses this _getType method correctly:

function serialize<RunInput>(input: RunInput): any {
  if (Array.isArray(input)) return input.map(serialize);
  if (isBaseMessage(input)) {
    const serializedMessage: Record<string, any> = {
      content: input.content,
      type: input._getType(), // Ensure this returns the correct class name
      additional_kwargs: input.additional_kwargs,
      name: input.name,
      example: false,
    };
    if (ToolMessage.isInstance(input)) {
      serializedMessage.tool_call_id = input.tool_call_id;
    } else if (ChatMessage.isInstance(input)) {
      serializedMessage.role = input.role;
    }
    return serializedMessage;
  }
  if (typeof input === "object") {
    if (!input || input instanceof Date) {
      return input;
    }
    const keysArr = Object.keys(input);
    const innerSerialize: (key: string) => [string, any] = (key: string) => [
      key,
      serialize((input as Record<string, unknown>)[key]),
    ];
    const rtn = Object.fromEntries(keysArr.map(innerSerialize));
    return rtn;
  }
  return input;
}

By ensuring that the _getType method returns the correct class name, the type property will be set correctly during serialization, allowing the RemoteRunnable interface to properly revive the results of BaseMessages [1].

If you have custom message types, make sure they also implement the _getType method appropriately. This will help in correctly identifying and handling these custom types during the serialization and deserialization process.

To continue talking to Dosu, mention @dosu.

mcgear commented 2 weeks ago

Should I have to do this when using the built-in message classes like HumanMessage, @dosu?

dosubot[bot] commented 2 weeks ago

No, you do not need to implement the _getType method for built-in message classes like HumanMessage in LangChain.js, as it is already implemented in the provided code.

/**
 * Represents a human message in a conversation.
 */
export class HumanMessage extends BaseMessage {
  static lc_name() {
    return "HumanMessage";
  }

  _getType(): MessageType {
    return "human";
  }
}

The _getType method is already defined and returns the correct type field during serialization. Additionally, the BaseMessage class, which HumanMessage extends, includes the necessary structure and methods to handle message serialization and type management, ensuring that the type field is correctly set [1][2].
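The distinction the two answers are circling (illustrative sketch, based only on the snippets quoted in this thread): `lc_name()` is a static method carrying the class name used in the serialized `id` array, while `_getType()` returns a short role string like "human", not the class name.

```typescript
// Stripped-down stand-ins for the quoted langchain-core classes; the real
// BaseMessage carries content, kwargs, and serialization machinery.
class BaseMessage {}

class HumanMessage extends BaseMessage {
  static lc_name() {
    return "HumanMessage"; // class name, used in the serialized `id` array
  }

  _getType() {
    return "human"; // short role string, NOT the class name
  }
}

console.log(HumanMessage.lc_name()); // "HumanMessage"
console.log(new HumanMessage()._getType()); // "human"
```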

To continue talking to Dosu, mention @dosu.

mcgear commented 2 weeks ago

The issue seems to be in the toJSON serialization implementation: https://github.com/langchain-ai/langchainjs/blob/ac204f845ba4a94a8f446cfc5d3906ff776af858/langchain-core/src/load/serializable.ts#L211

It seems like this code needs to be updated to return the class name, or use the lc_name as the type, in order to match up with the revive logic in the RemoteRunnable.
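One server-side workaround sketch, under the assumptions in this thread: before a custom LangServe-style endpoint returns results, rewrite the lc "constructor" envelope into the flat shape RemoteRunnable's revive logic looks for, using the last element of `id` as the type. The field names mirror the observed toJSON() output; `toRemoteShape` is a hypothetical helper, not part of langchain-core.

```typescript
interface LcSerialized {
  lc: number;
  type: "constructor";
  id: string[]; // e.g. ["langchain_core", "messages", "HumanMessage"]
  kwargs: Record<string, unknown>;
}

function isLcSerialized(x: unknown): x is LcSerialized {
  return (
    typeof x === "object" &&
    x !== null &&
    (x as LcSerialized).type === "constructor" &&
    Array.isArray((x as LcSerialized).id)
  );
}

// Recursively flatten lc envelopes: hoist kwargs and replace `type` with
// the class name taken from the end of the `id` array.
function toRemoteShape(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(toRemoteShape);
  if (isLcSerialized(value)) {
    const className = value.id[value.id.length - 1]; // "HumanMessage"
    return { type: className, ...(toRemoteShape(value.kwargs) as object) };
  }
  if (typeof value === "object" && value !== null) {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) => [k, toRemoteShape(v)])
    );
  }
  return value;
}

const wire = toRemoteShape({
  lc: 1,
  type: "constructor",
  id: ["langchain_core", "messages", "HumanMessage"],
  kwargs: { content: "hi", additional_kwargs: {} },
});
// → { type: "HumanMessage", content: "hi", additional_kwargs: {} }
```

This keeps the wire format aligned with what the revive logic expects without touching langchain-core itself, at the cost of duplicating the class-name mapping on the server.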

mcgear commented 4 days ago

I'm now getting an issue from this line: https://github.com/langchain-ai/langchainjs/blob/9822a5c6d4a5e01b7f4746114696d81670c9331e/langchain-core/src/runnables/remote.ts#L185

Reading the comment:

if (isSuperset(keys, new Set(["messages"]))) {
  // TODO: Start checking for type: ChatPromptValue and ChatPromptValueConcrete
  // when LangServe bug is fixed
  return new ChatPromptValue({
    // eslint-disable-next-line @typescript-eslint/no-explicit-any
    messages: obj.messages.map((msg: any) => revive(msg)),
  });
}

I'm wondering what bug in LangServe is being referenced here, and how to deal with the error I am getting:

remote.js:162 Uncaught (in promise) TypeError: obj.messages.map is not a function
    at revive (remote.js:162:40)
    at innerRevive (remote.js:173:13)
    at Array.map (<anonymous>)
    at revive (remote.js:175:48)
    at innerRevive (remote.js:173:13)
    at Array.map (<anonymous>)
    at revive (remote.js:175:48)
    at innerRevive (remote.js:173:13)
    at Array.map (<anonymous>)
    at revive (remote.js:175:48)
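One hedged guess at this TypeError, consistent with the serialization issue above: if the payload is still in the lc "constructor" envelope, the messages array lives under `kwargs.messages` rather than directly at `obj.messages`, so `obj.messages` is undefined (or a non-array) and `.map` fails. A defensive unwrap could look like this — `getMessages` is purely illustrative, not langchain-core code:

```typescript
// Look for the messages array both where revive expects it and where the
// lc "constructor" envelope actually puts it.
function getMessages(obj: any): unknown[] {
  if (Array.isArray(obj?.messages)) return obj.messages;
  if (Array.isArray(obj?.kwargs?.messages)) return obj.kwargs.messages;
  throw new TypeError("no messages array found on object");
}

// Envelope shape matching the logged toJSON() output for a prompt value:
const envelope = {
  lc: 1,
  type: "constructor",
  id: ["langchain_core", "prompt_values", "ChatPromptValue"],
  kwargs: { messages: [{ type: "human", content: "hi" }] },
};

console.log(getMessages(envelope).length); // 1
```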