deepset-ai / hayhooks

Deploy Haystack pipelines behind a REST API.
https://haystack.deepset.ai
Apache License 2.0

Hayhook with custom haystack components. #36

Open satyakisen opened 2 months ago

satyakisen commented 2 months ago

Hi Team,

I am trying to call the REST API to run a pipeline with multiple custom components. I could not find an example of this in the hayhooks repository. It would be helpful if some examples were provided for this use case.

Thanks in advance.

vblagoje commented 2 months ago

How far have you gotten and where exactly did you face issues @satyakisen ?

satyakisen commented 1 month ago

@vblagoje I am able to use basic custom Haystack components by installing my component through Poetry. However, when my pipeline contains a Python object such as a Haystack token secret or a chat message, the pipeline dump generates a tag like !!python/object:haystack.dataclasses.chat_message.ChatMessage.

When I deploy this YAML file in hayhooks, it throws a YAML parsing error. Can you please help me with this error?

As far as I can see, this happens because the YAML is read with safe_load, which refuses to deserialize Python objects such as these Haystack ones. Is there any way I could use a custom marshaller when deserializing the YAML file?
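For illustration, one possible way around tags like !!python/object is to keep only plain types (strings, lists, dicts) in the serialized form and rebuild the objects after loading. The following is a minimal sketch using only the standard library. `Role`, `Message`, `dump_template`, and `load_template` are hypothetical stand-ins, not Haystack's ChatMessage or marshaller API, and JSON is used only to avoid a third-party dependency; a safe YAML loader accepts the same plain-dict shape.

```python
import json
from dataclasses import dataclass, field
from enum import Enum


class Role(str, Enum):
    # Hypothetical stand-in for a chat role enum
    USER = "user"
    SYSTEM = "system"


@dataclass
class Message:
    # Hypothetical stand-in for a chat message dataclass
    content: str
    role: Role
    meta: dict = field(default_factory=dict)

    def to_dict(self) -> dict:
        # Emit only plain types so no language-specific tag is needed
        return {"content": self.content, "role": self.role.value, "meta": self.meta}

    @classmethod
    def from_dict(cls, data: dict) -> "Message":
        # Rebuild the object explicitly instead of relying on the loader
        return cls(content=data["content"], role=Role(data["role"]), meta=data.get("meta", {}))


def dump_template(messages: list[Message]) -> str:
    return json.dumps({"template": [m.to_dict() for m in messages]})


def load_template(text: str) -> list[Message]:
    return [Message.from_dict(d) for d in json.loads(text)["template"]]


original = [Message("Query: {{query}}", Role.USER)]
restored = load_template(dump_template(original))
```

The point of the design is that the serialized text never names a Python class, so the loader does not need permission to construct arbitrary objects.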

Elaborating on the above issue for clarity.

Overview

We can dump a Haystack pipeline to a YAML file and later load the same file and run the pipeline, as described in the Haystack documentation.

In this experiment we use only out-of-the-box Haystack classes (ChatPromptBuilder, ChatMessage).

Running the experiment shows that although the pipeline serializes successfully, deserializing it raises an error.

Reproduction

The error can be reproduced by copying the Pipeline Codebase below into a pipeline.py file and then running the following command:

# Ran on windows git bash
python.exe pipeline.py > ./error_msg.txt 2>&1

Codebase

Pipeline Codebase

from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.dataclasses import ChatMessage

def create_prompt_builder():
    template: str = """
            Query: {{query}}

            Instruction:
                {{instruction}}

            Context:
            {% for document in documents: %}
                    {{document}}
            {% endfor %}
        """
    return ChatPromptBuilder(template=[ChatMessage.from_user(template)])

def dump() -> None:
    pipeline = Pipeline()
    prompt_builder = create_prompt_builder()
    pipeline.add_component('prompt_builder', prompt_builder)

    with open("./yamls/test_pipeline_001.yml", "w") as file:
        pipeline.dump(file)

def load() -> None:
    pipeline = Pipeline()
    with open("./yamls/test_pipeline_001.yml", "r") as file:
        pipeline.load(file)

if __name__ == '__main__':
    dump()
    load()

Pipeline YAML content

components:
  prompt_builder:
    init_parameters:
      required_variables: []
      template:
      - !!python/object:haystack.dataclasses.chat_message.ChatMessage
        content: "\n            Query: {{query}}\n\n            Instruction:\n   \
          \             {{instruction}}\n                               \n\n     \
          \       Context:\n            {% for document in documents: %}\n       \
          \             {{document}}\n            {% endfor %}\n        "
        meta: {}
        name: null
        role: !!python/object/apply:haystack.dataclasses.chat_message.ChatRole
        - user
      variables: null
    type: haystack.components.builders.chat_prompt_builder.ChatPromptBuilder
connections: []
max_loops_allowed: 100
metadata: {}

Error while deserializing

Traceback (most recent call last):
  File "C:\Project\POC\ML\GENAI\Haystack\experiment\pipeline.py", line 37, in <module>
    load()
  File "C:\Project\POC\ML\GENAI\Haystack\experiment\pipeline.py", line 33, in load
    pipeline.load(file)
  File "C:\Python\envs\user\Lib\site-packages\haystack\core\pipeline\base.py", line 258, in load
    return cls.from_dict(marshaller.unmarshal(fp.read()), callbacks)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\haystack\marshal\yaml.py", line 17, in unmarshal
    return yaml.safe_load(data_)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\yaml\__init__.py", line 125, in safe_load
    return load(stream, SafeLoader)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\yaml\__init__.py", line 81, in load
    return loader.get_single_data()
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\yaml\constructor.py", line 51, in get_single_data
    return self.construct_document(node)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\yaml\constructor.py", line 60, in construct_document
    for dummy in generator:
  File "C:\Python\envs\user\Lib\site-packages\yaml\constructor.py", line 408, in construct_yaml_seq
    data.extend(self.construct_sequence(node))
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\yaml\constructor.py", line 129, in construct_sequence
    return [self.construct_object(child, deep=deep)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\yaml\constructor.py", line 129, in <listcomp>
    return [self.construct_object(child, deep=deep)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\yaml\constructor.py", line 100, in construct_object
    data = constructor(self, node)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Python\envs\user\Lib\site-packages\yaml\constructor.py", line 427, in construct_undefined
    raise ConstructorError(None, None,
yaml.constructor.ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:python/object:haystack.dataclasses.chat_message.ChatMessage'
  in "<unicode string>", line 6, column 9:
          - !!python/object:haystack.datacla ... 
            ^

Requirements

python = "^3.11"
haystack-ai = "2.2.3"

ParseDark commented 1 month ago

Same issue here. When I try to use a custom component in my pipeline, hayhooks throws an error:

hayhooks deploy yml/start.yml
Error deploying pipeline: Unable to parse Haystack Pipeline start: Component '__main__.OpenAIFormatConverter' not imported.

I'm not sure, but I think the Haystack team does not want to maintain this project anymore and would rather add more features to Haystack itself. But they are forgetting one thing: deployment matters more, because a Haystack pipeline that has only been created is just a toy. Being able to deploy the pipeline on a production server is the main goal.

jimjones26 commented 1 month ago

I wanted to comment and say I am having the same issue. A pipeline that uses custom components fails to deploy and throws the following error:

Error deploying pipeline: Unable to parse Haystack Pipeline ingestion_pipeline: Component 'custom_components.get_page_source.CustomComponent' not imported.
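The "not imported" message suggests that the process loading the YAML could not find the class named in the component's type field. As a rough sketch of why that can happen, here is how a dotted type name can be resolved with the standard library. `resolve_type` is a hypothetical helper written for illustration, not part of Haystack or hayhooks.

```python
import importlib


def resolve_type(dotted_name: str):
    """Return the object named by 'package.module.ClassName', or None if it cannot be found."""
    module_name, _, attr = dotted_name.rpartition(".")
    try:
        module = importlib.import_module(module_name)
    except (ImportError, ValueError):
        # The module is not installed (or the name has no module part)
        return None
    return getattr(module, attr, None)


# A class from a package available in the environment resolves in any process.
found = resolve_type("collections.OrderedDict")

# A module that is not installed in the serving environment does not resolve.
missing = resolve_type("custom_components.get_page_source.CustomComponent")
```

A type recorded as `__main__.Something` points at whichever script was running when the pipeline was dumped, so it will generally not exist in the server process. Moving the component into an installable module that the hayhooks environment can import is one way to avoid that.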

alex-stoica commented 3 weeks ago

@jimjones26 @ParseDark @satyakisen maybe it's not the solution you're looking for, but running custom components in a pipeline with hayhooks containerized worked for me: https://github.com/deepset-ai/hayhooks/pull/27. You can also find the full archived code there.

Now, if you don't want to use Docker, the problem might be more difficult.

alex-stoica commented 3 weeks ago

I second @ParseDark: functional deployment code is highly important, ideally with the ability to execute multiple pipeline runs concurrently.

alex-stoica commented 1 week ago

@jimjones26 When you encountered

Unable to parse Haystack Pipeline ingestion_pipeline: Component 'custom_components.get_page_source.CustomComponent' not imported.

did you also check that your custom component is decorated with @component? I've noticed the same error when I forgot to add the decorator.
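To illustrate why a missing decorator could produce the same message, here is a simplified registry pattern. It assumes that decorated classes are recorded under their full type name and looked up by that name at load time. `REGISTRY`, `register`, and `lookup` are hypothetical names for this sketch, not Haystack's internals.

```python
REGISTRY: dict[str, type] = {}


def register(cls: type) -> type:
    """Class decorator that records the class under 'module.ClassName'."""
    REGISTRY[f"{cls.__module__}.{cls.__name__}"] = cls
    return cls


def lookup(type_name: str) -> type:
    """Return a registered class, or fail in the same spirit as the error above."""
    if type_name not in REGISTRY:
        raise LookupError(f"Component '{type_name}' not imported.")
    return REGISTRY[type_name]


@register
class Decorated:
    pass


class Undecorated:
    # Importable, but never added to the registry
    pass
```

In this sketch, looking up `Decorated` by its full name succeeds, while looking up `Undecorated` raises the error even though the class exists and its module is imported.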