otto8-ai / otto8

Open source AI Agent Platform
Apache License 2.0

Knowledge - Ingestion failure with "failed to clean website content" error #386

Open sangee2004 opened 4 weeks ago

sangee2004 commented 4 weeks ago

Steps to reproduce the problem:

  1. Create an agent with knowledge from acorn.io
  2. Once all files are scraped, ingest all files.

Ingestion of 1 file fails

(Screenshot attached, taken 2024-10-30 at 5:43 PM.)
cjellick commented 2 weeks ago

@sangee2004 if you see this, we need the debug details

sangee2004 commented 2 weeks ago

Hit this error for one of the files, www.acorn.io/resources/blog/introducing-gptscript.md, when ingesting the acorn.io website.

Agent: https://test.otto8.ai/api/agents/a18bbqq/

Failed run:

    {
      "id": "e49a89d4b24774ba6984a074e4c11e859e60e3a640fe2673a3a0e187ec0fd5c6",
      "created": "2024-11-12T17:42:59Z",
      "fileName": "www.acorn.io/resources/blog/introducing-gptscript.md",
      "state": "error",
      "error": "failed to clean website content: failed to run: unexpected EOF",
      "agentID": "a18bbqq",
      "knowledgeSetID": "kst1-a18bbqq",
      "knowledgeSourceID": "ks1nzntb",
      "approved": true,
      "url": "https://www.acorn.io/resources/blog/introducing-gptscript",
      "updatedAt": "2024-11-12T17:42:57Z",
      "checksum": "021037c0dc441d5f3725ea6ac361d09d03afa284cfd31ba2d47d996a7a49448c",
      "lastIngestionStartTime": "2024-11-12T17:43:18Z",
      "lastIngestionEndTime": "2024-11-12T17:43:52Z",
      "lastRunIDs": [
        "r169f49"
      ],
      "sizeInBytes": 9700
    },

Debug log for the run:

{
  "frames": {
    "1731433556": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "",
      "end": "0001-01-01T00:00:00Z",
      "id": "1731433556",
      "input": "[![](https://grateful-confidence-e8d3628efb.media.strapiapp.com/acorn_logo_h_5b9f9fcaf6.svg)](/)\n\n- [Resources](/resources)\n\n\n\n\n\n\n\n  [Resources](/resources)\n\n  [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/stand_178f648fb8.svg)Tutorials](/resources/tutorials) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/rocket_02_235d3f1a85.svg)Events](/events) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/book_open_01_c8e9ccba00.svg)Blog](/resources/blog) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/file_attachment_04_b6e417fe08.svg)Docs](https://docs.gptscript.ai/) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/grid_01_2f32e65e18.svg)Tools](https://tools.gptscript.ai/) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/Discord_02_dfc0910378.svg)Discord](https://discord.com/invite/9sSf4UyAMC) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/8666686_github_icon_1_6190717ec3.svg)Github](https://github.com/gptscript-ai/gptscript)\n\n\n\n\n\n\n\n\n\n  [Learning Center](/resources/learning-center)\n\n\n\n  [Models](/resources/learning-center?category=23)\n\n  * * *\n\n\n\n- [OpenAI GPT-4](/resources/learning-center/openai)\n- [Anthropic Claude](/resources/learning-center/anthropic-claude)\n- [Cohere AI](/resources/learning-center/cohere-ai)\n- [Google Gemini](/resources/learning-center/google-gemini)\n- [Meta Llama](/resources/learning-center/meta-llama)\n- [Mistral](/resources/learning-center/mistral-ai)\n- [Mistral 7b](/resources/learning-center/mistral-7b)\n\n[Tools and Topics](/resources/learning-center?category=24)\n\n* * *\n\n- [Fine Tuning LLM](/resources/learning-center/fine-tuning-llm)\n- [Generative AI Apps](/resources/learning-center/generative-ai-applications)\n- [AI Agents](/resources/learning-center/ai-agents)\n- [Claude API](/resources/learning-center/claude-api)\n- [Google Gemini API](/resources/learning-center/google-gemini-api)\n- [LLM Application Development](/resources/learning-center/llm-application-development)\n- [LLM Security](/resources/learning-center/llm-security)\n- [Prompt Engineering](/resources/learning-center/prompt-engineering)\n\n[Use Cases](/resources/learning-center?category=25)\n\n* * *\n\n- [Retrieval Augmented Generation (RAG)](/resources/learning-center/retrieval-augmented-generation)\n- [AI Copilots](/resources/learning-center/ai-copilots)\n- [AI Image Generation](/resources/learning-center/ai-image-generation)\n- [AI Video Generators](/resources/learning-center/ai-video-generators)\n- [AI Summarization](/resources/learning-center/ai-summarization)\n- [Code Interpretation](/resources/learning-center/code-interpreter)\n\n[Explore All Articles](/resources/learning-center)\n\n- [Docs](http://docs.gptscript.ai)\n- [Tools](http://tools.gptscript.ai)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Company](/about-us)\n\n[Try GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release) [Request a Demo](/contact)\n\n![Icon Nav Burger](/assets/icons/iconNavBurger.svg)\n\nBlog\n\n# Introducing GPTScript ...Officially\n\n###### Mar 15, 2024 byDarren Shepherd\n\n![icon link to Reddit](/assets/icons/iconReddit.svg)![icon link to Facebook](/assets/icons/iconFacebook.svg)![icon link to Linkedin](/assets/icons/iconLinkedin.svg)![icon link to 
Twitter](/assets/icons/iconXBlack.svg)\n\nLast month I introduced GPTScript with a [tweet](https://twitter.com/ibuildthecloud/status/1757789265264796003) and it’s been exciting to see how it’s captured people’s imagination. But other than that tweet and the readme on our [GitHub repo](https://github.com/gptscript-ai/gptscript), I never wrote a proper blog introducing GPTScript, which feels like an oversight. I am incredibly excited about the potential of GPTScript, and at Acorn, we've [decided](/blog/our-new-focus-developing-an-llm-app-platform-based-on-gpt-script-technology) to put all of our focus on building solutions based on it going forward.\n\nGPTScript started as an experiment to see what would happen if we took a natural language only approach to programming. Programming by only writing English or your native language like French or Chinese. What came of this effort shocked us. While a completely pure natural language experience is still some ways off, what we did discover is an elegantly simple model of blending AI with traditional systems, data, and code by starting first with natural language and then blending in code.\n\nGPTScript works by writing tools. A tool is either a prompt, which is pure AI, or some code. Tools can then be linked together with AI, AI deciding when the tool should be invoked and with what arguments. Below is a simple tool that is a prompt. That tool is then given the tool “sys.http.html2text” that will download a URL and turn it into readable text.\n\n**_Related Content:_** [A Guide to Prompt Engineering](https://www.acorn.io/resources/learning-center/prompt-engineering).\n\nCreate a file named announce.gpt containing the following:\n\n```yaml\n\ntools: sys.http.html2text\n\ncan you read https://raw.githubusercontent.com/gptscript-ai/gptscript/main/README.md\nand then write an announcement blog for GPTScript\n```\n\nAI will determine that the sys.http.html2text tool needs to be invoked with the referenced URL, will invoke it, get the response, and then complete the prompt.\n\nAs this is an AI project it seems fitting that we should run the above tool with GPTScript and let it announce itself to the world.\n\n```javascript\n\n$ gptscript announce.gpt\n```\n\n* * *\n\n# Introducing GPTScript: Revolutionizing Programming with Natural Language\n\nWe are thrilled to announce the launch of GPTScript, a cutting-edge scripting language designed to redefine the way we interact with Large Language Models (LLMs), particularly with OpenAI's powerful engines (Related Content: check out our [guide to the top LLMs](https://www.acorn.io/resources/learning-center/best-llm)). GPTScript embodies the future of programming by merging the simplicity of natural language with the robustness of traditional scripting, creating a seamless and user-friendly programming experience.\n\n**Why GPTScript?**\n\nIn today's fast-paced digital world, the ability to quickly and efficiently automate tasks, analyze data, and build intelligent applications is more crucial than ever. 
GPTScript rises to this challenge by offering a programming language that is intuitive, versatile, and incredibly easy to learn, even for those without a background in coding.\n\n**Key Features:**\n\n- **Ease of Use:** GPTScript's syntax is primarily based on natural language, making it accessible to a wide range of users.\n- **Versatility:** Whether it's automating tasks, performing data analysis, or integrating with external services, GPTScript is equipped to handle a variety of use cases.\n- **Integration:** GPTScript allows for seamless integration with traditional scripts (e.g., bash, python) and external HTTP services, expanding its capabilities and applications.\n\n**Exciting Use Cases:**\n\nGPTScript opens up a world of possibilities. Here are just a few examples of what you can achieve with GPTScript:\n\n- Automate complex tasks with ease, from planning vacations to managing databases.\n- Create intelligent agents and assistants that can perform a wide range of functions.\n- Conduct sophisticated data analysis and visualization.\n- Develop applications with capabilities in vision, image, and audio processing.\n\n**Get Started with GPTScript:**\n\nReady to dive into GPTScript? Here's how you can get started:\n\n1. **Installation:** GPTScript is available for macOS, Linux, and Windows. You can install it using package managers like Homebrew, Scoop, and WinGet, or download it directly from our [releases page](https://github.com/gptscript-ai/gptscript/releases).\n2. **API Key:** Obtain an API key from [OpenAI](https://platform.openai.com/api-keys) to enable your scripts to communicate with OpenAI's LLMs.\n3. **Hello World:** Test your setup with a simple \"Hello, World!\" script to see GPTScript in action.\n\n**Join Our Community:**\n\nBe part of the GPTScript revolution! Join our [Discord community](https://discord.gg/9sSf4UyAMC) to connect with other GPTScript enthusiasts, share your projects, and get support from the team behind GPTScript.\n\n**About Us:**\n\nGPTScript is brought to you by Acorn Labs, Inc., a team dedicated to pushing the boundaries of AI and programming. Our mission is to make powerful technology accessible to everyone, and GPTScript is a significant step towards achieving that goal.\n\nGet ready to experience programming like never before with GPTScript. Start building, automating, and innovating today!\n\n* * *\n\n- [About Us](/about-us)\n- [Contact Us](/contact )\n- [Tutorials](/resources/tutorials)\n- [Blog](/resources/blog)\n- [Events](/events)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [Open Source](/resources/blog/open-source)\n\n* * *\n\nLoading...\n\nTo unsubscribe at any time please see our [Privacy Policy](/privacy-policy).\n\n* * *\n\nGet Started with GPTScript [Install GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\nCopyright © 2024. All rights reserved. Acorn Labs, Inc.\n\n[Terms of Service](/terms-of-use)\n\n[![social media logo](/assets/icons/ghFooter.svg)](https://github.com/gptscript-ai/gptscript)[![social media logo](/assets/icons/twitterFooter.svg)](https://twitter.com/acornlabs)[![social media logo](/assets/icons/youtubeFooter.svg)](https://www.youtube.com/c/AcornLabs)[![social media logo](/assets/icons/inFooter.svg)](https://www.linkedin.com/company/acorn-io)",
      "inputContext": null,
      "llmRequest": {
        "chatCompletion": {
          "messages": [
            {
              "content": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
              "role": "system"
            },
            {
              "content": "[![](https://grateful-confidence-e8d3628efb.media.strapiapp.com/acorn_logo_h_5b9f9fcaf6.svg)](/)\n\n- [Resources](/resources)\n\n\n\n\n\n\n\n  [Resources](/resources)\n\n  [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/stand_178f648fb8.svg)Tutorials](/resources/tutorials) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/rocket_02_235d3f1a85.svg)Events](/events) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/book_open_01_c8e9ccba00.svg)Blog](/resources/blog) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/file_attachment_04_b6e417fe08.svg)Docs](https://docs.gptscript.ai/) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/grid_01_2f32e65e18.svg)Tools](https://tools.gptscript.ai/) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/Discord_02_dfc0910378.svg)Discord](https://discord.com/invite/9sSf4UyAMC) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/8666686_github_icon_1_6190717ec3.svg)Github](https://github.com/gptscript-ai/gptscript)\n\n\n\n\n\n\n\n\n\n  [Learning Center](/resources/learning-center)\n\n\n\n  [Models](/resources/learning-center?category=23)\n\n  * * *\n\n\n\n- [OpenAI GPT-4](/resources/learning-center/openai)\n- [Anthropic Claude](/resources/learning-center/anthropic-claude)\n- [Cohere AI](/resources/learning-center/cohere-ai)\n- [Google Gemini](/resources/learning-center/google-gemini)\n- [Meta Llama](/resources/learning-center/meta-llama)\n- [Mistral](/resources/learning-center/mistral-ai)\n- [Mistral 7b](/resources/learning-center/mistral-7b)\n\n[Tools and Topics](/resources/learning-center?category=24)\n\n* * *\n\n- [Fine Tuning LLM](/resources/learning-center/fine-tuning-llm)\n- [Generative AI Apps](/resources/learning-center/generative-ai-applications)\n- [AI Agents](/resources/learning-center/ai-agents)\n- [Claude API](/resources/learning-center/claude-api)\n- [Google Gemini API](/resources/learning-center/google-gemini-api)\n- [LLM Application Development](/resources/learning-center/llm-application-development)\n- [LLM Security](/resources/learning-center/llm-security)\n- [Prompt Engineering](/resources/learning-center/prompt-engineering)\n\n[Use Cases](/resources/learning-center?category=25)\n\n* * *\n\n- [Retrieval Augmented Generation (RAG)](/resources/learning-center/retrieval-augmented-generation)\n- [AI Copilots](/resources/learning-center/ai-copilots)\n- [AI Image Generation](/resources/learning-center/ai-image-generation)\n- [AI Video Generators](/resources/learning-center/ai-video-generators)\n- [AI Summarization](/resources/learning-center/ai-summarization)\n- [Code Interpretation](/resources/learning-center/code-interpreter)\n\n[Explore All Articles](/resources/learning-center)\n\n- [Docs](http://docs.gptscript.ai)\n- [Tools](http://tools.gptscript.ai)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Company](/about-us)\n\n[Try GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release) [Request a Demo](/contact)\n\n![Icon Nav Burger](/assets/icons/iconNavBurger.svg)\n\nBlog\n\n# Introducing GPTScript ...Officially\n\n###### Mar 15, 2024 byDarren Shepherd\n\n![icon link to Reddit](/assets/icons/iconReddit.svg)![icon link to Facebook](/assets/icons/iconFacebook.svg)![icon link to Linkedin](/assets/icons/iconLinkedin.svg)![icon link to 
Twitter](/assets/icons/iconXBlack.svg)\n\nLast month I introduced GPTScript with a [tweet](https://twitter.com/ibuildthecloud/status/1757789265264796003) and it’s been exciting to see how it’s captured people’s imagination. But other than that tweet and the readme on our [GitHub repo](https://github.com/gptscript-ai/gptscript), I never wrote a proper blog introducing GPTScript, which feels like an oversight. I am incredibly excited about the potential of GPTScript, and at Acorn, we've [decided](/blog/our-new-focus-developing-an-llm-app-platform-based-on-gpt-script-technology) to put all of our focus on building solutions based on it going forward.\n\nGPTScript started as an experiment to see what would happen if we took a natural language only approach to programming. Programming by only writing English or your native language like French or Chinese. What came of this effort shocked us. While a completely pure natural language experience is still some ways off, what we did discover is an elegantly simple model of blending AI with traditional systems, data, and code by starting first with natural language and then blending in code.\n\nGPTScript works by writing tools. A tool is either a prompt, which is pure AI, or some code. Tools can then be linked together with AI, AI deciding when the tool should be invoked and with what arguments. Below is a simple tool that is a prompt. That tool is then given the tool “sys.http.html2text” that will download a URL and turn it into readable text.\n\n**_Related Content:_** [A Guide to Prompt Engineering](https://www.acorn.io/resources/learning-center/prompt-engineering).\n\nCreate a file named announce.gpt containing the following:\n\n```yaml\n\ntools: sys.http.html2text\n\ncan you read https://raw.githubusercontent.com/gptscript-ai/gptscript/main/README.md\nand then write an announcement blog for GPTScript\n```\n\nAI will determine that the sys.http.html2text tool needs to be invoked with the referenced URL, will invoke it, get the response, and then complete the prompt.\n\nAs this is an AI project it seems fitting that we should run the above tool with GPTScript and let it announce itself to the world.\n\n```javascript\n\n$ gptscript announce.gpt\n```\n\n* * *\n\n# Introducing GPTScript: Revolutionizing Programming with Natural Language\n\nWe are thrilled to announce the launch of GPTScript, a cutting-edge scripting language designed to redefine the way we interact with Large Language Models (LLMs), particularly with OpenAI's powerful engines (Related Content: check out our [guide to the top LLMs](https://www.acorn.io/resources/learning-center/best-llm)). GPTScript embodies the future of programming by merging the simplicity of natural language with the robustness of traditional scripting, creating a seamless and user-friendly programming experience.\n\n**Why GPTScript?**\n\nIn today's fast-paced digital world, the ability to quickly and efficiently automate tasks, analyze data, and build intelligent applications is more crucial than ever. 
GPTScript rises to this challenge by offering a programming language that is intuitive, versatile, and incredibly easy to learn, even for those without a background in coding.\n\n**Key Features:**\n\n- **Ease of Use:** GPTScript's syntax is primarily based on natural language, making it accessible to a wide range of users.\n- **Versatility:** Whether it's automating tasks, performing data analysis, or integrating with external services, GPTScript is equipped to handle a variety of use cases.\n- **Integration:** GPTScript allows for seamless integration with traditional scripts (e.g., bash, python) and external HTTP services, expanding its capabilities and applications.\n\n**Exciting Use Cases:**\n\nGPTScript opens up a world of possibilities. Here are just a few examples of what you can achieve with GPTScript:\n\n- Automate complex tasks with ease, from planning vacations to managing databases.\n- Create intelligent agents and assistants that can perform a wide range of functions.\n- Conduct sophisticated data analysis and visualization.\n- Develop applications with capabilities in vision, image, and audio processing.\n\n**Get Started with GPTScript:**\n\nReady to dive into GPTScript? Here's how you can get started:\n\n1. **Installation:** GPTScript is available for macOS, Linux, and Windows. You can install it using package managers like Homebrew, Scoop, and WinGet, or download it directly from our [releases page](https://github.com/gptscript-ai/gptscript/releases).\n2. **API Key:** Obtain an API key from [OpenAI](https://platform.openai.com/api-keys) to enable your scripts to communicate with OpenAI's LLMs.\n3. **Hello World:** Test your setup with a simple \"Hello, World!\" script to see GPTScript in action.\n\n**Join Our Community:**\n\nBe part of the GPTScript revolution! Join our [Discord community](https://discord.gg/9sSf4UyAMC) to connect with other GPTScript enthusiasts, share your projects, and get support from the team behind GPTScript.\n\n**About Us:**\n\nGPTScript is brought to you by Acorn Labs, Inc., a team dedicated to pushing the boundaries of AI and programming. Our mission is to make powerful technology accessible to everyone, and GPTScript is a significant step towards achieving that goal.\n\nGet ready to experience programming like never before with GPTScript. Start building, automating, and innovating today!\n\n* * *\n\n- [About Us](/about-us)\n- [Contact Us](/contact )\n- [Tutorials](/resources/tutorials)\n- [Blog](/resources/blog)\n- [Events](/events)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [Open Source](/resources/blog/open-source)\n\n* * *\n\nLoading...\n\nTo unsubscribe at any time please see our [Privacy Policy](/privacy-policy).\n\n* * *\n\nGet Started with GPTScript [Install GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\nCopyright © 2024. All rights reserved. Acorn Labs, Inc.\n\n[Terms of Service](/terms-of-use)\n\n[![social media logo](/assets/icons/ghFooter.svg)](https://github.com/gptscript-ai/gptscript)[![social media logo](/assets/icons/twitterFooter.svg)](https://twitter.com/acornlabs)[![social media logo](/assets/icons/youtubeFooter.svg)](https://www.youtube.com/c/AcornLabs)[![social media logo](/assets/icons/inFooter.svg)](https://www.linkedin.com/company/acorn-io)",
              "role": "user"
            }
          ],
          "model": "gpt-4o-mini",
          "temperature": 0
        },
        "toolMapping": {

        }
      },
      "llmResponse": null,
      "output": [
        {
          "content": "# Introducing GPTScript ...Officially\n\n###### Mar 15, 2024 by Darren Shepherd\n\nLast month I introduced GPTScript with a [tweet](https://twitter.com/ibuildthecloud/status/1757789265264796003) and it’s been exciting to see how it’s captured people’s imagination. But other than that tweet and the readme on our [GitHub repo](https://github.com/gptscript-ai/gptscript), I never wrote a proper blog introducing GPTScript, which feels like an oversight. I am incredibly excited about the potential of GPTScript, and at Acorn, we've [decided](/blog/our-new-focus-developing-an-llm-app-platform-based-on-gpt-script-technology) to put all of our focus on building solutions based on it going forward.\n\nGPTScript started as an experiment to see what would happen if we took a natural language only approach to programming. Programming by only writing English or your native language like French or Chinese. What came of this effort shocked us. While a completely pure natural language experience is still some ways off, what we did discover is an elegantly simple model of blending AI with traditional systems, data, and code by starting first with natural language and then blending in code.\n\nGPTScript works by writing tools. A tool is either a prompt, which is pure AI, or some code. Tools can then be linked together with AI, AI deciding when the tool should be invoked and with what arguments. Below is a simple tool that is a prompt. That tool is then given the tool “sys.http.html2text” that will download a URL and turn it into readable text.\n\n**_Related Content:_** [A Guide to Prompt Engineering](https://www.acorn.io/resources/learning-center/prompt-engineering).\n\nCreate a file named `announce.gpt` containing the following:\n\n```yaml\ntools: sys.http.html2text\n\ncan you read https://raw.githubusercontent.com/gptscript-ai/gptscript/main/README.md\nand then write an announcement blog for GPTScript\n```\n\nAI will determine that the sys.http.html2text tool needs to be invoked with the referenced URL, will invoke it, get the response, and then complete the prompt.\n\nAs this is an AI project it seems fitting that we should run the above tool with GPTScript and let it announce itself to the world.\n\n```javascript\n$ gptscript announce.gpt\n```\n\n* * *\n\n# Introducing GPTScript: Revolutionizing Programming with Natural Language\n\nWe are thrilled to announce the launch of GPTScript, a cutting-edge scripting language designed to redefine the way we interact with Large Language Models (LLMs), particularly with OpenAI's powerful engines (Related Content: check out our [guide to the top LLMs](https://www.acorn.io/resources/learning-center/best-llm)). GPTScript embodies the future of programming by merging the simplicity of natural language with the robustness of traditional scripting, creating a seamless and user-friendly programming experience.\n\n**Why GPTScript?**\n\nIn today's fast-paced digital world, the ability to quickly and efficiently automate tasks, analyze data, and build intelligent applications is more crucial than ever. 
GPTScript rises to this challenge by offering a programming language that is intuitive, versatile, and incredibly easy to learn, even for those without a background in coding.\n\n**Key Features:**\n\n- **Ease of Use:** GPTScript's syntax is primarily based on natural language, making it accessible to a wide range of users.\n- **Versatility:** Whether it's automating tasks, performing data analysis, or integrating with external services, GPTScript is equipped to handle a variety of use cases.\n- **Integration:** GPTScript allows for seamless integration with traditional scripts (e.g., bash, python) and external HTTP services, expanding its capabilities and applications.\n\n**Exciting Use Cases:**\n\nGPTScript opens up a world of possibilities. Here are just a few examples of what you can achieve with GPTScript:\n\n- Automate complex tasks with ease, from planning vacations to managing databases.\n- Create intelligent agents and assistants that can perform a wide range of functions.\n- Conduct sophisticated data analysis and visualization.\n- Develop applications with capabilities in vision, image, and audio processing.\n\n**Get Started with GPTScript:**\n\nReady to dive into GPTScript? Here's how you can get started:\n\n1. **Installation:** GPTScript is available for macOS, Linux, and Windows. You can install it using package managers like Homebrew, Scoop, and WinGet, or download it directly from our [releases page](https://github.com/gptscript-ai/gptscript/releases).\n2. **API Key:** Obtain an API key from [OpenAI](https://platform.openai.com/api-keys) to enable your scripts to communicate with OpenAI's LLMs.\n3. **Hello World:** Test your setup with a simple \"Hello, World!\" script",
          "subCalls": null
        }
      ],
      "start": "2024-11-12T17:43:19.097746979Z",
      "tool": {
        "description": "Removes extra header, footer, and navigation content from the markdown version of webpages",
        "id": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner",
        "instructions": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
        "internalPrompt": null,
        "localTools": {
          "website markdown content cleaner": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner"
        },
        "modelName": "gpt-4o-mini",
        "name": "Website Markdown Content Cleaner",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/website-cleaner/tool.gpt"
        },
        "workingDir": "/otto8-tools/website-cleaner"
      },
      "toolResults": 0,
      "type": "callProgress",
      "usage": {

      }
    },
    "1731433557": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "Running sys.daemon",
      "end": "2024-11-12T17:43:19.097979835Z",
      "id": "1731433557",
      "input": "",
      "inputContext": null,
      "llmRequest": null,
      "llmResponse": null,
      "output": [
        {
          "content": "http://127.0.0.1:10709",
          "subCalls": null
        }
      ],
      "start": "2024-11-12T17:43:19.097941014Z",
      "tool": {
        "description": "Model provider for Otto8",
        "id": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8 Model Provider",
        "instructions": "#!sys.daemon /usr/bin/env python3 ${GPTSCRIPT_TOOL_DIR}/main.py",
        "internalPrompt": null,
        "localTools": {
          "otto8 model provider": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8 Model Provider"
        },
        "modelName": "gpt-4o",
        "modelProvider": true,
        "name": "Otto8 Model Provider",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/otto8-model-provider/tool.gpt"
        },
        "workingDir": "/otto8-tools/otto8-model-provider"
      },
      "toolCategory": "provider",
      "toolResults": 0,
      "type": "callFinish",
      "usage": {

      }
    }
  },
  "spec": {
    "synchronous": true,
    "threadName": "t1-ks1nzntb",
    "input": "[![](https://grateful-confidence-e8d3628efb.media.strapiapp.com/acorn_logo_h_5b9f9fcaf6.svg)](/)\n\n- [Resources](/resources)\n\n\n\n\n\n\n\n  [Resources](/resources)\n\n  [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/stand_178f648fb8.svg)Tutorials](/resources/tutorials) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/rocket_02_235d3f1a85.svg)Events](/events) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/book_open_01_c8e9ccba00.svg)Blog](/resources/blog) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/file_attachment_04_b6e417fe08.svg)Docs](https://docs.gptscript.ai/) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/grid_01_2f32e65e18.svg)Tools](https://tools.gptscript.ai/) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/Discord_02_dfc0910378.svg)Discord](https://discord.com/invite/9sSf4UyAMC) [![icon link](https://grateful-confidence-e8d3628efb.media.strapiapp.com/8666686_github_icon_1_6190717ec3.svg)Github](https://github.com/gptscript-ai/gptscript)\n\n\n\n\n\n\n\n\n\n  [Learning Center](/resources/learning-center)\n\n\n\n  [Models](/resources/learning-center?category=23)\n\n  * * *\n\n\n\n- [OpenAI GPT-4](/resources/learning-center/openai)\n- [Anthropic Claude](/resources/learning-center/anthropic-claude)\n- [Cohere AI](/resources/learning-center/cohere-ai)\n- [Google Gemini](/resources/learning-center/google-gemini)\n- [Meta Llama](/resources/learning-center/meta-llama)\n- [Mistral](/resources/learning-center/mistral-ai)\n- [Mistral 7b](/resources/learning-center/mistral-7b)\n\n[Tools and Topics](/resources/learning-center?category=24)\n\n* * *\n\n- [Fine Tuning LLM](/resources/learning-center/fine-tuning-llm)\n- [Generative AI Apps](/resources/learning-center/generative-ai-applications)\n- [AI Agents](/resources/learning-center/ai-agents)\n- [Claude API](/resources/learning-center/claude-api)\n- [Google Gemini API](/resources/learning-center/google-gemini-api)\n- [LLM Application Development](/resources/learning-center/llm-application-development)\n- [LLM Security](/resources/learning-center/llm-security)\n- [Prompt Engineering](/resources/learning-center/prompt-engineering)\n\n[Use Cases](/resources/learning-center?category=25)\n\n* * *\n\n- [Retrieval Augmented Generation (RAG)](/resources/learning-center/retrieval-augmented-generation)\n- [AI Copilots](/resources/learning-center/ai-copilots)\n- [AI Image Generation](/resources/learning-center/ai-image-generation)\n- [AI Video Generators](/resources/learning-center/ai-video-generators)\n- [AI Summarization](/resources/learning-center/ai-summarization)\n- [Code Interpretation](/resources/learning-center/code-interpreter)\n\n[Explore All Articles](/resources/learning-center)\n\n- [Docs](http://docs.gptscript.ai)\n- [Tools](http://tools.gptscript.ai)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Company](/about-us)\n\n[Try GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release) [Request a Demo](/contact)\n\n![Icon Nav Burger](/assets/icons/iconNavBurger.svg)\n\nBlog\n\n# Introducing GPTScript ...Officially\n\n###### Mar 15, 2024 byDarren Shepherd\n\n![icon link to Reddit](/assets/icons/iconReddit.svg)![icon link to Facebook](/assets/icons/iconFacebook.svg)![icon link to Linkedin](/assets/icons/iconLinkedin.svg)![icon link to 
Twitter](/assets/icons/iconXBlack.svg)\n\nLast month I introduced GPTScript with a [tweet](https://twitter.com/ibuildthecloud/status/1757789265264796003) and it’s been exciting to see how it’s captured people’s imagination. But other than that tweet and the readme on our [GitHub repo](https://github.com/gptscript-ai/gptscript), I never wrote a proper blog introducing GPTScript, which feels like an oversight. I am incredibly excited about the potential of GPTScript, and at Acorn, we've [decided](/blog/our-new-focus-developing-an-llm-app-platform-based-on-gpt-script-technology) to put all of our focus on building solutions based on it going forward.\n\nGPTScript started as an experiment to see what would happen if we took a natural language only approach to programming. Programming by only writing English or your native language like French or Chinese. What came of this effort shocked us. While a completely pure natural language experience is still some ways off, what we did discover is an elegantly simple model of blending AI with traditional systems, data, and code by starting first with natural language and then blending in code.\n\nGPTScript works by writing tools. A tool is either a prompt, which is pure AI, or some code. Tools can then be linked together with AI, AI deciding when the tool should be invoked and with what arguments. Below is a simple tool that is a prompt. That tool is then given the tool “sys.http.html2text” that will download a URL and turn it into readable text.\n\n**_Related Content:_** [A Guide to Prompt Engineering](https://www.acorn.io/resources/learning-center/prompt-engineering).\n\nCreate a file named announce.gpt containing the following:\n\n```yaml\n\ntools: sys.http.html2text\n\ncan you read https://raw.githubusercontent.com/gptscript-ai/gptscript/main/README.md\nand then write an announcement blog for GPTScript\n```\n\nAI will determine that the sys.http.html2text tool needs to be invoked with the referenced URL, will invoke it, get the response, and then complete the prompt.\n\nAs this is an AI project it seems fitting that we should run the above tool with GPTScript and let it announce itself to the world.\n\n```javascript\n\n$ gptscript announce.gpt\n```\n\n* * *\n\n# Introducing GPTScript: Revolutionizing Programming with Natural Language\n\nWe are thrilled to announce the launch of GPTScript, a cutting-edge scripting language designed to redefine the way we interact with Large Language Models (LLMs), particularly with OpenAI's powerful engines (Related Content: check out our [guide to the top LLMs](https://www.acorn.io/resources/learning-center/best-llm)). GPTScript embodies the future of programming by merging the simplicity of natural language with the robustness of traditional scripting, creating a seamless and user-friendly programming experience.\n\n**Why GPTScript?**\n\nIn today's fast-paced digital world, the ability to quickly and efficiently automate tasks, analyze data, and build intelligent applications is more crucial than ever. 
GPTScript rises to this challenge by offering a programming language that is intuitive, versatile, and incredibly easy to learn, even for those without a background in coding.\n\n**Key Features:**\n\n- **Ease of Use:** GPTScript's syntax is primarily based on natural language, making it accessible to a wide range of users.\n- **Versatility:** Whether it's automating tasks, performing data analysis, or integrating with external services, GPTScript is equipped to handle a variety of use cases.\n- **Integration:** GPTScript allows for seamless integration with traditional scripts (e.g., bash, python) and external HTTP services, expanding its capabilities and applications.\n\n**Exciting Use Cases:**\n\nGPTScript opens up a world of possibilities. Here are just a few examples of what you can achieve with GPTScript:\n\n- Automate complex tasks with ease, from planning vacations to managing databases.\n- Create intelligent agents and assistants that can perform a wide range of functions.\n- Conduct sophisticated data analysis and visualization.\n- Develop applications with capabilities in vision, image, and audio processing.\n\n**Get Started with GPTScript:**\n\nReady to dive into GPTScript? Here's how you can get started:\n\n1. **Installation:** GPTScript is available for macOS, Linux, and Windows. You can install it using package managers like Homebrew, Scoop, and WinGet, or download it directly from our [releases page](https://github.com/gptscript-ai/gptscript/releases).\n2. **API Key:** Obtain an API key from [OpenAI](https://platform.openai.com/api-keys) to enable your scripts to communicate with OpenAI's LLMs.\n3. **Hello World:** Test your setup with a simple \"Hello, World!\" script to see GPTScript in action.\n\n**Join Our Community:**\n\nBe part of the GPTScript revolution! Join our [Discord community](https://discord.gg/9sSf4UyAMC) to connect with other GPTScript enthusiasts, share your projects, and get support from the team behind GPTScript.\n\n**About Us:**\n\nGPTScript is brought to you by Acorn Labs, Inc., a team dedicated to pushing the boundaries of AI and programming. Our mission is to make powerful technology accessible to everyone, and GPTScript is a significant step towards achieving that goal.\n\nGet ready to experience programming like never before with GPTScript. Start building, automating, and innovating today!\n\n* * *\n\n- [About Us](/about-us)\n- [Contact Us](/contact )\n- [Tutorials](/resources/tutorials)\n- [Blog](/resources/blog)\n- [Events](/events)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [Open Source](/resources/blog/open-source)\n\n* * *\n\nLoading...\n\nTo unsubscribe at any time please see our [Privacy Policy](/privacy-policy).\n\n* * *\n\nGet Started with GPTScript [Install GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\nCopyright © 2024. All rights reserved. Acorn Labs, Inc.\n\n[Terms of Service](/terms-of-use)\n\n[![social media logo](/assets/icons/ghFooter.svg)](https://github.com/gptscript-ai/gptscript)[![social media logo](/assets/icons/twitterFooter.svg)](https://twitter.com/acornlabs)[![social media logo](/assets/icons/youtubeFooter.svg)](https://www.youtube.com/c/AcornLabs)[![social media logo](/assets/icons/inFooter.svg)](https://www.linkedin.com/company/acorn-io)",
    "tool": "\"website-cleaner\""
  },
  "status": {
    "state": "error",
    "output": "",
    "endTime": "2024-11-12T17:43:48Z",
    "error": "failed to run: unexpected EOF"
  }
}
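
For anyone trying to reproduce the cleaning step outside Otto8: the frame above shows the website-cleaner tool is a single chat completion against `gpt-4o-mini` (temperature 0) with the fixed system prompt and the scraped markdown as the user message. Below is a minimal standalone sketch, not the actual otto8-tools code, assuming the OpenAI Python SDK and an `OPENAI_API_KEY` in the environment:

```python
# Standalone sketch (assumption: OpenAI Python SDK >= 1.0, OPENAI_API_KEY set).
# Parameters are copied from the debug frame above; this is not the otto8-tools code.
from openai import OpenAI

SYSTEM_PROMPT = (
    "The following content is a scraped webpage converted to markdown. "
    "Please remove any content that came from the website header, footer, or navigation. "
    "The output should focus on just the main content body of the page. "
    "Maintain the markdown format, including any links or images."
)

def clean_markdown(scraped_markdown: str) -> str:
    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": scraped_markdown},
        ],
    )
    choice = resp.choices[0]
    # The cleaned output in the frame above stops mid-sentence, so checking
    # finish_reason here helps distinguish a truncated completion from a
    # transport error that would surface as "unexpected EOF".
    print("finish_reason:", choice.finish_reason)
    return choice.message.content or ""
```

If a reproduction like this reports `finish_reason == "length"` for these larger pages, truncation of the cleaning output (rather than a network failure) would be consistent with the output above being cut off mid-sentence.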
sangee2004 commented 2 weeks ago

I am hitting this issue for a few files when ingesting intel.com. The error messages are slightly different in each case.

Agent: https://test.otto8.ai/admin/agents/a1c7bkn

- www.intel.cn/content/www/cn/zh/corporate-responsibility/corporate-responsibility.html.md - failed to clean website content: failed to run: unexpected EOF - https://test.otto8.ai/api/runs/r1kp6qr/debug
- ark.intel.com/content/www/us/en/ark/products/129417/intel-server-system-lsp2d2zs554200.html.md - failed to clean website content: Internal error occurred: context deadline exceeded - https://test.otto8.ai/api/runs/r14kqw5/debug
- www.intel.com/content/www/us/en/products/docs/processors/core-ultra/core-ultra-series-2-mobile-product-brief.html.md - failed to clean website content: failed to run: unexpected EOF - https://test.otto8.ai/api/runs/r1t6shs/debug
- www.intel.com/content/www/us/en/download/12136/intel-processor-identification-utility-windows-version.html.md - failed to clean website content: failed to run: error, The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error - https://test.otto8.ai/api/runs/r1fpdjt/debug
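
If it helps with triage, the debug payloads for these runs can be pulled straight from the `/api/runs/<runID>/debug` endpoints listed above. A rough sketch follows; the bearer-token auth is an assumption, adjust to however the deployment actually authenticates:

```python
# Sketch only: fetches a run's debug payload and prints its final status.
# The Authorization header and OTTO8_TOKEN variable are assumptions, not documented requirements.
import json
import os
import urllib.request

def fetch_run_debug(run_id: str) -> dict:
    url = f"https://test.otto8.ai/api/runs/{run_id}/debug"
    req = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {os.environ['OTTO8_TOKEN']}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Run IDs taken from the failing files listed above.
for run_id in ("r1kp6qr", "r14kqw5", "r1t6shs", "r1fpdjt"):
    debug = fetch_run_debug(run_id)
    print(run_id, debug["status"]["state"], debug["status"].get("error", ""))
```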

sangee2004 commented 2 weeks ago

Encountered "failed to clean website content: failed to run: unexpected EOF" for file - https://ranchermanager.docs.rancher.com/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli when ingesting - https://ranchermanager.docs.rancher.com/ website

Debug logs: https://test.otto8.ai/api/runs/r1zxpn6/debug

{
  "frames": {
    "1731545301": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "",
      "end": "0001-01-01T00:00:00Z",
      "id": "1731545301",
      "input": "[跳到主要内容](#__docusaurus_skipToContent_fallback)\n\n[![logo](/zh/img/rancher-logo-horiz-color.svg)![logo](/zh/img/rancher-logo-horiz-color.svg)](/zh/)\n\n[v2.5](/zh/v2.5)\n\n- [Latest](/zh/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.10 (Preview)](/zh/v2.10/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.9](/zh/v2.9/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.8](/zh/v2.8/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.7](/zh/v2.7/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.6](/zh/v2.6/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.5](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.0-v2.4](/zh/v2.0-v2.4)\n- [All versions](/zh/versions)\n\n简体中文\n\n- [English](/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [简体中文](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n\n搜索\n\nQuick Links\n\n- [GitHub](https://github.com/rancher/rancher)\n- [Docs GitHub](https://github.com/rancher/rancher-docs)\n\nMore from SUSE\n\n- [Rancher](https://www.rancher.com)\n- * * *\n\n- [Elemental](https://elemental.docs.rancher.com/)\n- [Fleet](https://fleet.rancher.io/)\n- [Harvester](https://harvesterhci.io)\n- [Rancher Desktop](https://rancherdesktop.io/)\n- * * *\n\n- [More Projects...](https://opensource.suse.com)\n\n- [Rancher 2.5](/zh/v2.5)\n- [Getting Started](/zh/v2.5/getting-started)\n\n  - [Introduction](/zh/v2.5/getting-started/introduction)\n\n  - [Quick Start Guides](/zh/v2.5/getting-started/quick-start-guides)\n\n    - [Deploying Rancher Server](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager)\n\n      - [Rancher AWS Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/aws)\n      - [Rancher Azure Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/azure)\n      - [Rancher DigitalOcean Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/digitalocean)\n      - [Rancher GCP Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/gcp)\n      - [Rancher Vagrant Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/vagrant)\n      - [Manual Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n    - [Deploying Workloads](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads)\n  - [Installation and Upgrade](/zh/v2.5/getting-started/installation-and-upgrade)\n- [How-to Guides](/zh/v2.5/how-to-guides)\n\n- [Reference Guides](/zh/v2.5/reference-guides)\n\n- [Explanations](/zh/v2.5/explanations)\n\n- [FAQ](/zh/v2.5/faq)\n\n- [Troubleshooting](/zh/v2.5/troubleshooting)\n\n- [Contributing to Rancher](/zh/v2.5/contribute-to-rancher)\n- [Glossary](/zh/v2.5/glossary)\n\n此为 Rancher **v2.5** 版的文档,现已不再积极维护。\n\n最新的文档请参阅 **[最新版本](/zh/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)** (Latest)。\n\n- [主页面](/zh/)\n- [Getting Started](/zh/v2.5/getting-started)\n- [Quick Start Guides](/zh/v2.5/getting-started/quick-start-guides)\n- [Deploying Rancher Server](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager)\n- Manual Quick Start\n\n版本:v2.5\n\n本页总览\n\n# Manual Quick Start\n\nHowdy Partner! 
This tutorial walks you through:\n\n- Installation of Rancher 2.x\n- Creation of your first cluster\n- Deployment of an application, Nginx\n\n\u003E **Note:** The intent of these guides is to quickly launch a sandbox that you can use to evaluate Rancher. These guides are not intended for production environments. For comprehensive setup instructions, see [Installation](/zh/v2.5/getting-started/installation-and-upgrade).\n\n### 1\\. Provision a Linux Host [​](\\#1-provision-a-linux-host \"标题的直接链接\")\n\nBegin creation of a custom cluster by provisioning a Linux host. Your host can be:\n\n- A cloud-host virtual machine (VM)\n\n- An on-prem VM\n\n- A bare-metal server\n\n\n\u003E **Note:**\n\u003E When using a cloud-hosted virtual machine you need to allow inbound TCP communication to ports 80 and 443. Please see your cloud-host's documentation for information regarding port configuration.\n\u003E\n\u003E For a full list of port requirements, refer to [Docker Installation](/zh/v2.5/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/node-requirements-for-rancher-managed-clusters).\n\n\nProvision the host according to our [Requirements](/zh/v2.5/getting-started/installation-and-upgrade/installation-requirements).\n\n\n### 2\\. Install Rancher [​](\\#2-install-rancher \"标题的直接链接\")\n\nTo install Rancher on your host, connect to it and then use a shell to install.\n\n1. Log in to your Linux host using your preferred shell, such as PuTTy or a remote Terminal connection.\n\n2. From your shell, enter the following command:\n\n\n\n\n\n```codeBlockLines_e6Vv\nsudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 --privileged rancher/rancher\n\n```\n\n\n**Result:** Rancher is installed.\n\n### 3\\. Log In [​](\\#3-log-in \"标题的直接链接\")\n\nLog in to Rancher to begin using the application. After you log in, you'll make some one-time configurations.\n\n1. Open a web browser and enter the IP address of your host: `https://\u003CSERVER_IP\u003E`.\n\nReplace `\u003CSERVER_IP\u003E` with your host IP address.\n\n2. When prompted, create a password for the default `admin` account there cowpoke!\n\n3. Set the **Default View**.\n\n\n- If `I want to create or manage multiple clusters` is selected, the Cluster Manager UI is used as the default view.\n- If `I'm only going to use the cluster Rancher was installed on` is selected, the Cluster Explorer UI is used as the default view.\n\n1. Set the **Rancher Server URL**. The URL can either be an IP address or a host name. However, each node added to your cluster must be able to connect to this URL.\n\n\n\nIf you use a hostname in the URL, this hostname must be resolvable by DNS on the nodes you want to add to you cluster.\n\n### 4\\. Create the Cluster [​](\\#4-create-the-cluster \"标题的直接链接\")\n\nWelcome to Rancher! You are now able to create your first Kubernetes cluster.\n\nIn this task, you can use the versatile **Custom** option. This option lets you add _any_ Linux host (cloud-hosted VM, on-prem VM, or bare-metal) to be used in a cluster.\n\n01. If you chose `I'm only going to use the cluster Rancher was installed on` when setting the default view, click the **Cluster Manager** button in the upper-right of the UI to access the **Clusters** page.\n\n02. From the **Clusters** page, click **Add Cluster**.\n\n03. Choose **Existing Nodes**.\n\n04. Enter a **Cluster Name**.\n\n05. Skip **Member Roles** and **Cluster Options**. We'll tell you about them later.\n\n06. Click **Next**.\n\n07. 
From **Node Role**, select _all_ the roles: **etcd**, **Control**, and **Worker**.\n\n08. **Optional**: Rancher auto-detects the IP addresses used for Rancher communication and cluster communication. You can override these using `Public Address` and `Internal Address` in the **Node Address** section.\n\n09. Skip the **Labels** stuff. It's not important for now.\n\n10. Copy the command displayed on screen to your clipboard.\n\n11. Log in to your Linux host using your preferred shell, such as PuTTy or a remote Terminal connection. Run the command copied to your clipboard.\n\n12. When you finish running the command on your Linux host, click **Done**.\n\n\n**Result:**\n\nYour cluster is created and assigned a state of **Provisioning.** Rancher is standing up your cluster.\n\nYou can access your cluster after its state is updated to **Active.**\n\n**Active** clusters are assigned two Projects:\n\n- `Default`, containing the `default` namespace\n- `System`, containing the `cattle-system`, `ingress-nginx`, `kube-public`, and `kube-system` namespaces\n\n#### Finished [​](\\#finished \"标题的直接链接\")\n\nCongratulations! You have created your first cluster.\n\n#### What's Next? [​](\\#whats-next \"标题的直接链接\")\n\nUse Rancher to create a deployment. For more information, see [Creating Deployments](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads).\n\n[编辑此页](https://github.com/rancher/rancher-docs/edit/main/versioned_docs/version-2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli.md)\n\n最后于 **2024年1月29日** 更新\n\n[上一页\\\n\\\nRancher Vagrant Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/vagrant) [下一页\\\n\\\nDeploying Workloads](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads)\n\n- [1\\. Provision a Linux Host](#1-provision-a-linux-host)\n- [2\\. Install Rancher](#2-install-rancher)\n- [3\\. Log In](#3-log-in)\n- [4\\. Create the Cluster](#4-create-the-cluster)\n\nCopyright © 2024 SUSE Rancher. All Rights Reserved.",
      "inputContext": null,
      "llmRequest": {
        "chatCompletion": {
          "messages": [
            {
              "content": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
              "role": "system"
            },
            {
              "content": "[跳到主要内容](#__docusaurus_skipToContent_fallback)\n\n[![logo](/zh/img/rancher-logo-horiz-color.svg)![logo](/zh/img/rancher-logo-horiz-color.svg)](/zh/)\n\n[v2.5](/zh/v2.5)\n\n- [Latest](/zh/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.10 (Preview)](/zh/v2.10/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.9](/zh/v2.9/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.8](/zh/v2.8/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.7](/zh/v2.7/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.6](/zh/v2.6/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.5](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.0-v2.4](/zh/v2.0-v2.4)\n- [All versions](/zh/versions)\n\n简体中文\n\n- [English](/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [简体中文](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n\n搜索\n\nQuick Links\n\n- [GitHub](https://github.com/rancher/rancher)\n- [Docs GitHub](https://github.com/rancher/rancher-docs)\n\nMore from SUSE\n\n- [Rancher](https://www.rancher.com)\n- * * *\n\n- [Elemental](https://elemental.docs.rancher.com/)\n- [Fleet](https://fleet.rancher.io/)\n- [Harvester](https://harvesterhci.io)\n- [Rancher Desktop](https://rancherdesktop.io/)\n- * * *\n\n- [More Projects...](https://opensource.suse.com)\n\n- [Rancher 2.5](/zh/v2.5)\n- [Getting Started](/zh/v2.5/getting-started)\n\n  - [Introduction](/zh/v2.5/getting-started/introduction)\n\n  - [Quick Start Guides](/zh/v2.5/getting-started/quick-start-guides)\n\n    - [Deploying Rancher Server](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager)\n\n      - [Rancher AWS Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/aws)\n      - [Rancher Azure Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/azure)\n      - [Rancher DigitalOcean Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/digitalocean)\n      - [Rancher GCP Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/gcp)\n      - [Rancher Vagrant Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/vagrant)\n      - [Manual Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n    - [Deploying Workloads](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads)\n  - [Installation and Upgrade](/zh/v2.5/getting-started/installation-and-upgrade)\n- [How-to Guides](/zh/v2.5/how-to-guides)\n\n- [Reference Guides](/zh/v2.5/reference-guides)\n\n- [Explanations](/zh/v2.5/explanations)\n\n- [FAQ](/zh/v2.5/faq)\n\n- [Troubleshooting](/zh/v2.5/troubleshooting)\n\n- [Contributing to Rancher](/zh/v2.5/contribute-to-rancher)\n- [Glossary](/zh/v2.5/glossary)\n\n此为 Rancher **v2.5** 版的文档,现已不再积极维护。\n\n最新的文档请参阅 **[最新版本](/zh/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)** (Latest)。\n\n- [主页面](/zh/)\n- [Getting Started](/zh/v2.5/getting-started)\n- [Quick Start Guides](/zh/v2.5/getting-started/quick-start-guides)\n- [Deploying Rancher Server](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager)\n- Manual Quick Start\n\n版本:v2.5\n\n本页总览\n\n# Manual Quick Start\n\nHowdy Partner! 
This tutorial walks you through:\n\n- Installation of Rancher 2.x\n- Creation of your first cluster\n- Deployment of an application, Nginx\n\n\u003E **Note:** The intent of these guides is to quickly launch a sandbox that you can use to evaluate Rancher. These guides are not intended for production environments. For comprehensive setup instructions, see [Installation](/zh/v2.5/getting-started/installation-and-upgrade).\n\n### 1\\. Provision a Linux Host [​](\\#1-provision-a-linux-host \"标题的直接链接\")\n\nBegin creation of a custom cluster by provisioning a Linux host. Your host can be:\n\n- A cloud-host virtual machine (VM)\n\n- An on-prem VM\n\n- A bare-metal server\n\n\n\u003E **Note:**\n\u003E When using a cloud-hosted virtual machine you need to allow inbound TCP communication to ports 80 and 443. Please see your cloud-host's documentation for information regarding port configuration.\n\u003E\n\u003E For a full list of port requirements, refer to [Docker Installation](/zh/v2.5/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/node-requirements-for-rancher-managed-clusters).\n\n\nProvision the host according to our [Requirements](/zh/v2.5/getting-started/installation-and-upgrade/installation-requirements).\n\n\n### 2\\. Install Rancher [​](\\#2-install-rancher \"标题的直接链接\")\n\nTo install Rancher on your host, connect to it and then use a shell to install.\n\n1. Log in to your Linux host using your preferred shell, such as PuTTy or a remote Terminal connection.\n\n2. From your shell, enter the following command:\n\n\n\n\n\n```codeBlockLines_e6Vv\nsudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 --privileged rancher/rancher\n\n```\n\n\n**Result:** Rancher is installed.\n\n### 3\\. Log In [​](\\#3-log-in \"标题的直接链接\")\n\nLog in to Rancher to begin using the application. After you log in, you'll make some one-time configurations.\n\n1. Open a web browser and enter the IP address of your host: `https://\u003CSERVER_IP\u003E`.\n\nReplace `\u003CSERVER_IP\u003E` with your host IP address.\n\n2. When prompted, create a password for the default `admin` account there cowpoke!\n\n3. Set the **Default View**.\n\n\n- If `I want to create or manage multiple clusters` is selected, the Cluster Manager UI is used as the default view.\n- If `I'm only going to use the cluster Rancher was installed on` is selected, the Cluster Explorer UI is used as the default view.\n\n1. Set the **Rancher Server URL**. The URL can either be an IP address or a host name. However, each node added to your cluster must be able to connect to this URL.\n\n\n\nIf you use a hostname in the URL, this hostname must be resolvable by DNS on the nodes you want to add to you cluster.\n\n### 4\\. Create the Cluster [​](\\#4-create-the-cluster \"标题的直接链接\")\n\nWelcome to Rancher! You are now able to create your first Kubernetes cluster.\n\nIn this task, you can use the versatile **Custom** option. This option lets you add _any_ Linux host (cloud-hosted VM, on-prem VM, or bare-metal) to be used in a cluster.\n\n01. If you chose `I'm only going to use the cluster Rancher was installed on` when setting the default view, click the **Cluster Manager** button in the upper-right of the UI to access the **Clusters** page.\n\n02. From the **Clusters** page, click **Add Cluster**.\n\n03. Choose **Existing Nodes**.\n\n04. Enter a **Cluster Name**.\n\n05. Skip **Member Roles** and **Cluster Options**. We'll tell you about them later.\n\n06. Click **Next**.\n\n07. 
From **Node Role**, select _all_ the roles: **etcd**, **Control**, and **Worker**.\n\n08. **Optional**: Rancher auto-detects the IP addresses used for Rancher communication and cluster communication. You can override these using `Public Address` and `Internal Address` in the **Node Address** section.\n\n09. Skip the **Labels** stuff. It's not important for now.\n\n10. Copy the command displayed on screen to your clipboard.\n\n11. Log in to your Linux host using your preferred shell, such as PuTTy or a remote Terminal connection. Run the command copied to your clipboard.\n\n12. When you finish running the command on your Linux host, click **Done**.\n\n\n**Result:**\n\nYour cluster is created and assigned a state of **Provisioning.** Rancher is standing up your cluster.\n\nYou can access your cluster after its state is updated to **Active.**\n\n**Active** clusters are assigned two Projects:\n\n- `Default`, containing the `default` namespace\n- `System`, containing the `cattle-system`, `ingress-nginx`, `kube-public`, and `kube-system` namespaces\n\n#### Finished [​](\\#finished \"标题的直接链接\")\n\nCongratulations! You have created your first cluster.\n\n#### What's Next? [​](\\#whats-next \"标题的直接链接\")\n\nUse Rancher to create a deployment. For more information, see [Creating Deployments](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads).\n\n[编辑此页](https://github.com/rancher/rancher-docs/edit/main/versioned_docs/version-2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli.md)\n\n最后于 **2024年1月29日** 更新\n\n[上一页\\\n\\\nRancher Vagrant Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/vagrant) [下一页\\\n\\\nDeploying Workloads](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads)\n\n- [1\\. Provision a Linux Host](#1-provision-a-linux-host)\n- [2\\. Install Rancher](#2-install-rancher)\n- [3\\. Log In](#3-log-in)\n- [4\\. Create the Cluster](#4-create-the-cluster)\n\nCopyright © 2024 SUSE Rancher. All Rights Reserved.",
              "role": "user"
            }
          ],
          "model": "gpt-4o-mini",
          "temperature": 0
        },
        "toolMapping": {

        }
      },
      "llmResponse": null,
      "output": [
        {
          "content": "# Manual Quick Start\n\nHowdy Partner! This tutorial walks you through:\n\n- Installation of Rancher 2.x\n- Creation of your first cluster\n- Deployment of an application, Nginx\n\n\u003E **Note:** The intent of these guides is to quickly launch a sandbox that you can use to evaluate Rancher. These guides are not intended for production environments. For comprehensive setup instructions, see [Installation](/zh/v2.5/getting-started/installation-and-upgrade).\n\n### 1\\. Provision a Linux Host\n\nBegin creation of a custom cluster by provisioning a Linux host. Your host can be:\n\n- A cloud-host virtual machine (VM)\n- An on-prem VM\n- A bare-metal server\n\n\u003E **Note:**\n\u003E When using a cloud-hosted virtual machine you need to allow inbound TCP communication to ports 80 and 443. Please see your cloud-host's documentation for information regarding port configuration.\n\u003E\n\u003E For a full list of port requirements, refer to [Docker Installation](/zh/v2.5/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/node-requirements-for-rancher-managed-clusters).\n\nProvision the host according to our [Requirements](/zh/v2.5/getting-started/installation-and-upgrade/installation-requirements).\n\n### 2\\. Install Rancher\n\nTo install Rancher on your host, connect to it and then use a shell to install.\n\n1. Log in to your Linux host using your preferred shell, such as PuTTy or a remote Terminal connection.\n\n2. From your shell, enter the following command:\n\n   ```bash\n   sudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 --privileged rancher/rancher\n   ```\n\n**Result:** Rancher is installed.\n\n### 3\\. Log In\n\nLog in to Rancher to begin using the application. After you log in, you'll make some one-time configurations.\n\n1. Open a web browser and enter the IP address of your host: `https://\u003CSERVER_IP\u003E`.\n\n   Replace `\u003CSERVER_IP\u003E` with your host IP address.\n\n2. When prompted, create a password for the default `admin` account.\n\n3. Set the **Default View**.\n\n   - If `I want to create or manage multiple clusters` is selected, the Cluster Manager UI is used as the default view.\n   - If `I'm only going to use the cluster Rancher was installed on` is selected, the Cluster Explorer UI is used as the default view.\n\n4. Set the **Rancher Server URL**. The URL can either be an IP address or a host name. However, each node added to your cluster must be able to connect to this URL.\n\n   If you use a hostname in the URL, this hostname must be resolvable by DNS on the nodes you want to add to your cluster.\n\n### 4\\. Create the Cluster\n\nWelcome to Rancher! You are now able to create your first Kubernetes cluster.\n\nIn this task, you can use the versatile **Custom** option. This option lets you add _any_ Linux host (cloud-hosted VM, on-prem VM, or bare-metal) to be used in a cluster.\n\n1. If you chose `I'm only going to use the cluster Rancher was installed on` when setting the default view, click the **Cluster Manager** button in the upper-right of the UI to access the **Clusters** page.\n\n2. From the **Clusters** page, click **Add Cluster**.\n\n3. Choose **Existing Nodes**.\n\n4. Enter a **Cluster Name**.\n\n5. Skip **Member Roles** and **Cluster Options**. We'll tell you about them later.\n\n6. Click **Next**.\n\n7. From **Node Role**, select _all_ the roles: **etcd**, **Control**, and **Worker**.\n\n8. 
**Optional**: Rancher auto-detects the IP addresses used for Rancher communication and cluster communication. You can override these using `Public Address` and `Internal Address` in the **Node Address** section.\n\n9. Skip the **Labels** stuff. It's not important for now.\n\n10. Copy the command displayed on screen to your clipboard.\n\n11. Log in to your Linux host using your preferred shell, such as PuTTy or a remote Terminal connection. Run the command copied to your clipboard.\n\n12. When you finish running the command on your Linux host, click **Done**.\n\n**Result:**\n\nYour cluster is created and assigned a state of **Provisioning.** Rancher is standing up your cluster.\n\nYou can access your cluster after its state is updated to **Active.**\n\n**Active** clusters are assigned two Projects:\n\n- `Default`, containing the `default` namespace\n- `System`, containing the `cattle-system`, `ingress-nginx`, `kube-public`, and",
          "subCalls": null
        }
      ],
      "start": "2024-11-13T22:30:25.826896216Z",
      "tool": {
        "description": "Removes extra header, footer, and navigation content from the markdown version of webpages",
        "id": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner",
        "instructions": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
        "internalPrompt": null,
        "localTools": {
          "website markdown content cleaner": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner"
        },
        "modelName": "gpt-4o-mini",
        "name": "Website Markdown Content Cleaner",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/website-cleaner/tool.gpt"
        },
        "workingDir": "/otto8-tools/website-cleaner"
      },
      "toolResults": 0,
      "type": "callProgress",
      "usage": {

      }
    },
    "1731545302": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "Running sys.daemon",
      "end": "2024-11-13T22:30:25.827336994Z",
      "id": "1731545302",
      "input": "",
      "inputContext": null,
      "llmRequest": null,
      "llmResponse": null,
      "output": [
        {
          "content": "http://127.0.0.1:10712",
          "subCalls": null
        }
      ],
      "start": "2024-11-13T22:30:25.827226207Z",
      "tool": {
        "description": "Model provider for Otto8",
        "id": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8",
        "instructions": "#!sys.daemon /usr/bin/env python3 ${GPTSCRIPT_TOOL_DIR}/main.py",
        "internalPrompt": null,
        "localTools": {
          "otto8": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8"
        },
        "modelName": "gpt-4o",
        "modelProvider": true,
        "name": "Otto8",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/otto8-model-provider/tool.gpt"
        },
        "workingDir": "/otto8-tools/otto8-model-provider"
      },
      "toolCategory": "provider",
      "toolResults": 0,
      "type": "callFinish",
      "usage": {

      }
    }
  },
  "spec": {
    "synchronous": true,
    "threadName": "t1-ks17j8xj",
    "input": "[跳到主要内容](#__docusaurus_skipToContent_fallback)\n\n[![logo](/zh/img/rancher-logo-horiz-color.svg)![logo](/zh/img/rancher-logo-horiz-color.svg)](/zh/)\n\n[v2.5](/zh/v2.5)\n\n- [Latest](/zh/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.10 (Preview)](/zh/v2.10/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.9](/zh/v2.9/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.8](/zh/v2.8/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.7](/zh/v2.7/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.6](/zh/v2.6/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.5](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [v2.0-v2.4](/zh/v2.0-v2.4)\n- [All versions](/zh/versions)\n\n简体中文\n\n- [English](/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n- [简体中文](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n\n搜索\n\nQuick Links\n\n- [GitHub](https://github.com/rancher/rancher)\n- [Docs GitHub](https://github.com/rancher/rancher-docs)\n\nMore from SUSE\n\n- [Rancher](https://www.rancher.com)\n- * * *\n\n- [Elemental](https://elemental.docs.rancher.com/)\n- [Fleet](https://fleet.rancher.io/)\n- [Harvester](https://harvesterhci.io)\n- [Rancher Desktop](https://rancherdesktop.io/)\n- * * *\n\n- [More Projects...](https://opensource.suse.com)\n\n- [Rancher 2.5](/zh/v2.5)\n- [Getting Started](/zh/v2.5/getting-started)\n\n  - [Introduction](/zh/v2.5/getting-started/introduction)\n\n  - [Quick Start Guides](/zh/v2.5/getting-started/quick-start-guides)\n\n    - [Deploying Rancher Server](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager)\n\n      - [Rancher AWS Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/aws)\n      - [Rancher Azure Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/azure)\n      - [Rancher DigitalOcean Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/digitalocean)\n      - [Rancher GCP Quick Start Guide](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/gcp)\n      - [Rancher Vagrant Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/vagrant)\n      - [Manual Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)\n    - [Deploying Workloads](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads)\n  - [Installation and Upgrade](/zh/v2.5/getting-started/installation-and-upgrade)\n- [How-to Guides](/zh/v2.5/how-to-guides)\n\n- [Reference Guides](/zh/v2.5/reference-guides)\n\n- [Explanations](/zh/v2.5/explanations)\n\n- [FAQ](/zh/v2.5/faq)\n\n- [Troubleshooting](/zh/v2.5/troubleshooting)\n\n- [Contributing to Rancher](/zh/v2.5/contribute-to-rancher)\n- [Glossary](/zh/v2.5/glossary)\n\n此为 Rancher **v2.5** 版的文档,现已不再积极维护。\n\n最新的文档请参阅 **[最新版本](/zh/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli)** (Latest)。\n\n- [主页面](/zh/)\n- [Getting Started](/zh/v2.5/getting-started)\n- [Quick Start Guides](/zh/v2.5/getting-started/quick-start-guides)\n- [Deploying Rancher Server](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager)\n- Manual Quick Start\n\n版本:v2.5\n\n本页总览\n\n# Manual Quick Start\n\nHowdy Partner! 
This tutorial walks you through:\n\n- Installation of Rancher 2.x\n- Creation of your first cluster\n- Deployment of an application, Nginx\n\n\u003E **Note:** The intent of these guides is to quickly launch a sandbox that you can use to evaluate Rancher. These guides are not intended for production environments. For comprehensive setup instructions, see [Installation](/zh/v2.5/getting-started/installation-and-upgrade).\n\n### 1\\. Provision a Linux Host [​](\\#1-provision-a-linux-host \"标题的直接链接\")\n\nBegin creation of a custom cluster by provisioning a Linux host. Your host can be:\n\n- A cloud-host virtual machine (VM)\n\n- An on-prem VM\n\n- A bare-metal server\n\n\n\u003E **Note:**\n\u003E When using a cloud-hosted virtual machine you need to allow inbound TCP communication to ports 80 and 443. Please see your cloud-host's documentation for information regarding port configuration.\n\u003E\n\u003E For a full list of port requirements, refer to [Docker Installation](/zh/v2.5/how-to-guides/new-user-guides/kubernetes-clusters-in-rancher-setup/node-requirements-for-rancher-managed-clusters).\n\n\nProvision the host according to our [Requirements](/zh/v2.5/getting-started/installation-and-upgrade/installation-requirements).\n\n\n### 2\\. Install Rancher [​](\\#2-install-rancher \"标题的直接链接\")\n\nTo install Rancher on your host, connect to it and then use a shell to install.\n\n1. Log in to your Linux host using your preferred shell, such as PuTTy or a remote Terminal connection.\n\n2. From your shell, enter the following command:\n\n\n\n\n\n```codeBlockLines_e6Vv\nsudo docker run -d --restart=unless-stopped -p 80:80 -p 443:443 --privileged rancher/rancher\n\n```\n\n\n**Result:** Rancher is installed.\n\n### 3\\. Log In [​](\\#3-log-in \"标题的直接链接\")\n\nLog in to Rancher to begin using the application. After you log in, you'll make some one-time configurations.\n\n1. Open a web browser and enter the IP address of your host: `https://\u003CSERVER_IP\u003E`.\n\nReplace `\u003CSERVER_IP\u003E` with your host IP address.\n\n2. When prompted, create a password for the default `admin` account there cowpoke!\n\n3. Set the **Default View**.\n\n\n- If `I want to create or manage multiple clusters` is selected, the Cluster Manager UI is used as the default view.\n- If `I'm only going to use the cluster Rancher was installed on` is selected, the Cluster Explorer UI is used as the default view.\n\n1. Set the **Rancher Server URL**. The URL can either be an IP address or a host name. However, each node added to your cluster must be able to connect to this URL.\n\n\n\nIf you use a hostname in the URL, this hostname must be resolvable by DNS on the nodes you want to add to you cluster.\n\n### 4\\. Create the Cluster [​](\\#4-create-the-cluster \"标题的直接链接\")\n\nWelcome to Rancher! You are now able to create your first Kubernetes cluster.\n\nIn this task, you can use the versatile **Custom** option. This option lets you add _any_ Linux host (cloud-hosted VM, on-prem VM, or bare-metal) to be used in a cluster.\n\n01. If you chose `I'm only going to use the cluster Rancher was installed on` when setting the default view, click the **Cluster Manager** button in the upper-right of the UI to access the **Clusters** page.\n\n02. From the **Clusters** page, click **Add Cluster**.\n\n03. Choose **Existing Nodes**.\n\n04. Enter a **Cluster Name**.\n\n05. Skip **Member Roles** and **Cluster Options**. We'll tell you about them later.\n\n06. Click **Next**.\n\n07. 
From **Node Role**, select _all_ the roles: **etcd**, **Control**, and **Worker**.\n\n08. **Optional**: Rancher auto-detects the IP addresses used for Rancher communication and cluster communication. You can override these using `Public Address` and `Internal Address` in the **Node Address** section.\n\n09. Skip the **Labels** stuff. It's not important for now.\n\n10. Copy the command displayed on screen to your clipboard.\n\n11. Log in to your Linux host using your preferred shell, such as PuTTy or a remote Terminal connection. Run the command copied to your clipboard.\n\n12. When you finish running the command on your Linux host, click **Done**.\n\n\n**Result:**\n\nYour cluster is created and assigned a state of **Provisioning.** Rancher is standing up your cluster.\n\nYou can access your cluster after its state is updated to **Active.**\n\n**Active** clusters are assigned two Projects:\n\n- `Default`, containing the `default` namespace\n- `System`, containing the `cattle-system`, `ingress-nginx`, `kube-public`, and `kube-system` namespaces\n\n#### Finished [​](\\#finished \"标题的直接链接\")\n\nCongratulations! You have created your first cluster.\n\n#### What's Next? [​](\\#whats-next \"标题的直接链接\")\n\nUse Rancher to create a deployment. For more information, see [Creating Deployments](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads).\n\n[编辑此页](https://github.com/rancher/rancher-docs/edit/main/versioned_docs/version-2.5/getting-started/quick-start-guides/deploy-rancher-manager/helm-cli.md)\n\n最后于 **2024年1月29日** 更新\n\n[上一页\\\n\\\nRancher Vagrant Quick Start](/zh/v2.5/getting-started/quick-start-guides/deploy-rancher-manager/vagrant) [下一页\\\n\\\nDeploying Workloads](/zh/v2.5/getting-started/quick-start-guides/deploy-workloads)\n\n- [1\\. Provision a Linux Host](#1-provision-a-linux-host)\n- [2\\. Install Rancher](#2-install-rancher)\n- [3\\. Log In](#3-log-in)\n- [4\\. Create the Cluster](#4-create-the-cluster)\n\nCopyright © 2024 SUSE Rancher. All Rights Reserved.",
    "tool": "\"website-cleaner\""
  },
  "status": {
    "state": "error",
    "output": "",
    "endTime": "2024-11-13T22:30:43Z",
    "error": "failed to run: unexpected EOF"
  }
}
iwilltry42 commented 1 week ago

I found the following runs in the database of the Otto Test instance that all have the same error:

  1. https://test.otto8.ai/api/runs/r1w6chk/debug
  2. https://test.otto8.ai/api/runs/r1lbjzp/debug
  3. https://test.otto8.ai/api/runs/r169f49/debug
  4. https://test.otto8.ai/api/runs/r1qdrwk/debug
  5. https://test.otto8.ai/api/runs/r1kp6qr/debug
  6. https://test.otto8.ai/api/runs/r1t6shs/debug
  7. https://test.otto8.ai/api/runs/r1zxpn6/debug

So far I couldn't find any similarities between them (different content, different sources, different times/days).

Time-wise correlated logs:

2024/11/08 01:45:32 httputil: ReverseProxy read error during body copy: stream error: stream ID 25; INTERNAL_ERROR; received from peer
2024/11/08 21:01:07 httputil: ReverseProxy read error during body copy: stream error: stream ID 557; INTERNAL_ERROR; received from peer
2024/11/08 21:39:01 httputil: ReverseProxy read error during body copy: stream error: stream ID 2649; INTERNAL_ERROR; received from peer
2024/11/12 00:34:15 httputil: ReverseProxy read error during body copy: stream error: stream ID 1337; INTERNAL_ERROR; received from peer
2024/11/12 17:43:37 httputil: ReverseProxy read error during body copy: stream error: stream ID 25; INTERNAL_ERROR; received from peer
2024/11/12 17:45:56 httputil: ReverseProxy read error during body copy: stream error: stream ID 261; INTERNAL_ERROR; received from peer
2024/11/12 18:20:29 httputil: ReverseProxy read error during body copy: stream error: stream ID 461; INTERNAL_ERROR; received from peer
2024/11/12 18:22:56 httputil: ReverseProxy read error during body copy: stream error: stream ID 1945; INTERNAL_ERROR; received from peer
2024/11/13 22:30:43 httputil: ReverseProxy read error during body copy: stream error: stream ID 821; INTERNAL_ERROR; received from peer

All of those lead up to:

time="2024-11-12T18:22:59Z" level=error msg="failed to save state: failed to run: unexpected EOF\nfailed to run: unexpected EOF" logger=/app/pkg/invoke/invoker.go
time="2024-11-12T18:22:59Z" level=error msg="run failed: failed to run: unexpected EOF\nfailed to run: unexpected EOF" 

We also see these types of errors occasionally during smoke tests, and they're also being seen out in the wild (https://github.com/sashabaranov/go-openai/issues/332 and https://github.com/openai/openai-python/issues/399).

Our best bet is probably adding retries in the chat-completion client, but that's up for discussion.
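For illustration, here's a minimal sketch of what client-side retries could look like in Go. The `completionFn` type, the helper names, and the error classification are assumptions made up for this sketch, not the actual gptscript/otto8 client API:

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"time"
)

// completionFn stands in for whatever method the chat-completion client
// actually exposes; the signature is illustrative only.
type completionFn func(ctx context.Context, prompt string) (string, error)

// isTransient reports whether an error looks like a dropped stream
// (the "unexpected EOF" seen in these runs) rather than a permanent failure.
func isTransient(err error) bool {
	return errors.Is(err, io.ErrUnexpectedEOF) || errors.Is(err, io.EOF)
}

// completeWithRetry retries the call with a simple linear backoff,
// giving up immediately on non-transient errors.
func completeWithRetry(ctx context.Context, call completionFn, prompt string, attempts int) (string, error) {
	var lastErr error
	for i := 0; i < attempts; i++ {
		out, err := call(ctx, prompt)
		if err == nil {
			return out, nil
		}
		if !isTransient(err) {
			return "", err // permanent error: don't retry
		}
		lastErr = err
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case <-time.After(time.Duration(i+1) * time.Second):
		}
	}
	return "", fmt.Errorf("completion failed after %d attempts: %w", attempts, lastErr)
}

func main() {
	// Fake client for demonstration: fails with an unexpected EOF once, then succeeds.
	calls := 0
	fake := func(ctx context.Context, prompt string) (string, error) {
		calls++
		if calls == 1 {
			return "", io.ErrUnexpectedEOF
		}
		return "cleaned markdown for: " + prompt, nil
	}

	out, err := completeWithRetry(context.Background(), fake, "example page", 3)
	fmt.Println(out, err)
}
```

In the real client this would wrap whatever method actually issues the chat-completion request, and the transient-error check would likely also need to cover the HTTP/2 stream resets (`INTERNAL_ERROR`) surfacing in the proxy logs above.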

Related slack thread: https://acorn-io.slack.com/archives/C07FZ46QA2J/p1731592146445789?thread_ts=1731568028.313189&cid=C07FZ46QA2J

iwilltry42 commented 1 week ago

Donnie just merged some changes that hopefully address this. Please retest :)

sangee2004 commented 1 week ago

Tested with version

"github.com/otto8-ai/tools": "c47df03a1857be27eb23512acbe48b29368ba555",
  "otto": "v0.0.0-dev+fdad35a6"

Able to reproduce the issue.

Ingested knowledge files from https://nginx.org/en/docs

Ingestion of a few files failed with errors.

One of the ingestion failures was `failed to clean website content: failed to run: failed calling model for completion: unexpected EOF`.

Agent that encountered this error - https://test.otto8.ai/admin/agents/a1x5b6w

Debug logs - https://test.otto8.ai/api/runs/r1hj5vt/debug

{
  "frames": {
    "1731700862": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "",
      "end": "0001-01-01T00:00:00Z",
      "id": "1731700862",
      "input": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_realip_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_realip\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[set\\_real\\_ip\\_from](#set_real_ip_from)\n\n[real\\_ip\\_header](#real_ip_header)\n\n[real\\_ip\\_recursive](#real_ip_recursive)\n\n[Embedded Variables](#variables)\n\nThe `ngx_http_realip_module` module is used\nto change the client address and optional port\nto those sent in the specified header field.\n\nThis module is not built by default, it should be enabled with the\n`--with-http_realip_module`\nconfiguration parameter.\n\n#### Example Configuration\n\n\u003E ```\n\u003E set_real_ip_from  192.168.1.0/24;\n\u003E set_real_ip_from  192.168.2.1;\n\u003E set_real_ip_from  2001:0db8::/32;\n\u003E real_ip_header    X-Forwarded-For;\n\u003E real_ip_recursive on;\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`set_real_ip_from\n    address |\n    CIDR |\n    unix:;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nDefines trusted addresses that are known to send correct\nreplacement addresses.\nIf the special value `unix:` is specified,\nall UNIX-domain sockets will be trusted.\nTrusted addresses may also be specified using a hostname (1.13.1).\n\n\u003E IPv6 addresses are supported starting from versions 1.3.0 and 1.2.1.\n\nSyntax:\n`real_ip_header\n    field |\n    X-Real-IP |\n    X-Forwarded-For |\n    proxy_protocol;`\n\nDefault:\n\n\n```\nreal_ip_header X-Real-IP;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines the request header field\nwhose value will be used to replace the client address.\n\nThe request header field value that contains an optional port\nis also used to replace the client port (1.11.0).\nThe address and port should be specified according to\n[RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).\n\nThe `proxy_protocol` parameter (1.5.12) changes\nthe client address to the one from the PROXY protocol header.\nThe PROXY protocol must be previously enabled by setting the\n`proxy_protocol` parameter\nin the [listen](ngx_http_core_module.html#listen) directive.\n\nSyntax:\n`real_ip_recursive on | off;`\n\nDefault:\n\n\n```\nreal_ip_recursive off;\n```\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in versions 1.3.0 and 1.2.1.\n\n\n\nIf recursive search is disabled, the original client address that\nmatches one of the trusted addresses is replaced by the last\naddress sent in the request header field defined by the\n[real\\_ip\\_header](#real_ip_header) directive.\nIf recursive search is enabled, the original client address that\nmatches one of the trusted addresses is replaced by the last\nnon-trusted address sent in the request header field.\n\n#### Embedded 
Variables\n\n`$realip_remote_addr`\nkeeps the original client address (1.9.7)\n`$realip_remote_port`\nkeeps the original client port (1.11.0)",
      "inputContext": null,
      "llmRequest": {
        "chatCompletion": {
          "messages": [
            {
              "content": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
              "role": "system"
            },
            {
              "content": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_realip_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_realip\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[set\\_real\\_ip\\_from](#set_real_ip_from)\n\n[real\\_ip\\_header](#real_ip_header)\n\n[real\\_ip\\_recursive](#real_ip_recursive)\n\n[Embedded Variables](#variables)\n\nThe `ngx_http_realip_module` module is used\nto change the client address and optional port\nto those sent in the specified header field.\n\nThis module is not built by default, it should be enabled with the\n`--with-http_realip_module`\nconfiguration parameter.\n\n#### Example Configuration\n\n\u003E ```\n\u003E set_real_ip_from  192.168.1.0/24;\n\u003E set_real_ip_from  192.168.2.1;\n\u003E set_real_ip_from  2001:0db8::/32;\n\u003E real_ip_header    X-Forwarded-For;\n\u003E real_ip_recursive on;\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`set_real_ip_from\n    address |\n    CIDR |\n    unix:;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nDefines trusted addresses that are known to send correct\nreplacement addresses.\nIf the special value `unix:` is specified,\nall UNIX-domain sockets will be trusted.\nTrusted addresses may also be specified using a hostname (1.13.1).\n\n\u003E IPv6 addresses are supported starting from versions 1.3.0 and 1.2.1.\n\nSyntax:\n`real_ip_header\n    field |\n    X-Real-IP |\n    X-Forwarded-For |\n    proxy_protocol;`\n\nDefault:\n\n\n```\nreal_ip_header X-Real-IP;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines the request header field\nwhose value will be used to replace the client address.\n\nThe request header field value that contains an optional port\nis also used to replace the client port (1.11.0).\nThe address and port should be specified according to\n[RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).\n\nThe `proxy_protocol` parameter (1.5.12) changes\nthe client address to the one from the PROXY protocol header.\nThe PROXY protocol must be previously enabled by setting the\n`proxy_protocol` parameter\nin the [listen](ngx_http_core_module.html#listen) directive.\n\nSyntax:\n`real_ip_recursive on | off;`\n\nDefault:\n\n\n```\nreal_ip_recursive off;\n```\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in versions 1.3.0 and 1.2.1.\n\n\n\nIf recursive search is disabled, the original client address that\nmatches one of the trusted addresses is replaced by the last\naddress sent in the request header field defined by the\n[real\\_ip\\_header](#real_ip_header) directive.\nIf recursive search is enabled, the original client address that\nmatches one of the trusted addresses is replaced by the last\nnon-trusted address sent in the request header field.\n\n#### 
Embedded Variables\n\n`$realip_remote_addr`\nkeeps the original client address (1.9.7)\n`$realip_remote_port`\nkeeps the original client port (1.11.0)",
              "role": "user"
            }
          ],
          "model": "gpt-4o-mini",
          "temperature": 0
        },
        "toolMapping": {

        }
      },
      "llmResponse": null,
      "output": [
        {
          "content": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50) of nginx!  \nRead about our journey and milestones in the [latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n## Module ngx_http_realip_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[set_real_ip_from](#set_real_ip_from)\n\n[real_ip_header](#real_ip_header)\n\n[real_ip_recursive](#real_ip_recursive)\n\n[Embedded Variables](#variables)\n\nThe `ngx_http_realip_module` module is used to change the client address and optional port to those sent in the specified header field.\n\nThis module is not built by default, it should be enabled with the `--with-http_realip_module` configuration parameter.\n\n#### Example Configuration\n\n```\nset_real_ip_from  192.168.1.0/24;\nset_real_ip_from  192.168.2.1;\nset_real_ip_from  2001:0db8::/32;\nreal_ip_header    X-Forwarded-For;\nreal_ip_recursive on;\n```\n\n#### Directives\n\nSyntax:\n`set_real_ip_from address | CIDR | unix:;`\n\nDefault:\n—\n\nContext:\n`http`, `server`, `location`\n\nDefines trusted addresses that are known to send correct replacement addresses. If the special value `unix:` is specified, all UNIX-domain sockets will be trusted. Trusted addresses may also be specified using a hostname (1.13.1).\n\n\u003E IPv6 addresses are supported starting from versions 1.3.0 and 1.2.1.\n\nSyntax:\n`real_ip_header field | X-Real-IP | X-Forwarded-For | proxy_protocol;`\n\nDefault:\n```\nreal_ip_header X-Real-IP;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines the request header field whose value will be used to replace the client address. The request header field value that contains an optional port is also used to replace the client port (1.11.0). The address and port should be specified according to [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).\n\nThe `proxy_protocol` parameter (1.5.12) changes the client address to the one from the PROXY protocol header. The PROXY protocol must be previously enabled by setting the `proxy_protocol` parameter in the [listen](ngx_http_core_module.html#listen) directive.\n\nSyntax:\n`real_ip_recursive on | off;`\n\nDefault:\n```\nreal_ip_recursive off;\n```\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in versions 1.3.0 and 1.2.1.\n\nIf recursive search is disabled, the original client address that matches one of the trusted addresses is replaced by the last address sent in the request header field defined by the [",
          "subCalls": null
        }
      ],
      "start": "2024-11-15T19:15:07.347259586Z",
      "tool": {
        "description": "Removes extra header, footer, and navigation content from the markdown version of webpages",
        "id": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner",
        "instructions": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
        "internalPrompt": null,
        "localTools": {
          "website markdown content cleaner": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner"
        },
        "modelName": "gpt-4o-mini",
        "name": "Website Markdown Content Cleaner",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/website-cleaner/tool.gpt"
        },
        "workingDir": "/otto8-tools/website-cleaner"
      },
      "toolResults": 0,
      "type": "callProgress",
      "usage": {

      }
    },
    "1731700863": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "Running sys.daemon",
      "end": "2024-11-15T19:15:07.347757247Z",
      "id": "1731700863",
      "input": "",
      "inputContext": null,
      "llmRequest": null,
      "llmResponse": null,
      "output": [
        {
          "content": "http://127.0.0.1:10806",
          "subCalls": null
        }
      ],
      "start": "2024-11-15T19:15:07.347681985Z",
      "tool": {
        "description": "Model provider for Otto8",
        "id": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8",
        "instructions": "#!sys.daemon /usr/bin/env python3 ${GPTSCRIPT_TOOL_DIR}/main.py",
        "internalPrompt": null,
        "localTools": {
          "otto8": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8"
        },
        "modelName": "gpt-4o",
        "modelProvider": true,
        "name": "Otto8",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/otto8-model-provider/tool.gpt"
        },
        "workingDir": "/otto8-tools/otto8-model-provider"
      },
      "toolCategory": "provider",
      "toolResults": 0,
      "type": "callFinish",
      "usage": {

      }
    }
  },
  "spec": {
    "synchronous": true,
    "threadName": "t1-ks1fbwp4",
    "input": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_realip_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_realip\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[set\\_real\\_ip\\_from](#set_real_ip_from)\n\n[real\\_ip\\_header](#real_ip_header)\n\n[real\\_ip\\_recursive](#real_ip_recursive)\n\n[Embedded Variables](#variables)\n\nThe `ngx_http_realip_module` module is used\nto change the client address and optional port\nto those sent in the specified header field.\n\nThis module is not built by default, it should be enabled with the\n`--with-http_realip_module`\nconfiguration parameter.\n\n#### Example Configuration\n\n\u003E ```\n\u003E set_real_ip_from  192.168.1.0/24;\n\u003E set_real_ip_from  192.168.2.1;\n\u003E set_real_ip_from  2001:0db8::/32;\n\u003E real_ip_header    X-Forwarded-For;\n\u003E real_ip_recursive on;\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`set_real_ip_from\n    address |\n    CIDR |\n    unix:;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nDefines trusted addresses that are known to send correct\nreplacement addresses.\nIf the special value `unix:` is specified,\nall UNIX-domain sockets will be trusted.\nTrusted addresses may also be specified using a hostname (1.13.1).\n\n\u003E IPv6 addresses are supported starting from versions 1.3.0 and 1.2.1.\n\nSyntax:\n`real_ip_header\n    field |\n    X-Real-IP |\n    X-Forwarded-For |\n    proxy_protocol;`\n\nDefault:\n\n\n```\nreal_ip_header X-Real-IP;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines the request header field\nwhose value will be used to replace the client address.\n\nThe request header field value that contains an optional port\nis also used to replace the client port (1.11.0).\nThe address and port should be specified according to\n[RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).\n\nThe `proxy_protocol` parameter (1.5.12) changes\nthe client address to the one from the PROXY protocol header.\nThe PROXY protocol must be previously enabled by setting the\n`proxy_protocol` parameter\nin the [listen](ngx_http_core_module.html#listen) directive.\n\nSyntax:\n`real_ip_recursive on | off;`\n\nDefault:\n\n\n```\nreal_ip_recursive off;\n```\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in versions 1.3.0 and 1.2.1.\n\n\n\nIf recursive search is disabled, the original client address that\nmatches one of the trusted addresses is replaced by the last\naddress sent in the request header field defined by the\n[real\\_ip\\_header](#real_ip_header) directive.\nIf recursive search is enabled, the original client address that\nmatches one of the trusted addresses is replaced by the last\nnon-trusted address sent in the request header field.\n\n#### Embedded 
Variables\n\n`$realip_remote_addr`\nkeeps the original client address (1.9.7)\n`$realip_remote_port`\nkeeps the original client port (1.11.0)",
    "tool": "\"website-cleaner\""
  },
  "status": {
    "state": "error",
    "output": "",
    "endTime": "2024-11-15T19:15:18Z",
    "error": "failed to run: failed calling model for completion: unexpected EOF"
  }
}
sangee2004 commented 1 week ago

@thedadams I am still seeing this issue when testing with the latest version:

 "github.com/otto8-ai/tools": "9abc7da7146cbd8448dca8ed11fde418482f2341",
  "otto": "v0.0.0-dev+2f4340e7"

Ingested all knowledge files from the website - https://nginx.org/en/docs

Agent - https://test.otto8.ai/admin/agents/a1nrs8h

3 files reported ingestion failures relating to "failed to clean website" errors.

2 of the files had this error:

Screenshot 2024-11-18 at 12 51 16 PM

Debug logs for one of them - https://test.otto8.ai/api/runs/r1c67tp/debug

{
  "frames": {
    "1731961727": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "",
      "end": "0001-01-01T00:00:00Z",
      "id": "1731961727",
      "input": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_browser_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_browser\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[ancient\\_browser](#ancient_browser)\n\n[ancient\\_browser\\_value](#ancient_browser_value)\n\n[modern\\_browser](#modern_browser)\n\n[modern\\_browser\\_value](#modern_browser_value)\n\nThe `ngx_http_browser_module` module creates variables\nwhose values depend on the value of the “User-Agent”\nrequest header field:\n\n`$modern_browser`\nequals the value set by the [modern\\_browser\\_value](#modern_browser_value) directive,\nif a browser was identified as modern;\n`$ancient_browser`\nequals the value set by the [ancient\\_browser\\_value](#ancient_browser_value) directive,\nif a browser was identified as ancient;\n`$msie`\nequals “1” if a browser was identified as MSIE of any version.\n\n#### Example Configuration\n\nChoosing an index file:\n\n\u003E ```\n\u003E modern_browser_value \"modern.\";\n\u003E\n\u003E modern_browser msie      5.5;\n\u003E modern_browser gecko     1.0.0;\n\u003E modern_browser opera     9.0;\n\u003E modern_browser safari    413;\n\u003E modern_browser konqueror 3.0;\n\u003E\n\u003E index index.${modern_browser}html index.html;\n\u003E\n\u003E ```\n\nRedirection for old browsers:\n\n\u003E ```\n\u003E modern_browser msie      5.0;\n\u003E modern_browser gecko     0.9.1;\n\u003E modern_browser opera     8.0;\n\u003E modern_browser safari    413;\n\u003E modern_browser konqueror 3.0;\n\u003E\n\u003E modern_browser unlisted;\n\u003E\n\u003E ancient_browser Links Lynx netscape4;\n\u003E\n\u003E if ($ancient_browser) {\n\u003E     rewrite ^ /ancient.html;\n\u003E }\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`ancient_browser string ...;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nIf any of the specified substrings is found in the “User-Agent”\nrequest header field, the browser will be considered ancient.\nThe special string “ `netscape4`” corresponds to the\nregular expression “ `^Mozilla/[1-4]`”.\n\nSyntax:\n`ancient_browser_value string;`\n\nDefault:\n\n\n```\nancient_browser_value 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a value for the `$ancient_browser` variables.\n\nSyntax:\n`modern_browser browser version;`\n\n`modern_browser unlisted;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a version starting from which a browser is considered modern.\nA browser can be any one of the following: `msie`,\n`gecko` (browsers based on Mozilla),\n`opera`, `safari`,\nor `konqueror`.\n\nVersions can be specified in the following formats: X, X.X, X.X.X, or X.X.X.X.\nThe maximum values for each of the format are\n4000, 4000.99, 4000.99.99, and 
4000.99.99.99, respectively.\n\nThe special value `unlisted` specifies to consider\na browser as modern if it was not listed by the\n`modern_browser` and [ancient\\_browser](#ancient_browser)\ndirectives.\nOtherwise such a browser is considered ancient.\nIf a request does not provide the “User-Agent” field\nin the header, the browser is treated as not being listed.\n\nSyntax:\n`modern_browser_value string;`\n\nDefault:\n\n\n```\nmodern_browser_value 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a value for the `$modern_browser` variables.",
      "inputContext": null,
      "llmRequest": {
        "chatCompletion": {
          "messages": [
            {
              "content": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
              "role": "system"
            },
            {
              "content": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_browser_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_browser\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[ancient\\_browser](#ancient_browser)\n\n[ancient\\_browser\\_value](#ancient_browser_value)\n\n[modern\\_browser](#modern_browser)\n\n[modern\\_browser\\_value](#modern_browser_value)\n\nThe `ngx_http_browser_module` module creates variables\nwhose values depend on the value of the “User-Agent”\nrequest header field:\n\n`$modern_browser`\nequals the value set by the [modern\\_browser\\_value](#modern_browser_value) directive,\nif a browser was identified as modern;\n`$ancient_browser`\nequals the value set by the [ancient\\_browser\\_value](#ancient_browser_value) directive,\nif a browser was identified as ancient;\n`$msie`\nequals “1” if a browser was identified as MSIE of any version.\n\n#### Example Configuration\n\nChoosing an index file:\n\n\u003E ```\n\u003E modern_browser_value \"modern.\";\n\u003E\n\u003E modern_browser msie      5.5;\n\u003E modern_browser gecko     1.0.0;\n\u003E modern_browser opera     9.0;\n\u003E modern_browser safari    413;\n\u003E modern_browser konqueror 3.0;\n\u003E\n\u003E index index.${modern_browser}html index.html;\n\u003E\n\u003E ```\n\nRedirection for old browsers:\n\n\u003E ```\n\u003E modern_browser msie      5.0;\n\u003E modern_browser gecko     0.9.1;\n\u003E modern_browser opera     8.0;\n\u003E modern_browser safari    413;\n\u003E modern_browser konqueror 3.0;\n\u003E\n\u003E modern_browser unlisted;\n\u003E\n\u003E ancient_browser Links Lynx netscape4;\n\u003E\n\u003E if ($ancient_browser) {\n\u003E     rewrite ^ /ancient.html;\n\u003E }\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`ancient_browser string ...;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nIf any of the specified substrings is found in the “User-Agent”\nrequest header field, the browser will be considered ancient.\nThe special string “ `netscape4`” corresponds to the\nregular expression “ `^Mozilla/[1-4]`”.\n\nSyntax:\n`ancient_browser_value string;`\n\nDefault:\n\n\n```\nancient_browser_value 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a value for the `$ancient_browser` variables.\n\nSyntax:\n`modern_browser browser version;`\n\n`modern_browser unlisted;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a version starting from which a browser is considered modern.\nA browser can be any one of the following: `msie`,\n`gecko` (browsers based on Mozilla),\n`opera`, `safari`,\nor `konqueror`.\n\nVersions can be specified in the following formats: X, X.X, X.X.X, or X.X.X.X.\nThe maximum values for each of the format are\n4000, 4000.99, 4000.99.99, 
and 4000.99.99.99, respectively.\n\nThe special value `unlisted` specifies to consider\na browser as modern if it was not listed by the\n`modern_browser` and [ancient\\_browser](#ancient_browser)\ndirectives.\nOtherwise such a browser is considered ancient.\nIf a request does not provide the “User-Agent” field\nin the header, the browser is treated as not being listed.\n\nSyntax:\n`modern_browser_value string;`\n\nDefault:\n\n\n```\nmodern_browser_value 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a value for the `$modern_browser` variables.",
              "role": "user"
            }
          ],
          "model": "gpt-4o-mini",
          "temperature": 0
        },
        "toolMapping": {

        }
      },
      "llmResponse": null,
      "output": [
        {
          "content": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50) of nginx!  \nRead about our journey and milestones in the [latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n## Module ngx_http_browser_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[ancient_browser](#ancient_browser)\n\n[ancient_browser_value](#ancient_browser_value)\n\n[modern_browser](#modern_browser)\n\n[modern_browser_value](#modern_browser_value)\n\nThe `ngx_http_browser_module` module creates variables whose values depend on the value of the “User-Agent” request header field:\n\n`$modern_browser` equals the value set by the [modern_browser_value](#modern_browser_value) directive, if a browser was identified as modern;  \n`$ancient_browser` equals the value set by the [ancient_browser_value](#ancient_browser_value) directive, if a browser was identified as ancient;  \n`$msie` equals “1” if a browser was identified as MSIE of any version.\n\n#### Example Configuration\n\nChoosing an index file:\n\n```\nmodern_browser_value \"modern.\";\n\nmodern_browser msie      5.5;\nmodern_browser gecko     1.0.0;\nmodern_browser opera     9.0;\nmodern_browser safari    413;\nmodern_browser konqueror 3.0;\n\nindex index.${modern_browser}html index.html;\n```\n\nRedirection for old browsers:\n\n```\nmodern_browser msie      5.0;\nmodern_browser gecko     0.9.1;\nmodern_browser opera     8.0;\nmodern_browser safari    413;\nmodern_browser konqueror 3.0;\n\nmodern_browser unlisted;\n\nancient_browser Links Lynx netscape4;\n\nif ($ancient_browser) {\n    rewrite ^ /ancient.html;\n}\n```\n\n#### Directives\n\nSyntax:  \n`ancient_browser string ...;`\n\nDefault:  \n—\n\nContext:  \n`http`, `server`, `location`\n\nIf any of the specified substrings is found in the “User-Agent” request header field, the browser will be considered ancient. The special string “`netscape4`” corresponds to the regular expression “`^Mozilla/[1-4]`”.\n\nSyntax:  \n`ancient_browser_value string;`\n\nDefault:  \n```\nancient_browser_value 1;\n```\n\nContext:  \n`http`, `server`, `location`\n\nSets a value for the `$ancient_browser` variables.\n\nSyntax:  \n`modern_browser browser version;`  \n`modern_browser unlisted;`\n\nDefault:  \n—\n\nContext:  \n`http`, `server`, `location`\n\nSpecifies a version starting from which a browser is considered modern. A browser can be any one of the following: `msie`, `gecko` (",
          "subCalls": null
        }
      ],
      "start": "2024-11-18T20:19:56.427931993Z",
      "tool": {
        "description": "Removes extra header, footer, and navigation content from the markdown version of webpages",
        "id": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner",
        "instructions": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
        "internalPrompt": null,
        "localTools": {
          "website markdown content cleaner": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner"
        },
        "modelName": "gpt-4o-mini",
        "name": "Website Markdown Content Cleaner",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/website-cleaner/tool.gpt"
        },
        "workingDir": "/otto8-tools/website-cleaner"
      },
      "toolResults": 0,
      "type": "callProgress",
      "usage": {

      }
    },
    "1731961728": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "Running sys.daemon",
      "end": "2024-11-18T20:19:56.428184185Z",
      "id": "1731961728",
      "input": "",
      "inputContext": null,
      "llmRequest": null,
      "llmResponse": null,
      "output": [
        {
          "content": "http://127.0.0.1:11184",
          "subCalls": null
        }
      ],
      "start": "2024-11-18T20:19:56.428133925Z",
      "tool": {
        "description": "Model provider for Otto8",
        "id": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8",
        "instructions": "#!sys.daemon /usr/bin/env python3 ${GPTSCRIPT_TOOL_DIR}/main.py",
        "internalPrompt": null,
        "localTools": {
          "otto8": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8"
        },
        "modelName": "gpt-4o",
        "modelProvider": true,
        "name": "Otto8",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/otto8-model-provider/tool.gpt"
        },
        "workingDir": "/otto8-tools/otto8-model-provider"
      },
      "toolCategory": "provider",
      "toolResults": 0,
      "type": "callFinish",
      "usage": {

      }
    }
  },
  "spec": {
    "synchronous": true,
    "threadName": "t1-ks1r25px",
    "input": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_browser_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_browser\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[ancient\\_browser](#ancient_browser)\n\n[ancient\\_browser\\_value](#ancient_browser_value)\n\n[modern\\_browser](#modern_browser)\n\n[modern\\_browser\\_value](#modern_browser_value)\n\nThe `ngx_http_browser_module` module creates variables\nwhose values depend on the value of the “User-Agent”\nrequest header field:\n\n`$modern_browser`\nequals the value set by the [modern\\_browser\\_value](#modern_browser_value) directive,\nif a browser was identified as modern;\n`$ancient_browser`\nequals the value set by the [ancient\\_browser\\_value](#ancient_browser_value) directive,\nif a browser was identified as ancient;\n`$msie`\nequals “1” if a browser was identified as MSIE of any version.\n\n#### Example Configuration\n\nChoosing an index file:\n\n\u003E ```\n\u003E modern_browser_value \"modern.\";\n\u003E\n\u003E modern_browser msie      5.5;\n\u003E modern_browser gecko     1.0.0;\n\u003E modern_browser opera     9.0;\n\u003E modern_browser safari    413;\n\u003E modern_browser konqueror 3.0;\n\u003E\n\u003E index index.${modern_browser}html index.html;\n\u003E\n\u003E ```\n\nRedirection for old browsers:\n\n\u003E ```\n\u003E modern_browser msie      5.0;\n\u003E modern_browser gecko     0.9.1;\n\u003E modern_browser opera     8.0;\n\u003E modern_browser safari    413;\n\u003E modern_browser konqueror 3.0;\n\u003E\n\u003E modern_browser unlisted;\n\u003E\n\u003E ancient_browser Links Lynx netscape4;\n\u003E\n\u003E if ($ancient_browser) {\n\u003E     rewrite ^ /ancient.html;\n\u003E }\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`ancient_browser string ...;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nIf any of the specified substrings is found in the “User-Agent”\nrequest header field, the browser will be considered ancient.\nThe special string “ `netscape4`” corresponds to the\nregular expression “ `^Mozilla/[1-4]`”.\n\nSyntax:\n`ancient_browser_value string;`\n\nDefault:\n\n\n```\nancient_browser_value 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a value for the `$ancient_browser` variables.\n\nSyntax:\n`modern_browser browser version;`\n\n`modern_browser unlisted;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a version starting from which a browser is considered modern.\nA browser can be any one of the following: `msie`,\n`gecko` (browsers based on Mozilla),\n`opera`, `safari`,\nor `konqueror`.\n\nVersions can be specified in the following formats: X, X.X, X.X.X, or X.X.X.X.\nThe maximum values for each of the format are\n4000, 4000.99, 4000.99.99, and 
4000.99.99.99, respectively.\n\nThe special value `unlisted` specifies to consider\na browser as modern if it was not listed by the\n`modern_browser` and [ancient\\_browser](#ancient_browser)\ndirectives.\nOtherwise such a browser is considered ancient.\nIf a request does not provide the “User-Agent” field\nin the header, the browser is treated as not being listed.\n\nSyntax:\n`modern_browser_value string;`\n\nDefault:\n\n\n```\nmodern_browser_value 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a value for the `$modern_browser` variables.",
    "tool": "\"website-cleaner\""
  },
  "status": {
    "state": "error",
    "output": "",
    "endTime": "2024-11-18T20:20:08Z",
    "error": "failed to run: failed calling model for completion: unexpected EOF"
  }
}

One of them had this error:

Screenshot 2024-11-18 at 12 54 15 PM

Debug logs - https://test.otto8.ai/api/runs/r1l7hl8/debug

{
  "frames": {
    "1731961917": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "",
      "end": "0001-01-01T00:00:00Z",
      "id": "1731961917",
      "input": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_grpc_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_grpc\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[grpc\\_bind](#grpc_bind)\n\n[grpc\\_buffer\\_size](#grpc_buffer_size)\n\n[grpc\\_connect\\_timeout](#grpc_connect_timeout)\n\n[grpc\\_hide\\_header](#grpc_hide_header)\n\n[grpc\\_ignore\\_headers](#grpc_ignore_headers)\n\n[grpc\\_intercept\\_errors](#grpc_intercept_errors)\n\n[grpc\\_next\\_upstream](#grpc_next_upstream)\n\n[grpc\\_next\\_upstream\\_timeout](#grpc_next_upstream_timeout)\n\n[grpc\\_next\\_upstream\\_tries](#grpc_next_upstream_tries)\n\n[grpc\\_pass](#grpc_pass)\n\n[grpc\\_pass\\_header](#grpc_pass_header)\n\n[grpc\\_read\\_timeout](#grpc_read_timeout)\n\n[grpc\\_send\\_timeout](#grpc_send_timeout)\n\n[grpc\\_set\\_header](#grpc_set_header)\n\n[grpc\\_socket\\_keepalive](#grpc_socket_keepalive)\n\n[grpc\\_ssl\\_certificate](#grpc_ssl_certificate)\n\n[grpc\\_ssl\\_certificate\\_key](#grpc_ssl_certificate_key)\n\n[grpc\\_ssl\\_ciphers](#grpc_ssl_ciphers)\n\n[grpc\\_ssl\\_conf\\_command](#grpc_ssl_conf_command)\n\n[grpc\\_ssl\\_crl](#grpc_ssl_crl)\n\n[grpc\\_ssl\\_name](#grpc_ssl_name)\n\n[grpc\\_ssl\\_password\\_file](#grpc_ssl_password_file)\n\n[grpc\\_ssl\\_protocols](#grpc_ssl_protocols)\n\n[grpc\\_ssl\\_server\\_name](#grpc_ssl_server_name)\n\n[grpc\\_ssl\\_session\\_reuse](#grpc_ssl_session_reuse)\n\n[grpc\\_ssl\\_trusted\\_certificate](#grpc_ssl_trusted_certificate)\n\n[grpc\\_ssl\\_verify](#grpc_ssl_verify)\n\n[grpc\\_ssl\\_verify\\_depth](#grpc_ssl_verify_depth)\n\nThe `ngx_http_grpc_module` module allows passing requests\nto a gRPC server (1.13.10).\nThe module requires the\n[ngx\\_http\\_v2\\_module](ngx_http_v2_module.html) module.\n\n#### Example Configuration\n\n\u003E ```\n\u003E server {\n\u003E     listen 9000;\n\u003E\n\u003E     http2 on;\n\u003E\n\u003E     location / {\n\u003E         grpc_pass 127.0.0.1:9000;\n\u003E     }\n\u003E }\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`grpc_bind\n    address\n    [transparent ] |\n    off;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nMakes outgoing connections to a gRPC server originate\nfrom the specified local IP address with an optional port.\nParameter value can contain variables.\nThe special value `off` cancels the effect\nof the `grpc_bind` directive\ninherited from the previous configuration level, which allows the\nsystem to auto-assign the local IP address and port.\n\nThe `transparent` parameter allows\noutgoing connections to a gRPC server originate\nfrom a non-local IP address,\nfor example, from a real IP address of a client:\n\n\u003E ```\n\u003E grpc_bind $remote_addr transparent;\n\u003E\n\u003E ```\n\n\nIn order for this 
parameter to work,\nit is usually necessary to run nginx worker processes with the\n[superuser](../ngx_core_module.html#user) privileges.\nOn Linux it is not required as if\nthe `transparent` parameter is specified, worker processes\ninherit the `CAP_NET_RAW` capability from the master process.\nIt is also necessary to configure kernel routing table\nto intercept network traffic from the gRPC server.\n\nSyntax:\n`grpc_buffer_size size;`\n\nDefault:\n\n\n```\ngrpc_buffer_size 4k|8k;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets the `size` of the buffer used for reading the response\nreceived from the gRPC server.\nThe response is passed to the client synchronously, as soon as it is received.\n\nSyntax:\n`grpc_connect_timeout time;`\n\nDefault:\n\n\n```\ngrpc_connect_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines a timeout for establishing a connection with a gRPC server.\nIt should be noted that this timeout cannot usually exceed 75 seconds.\n\nSyntax:\n`grpc_hide_header field;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nBy default,\nnginx does not pass the header fields “Date”,\n“Server”, and\n“X-Accel-...” from the response of a gRPC\nserver to a client.\nThe `grpc_hide_header` directive sets additional fields\nthat will not be passed.\nIf, on the contrary, the passing of fields needs to be permitted,\nthe [grpc\\_pass\\_header](#grpc_pass_header) directive can be used.\n\nSyntax:\n`grpc_ignore_headers field ...;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nDisables processing of certain response header fields from the gRPC server.\nThe following fields can be ignored: “X-Accel-Redirect”\nand “X-Accel-Charset”.\n\nIf not disabled, processing of these header fields has the following\neffect:\n\n- “X-Accel-Redirect” performs an\n[internal\\\nredirect](ngx_http_core_module.html#internal) to the specified URI;\n\n- “X-Accel-Charset” sets the desired\n[charset](ngx_http_charset_module.html#charset)\nof a response.\n\n\nSyntax:\n`grpc_intercept_errors on | off;`\n\nDefault:\n\n\n```\ngrpc_intercept_errors off;\n```\n\nContext:\n`http`, `server`, `location`\n\nDetermines whether gRPC server responses with codes greater than or equal\nto 300 should be passed to a client\nor be intercepted and redirected to nginx for processing\nwith the [error\\_page](ngx_http_core_module.html#error_page) directive.\n\nSyntax:\n`grpc_next_upstream\n    error |\n    timeout |\n    invalid_header |\n    http_500 |\n    http_502 |\n    http_503 |\n    http_504 |\n    http_403 |\n    http_404 |\n    http_429 |\n    non_idempotent |\n    off\n    ...;`\n\nDefault:\n\n\n```\ngrpc_next_upstream error timeout;\n```\n\nContext:\n`http`, `server`, `location`\n\nSpecifies in which cases a request should be passed to the next server:\n\n`error`an error occurred while establishing a connection with the\nserver, passing a request to it, or reading the response header;`timeout`a timeout has occurred while establishing a connection with the\nserver, passing a request to it, or reading the response header;`invalid_header`a server returned an empty or invalid response;`http_500`a server returned a response with the code 500;`http_502`a server returned a response with the code 502;`http_503`a server returned a response with the code 503;`http_504`a server returned a response with the code 504;`http_403`a server returned a response with the code 403;`http_404`a server returned a response with the code 404;`http_429`a server returned a response with the 
code 429;`non_idempotent`normally, requests with a\n[non-idempotent](https://datatracker.ietf.org/doc/html/rfc7231#section-4.2.2)\nmethod\n( `POST`, `LOCK`, `PATCH`)\nare not passed to the next server\nif a request has been sent to an upstream server;\nenabling this option explicitly allows retrying such requests;\n`off`disables passing a request to the next server.\n\nOne should bear in mind that passing a request to the next server is\nonly possible if nothing has been sent to a client yet.\nThat is, if an error or timeout occurs in the middle of the\ntransferring of a response, fixing this is impossible.\n\nThe directive also defines what is considered an\n[unsuccessful\\\nattempt](ngx_http_upstream_module.html#max_fails) of communication with a server.\nThe cases of `error`, `timeout` and\n`invalid_header` are always considered unsuccessful attempts,\neven if they are not specified in the directive.\nThe cases of `http_500`, `http_502`,\n`http_503`, `http_504`,\nand `http_429` are\nconsidered unsuccessful attempts only if they are specified in the directive.\nThe cases of `http_403` and `http_404`\nare never considered unsuccessful attempts.\n\nPassing a request to the next server can be limited by\n[the number of tries](#grpc_next_upstream_tries)\nand by [time](#grpc_next_upstream_timeout).\n\nSyntax:\n`grpc_next_upstream_timeout time;`\n\nDefault:\n\n\n```\ngrpc_next_upstream_timeout 0;\n```\n\nContext:\n`http`, `server`, `location`\n\nLimits the time during which a request can be passed to the\n[next server](#grpc_next_upstream).\nThe `0` value turns off this limitation.\n\nSyntax:\n`grpc_next_upstream_tries number;`\n\nDefault:\n\n\n```\ngrpc_next_upstream_tries 0;\n```\n\nContext:\n`http`, `server`, `location`\n\nLimits the number of possible tries for passing a request to the\n[next server](#grpc_next_upstream).\nThe `0` value turns off this limitation.\n\nSyntax:\n`grpc_pass address;`\n\nDefault:\n\n\n—\n\n\nContext:\n`location`, `if in location`\n\nSets the gRPC server address.\nThe address can be specified as a domain name or IP address,\nand a port:\n\n\u003E ```\n\u003E grpc_pass localhost:9000;\n\u003E\n\u003E ```\n\n\nor as a UNIX-domain socket path:\n\n\u003E ```\n\u003E grpc_pass unix:/tmp/grpc.socket;\n\u003E\n\u003E ```\n\n\nAlternatively, the “ `grpc://`” scheme can be used:\n\n\u003E ```\n\u003E grpc_pass grpc://127.0.0.1:9000;\n\u003E\n\u003E ```\n\n\nTo use gRPC over SSL, the “ `grpcs://`” scheme should be used:\n\n\u003E ```\n\u003E grpc_pass grpcs://127.0.0.1:443;\n\u003E\n\u003E ```\n\nIf a domain name resolves to several addresses, all of them will be\nused in a round-robin fashion.\nIn addition, an address can be specified as a\n[server group](ngx_http_upstream_module.html).\n\nParameter value can contain variables (1.17.8).\nIn this case, if an address is specified as a domain name,\nthe name is searched among the described\n[server groups](ngx_http_upstream_module.html),\nand, if not found, is determined using a\n[resolver](ngx_http_core_module.html#resolver).\n\nSyntax:\n`grpc_pass_header field;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nPermits passing [otherwise disabled](#grpc_hide_header) header\nfields from a gRPC server to a client.\n\nSyntax:\n`grpc_read_timeout time;`\n\nDefault:\n\n\n```\ngrpc_read_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines a timeout for reading a response from the gRPC server.\nThe timeout is set only between two successive read operations,\nnot for the transmission of the whole 
response.\nIf the gRPC server does not transmit anything within this time,\nthe connection is closed.\n\nSyntax:\n`grpc_send_timeout time;`\n\nDefault:\n\n\n```\ngrpc_send_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a timeout for transmitting a request to the gRPC server.\nThe timeout is set only between two successive write operations,\nnot for the transmission of the whole request.\nIf the gRPC server does not receive anything within this time,\nthe connection is closed.\n\nSyntax:\n`grpc_set_header field value;`\n\nDefault:\n\n\n```\ngrpc_set_header Content-Length $content_length;\n```\n\nContext:\n`http`, `server`, `location`\n\nAllows redefining or appending fields to the request header\n[passed](#grpc_pass_request_headers) to the gRPC server.\nThe `value` can contain text, variables, and their combinations.\nThese directives are inherited from the previous configuration level\nif and only if there are no `grpc_set_header` directives\ndefined on the current level.\n\nIf the value of a header field is an empty string then this\nfield will not be passed to a gRPC server:\n\n\u003E ```\n\u003E grpc_set_header Accept-Encoding \"\";\n\u003E\n\u003E ```\n\nSyntax:\n`grpc_socket_keepalive on | off;`\n\nDefault:\n\n\n```\ngrpc_socket_keepalive off;\n```\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in version 1.15.6.\n\n\nConfigures the “TCP keepalive” behavior\nfor outgoing connections to a gRPC server.\nBy default, the operating system’s settings are in effect for the socket.\nIf the directive is set to the value “ `on`”, the\n`SO_KEEPALIVE` socket option is turned on for the socket.\n\nSyntax:\n`grpc_ssl_certificate file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with the certificate in the PEM format\nused for authentication to a gRPC SSL server.\n\nSince version 1.21.0, variables can be used in the `file` name.\n\nSyntax:\n`grpc_ssl_certificate_key file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with the secret key in the PEM format\nused for authentication to a gRPC SSL server.\n\nThe value\n`engine`: `name`: `id`\ncan be specified instead of the `file`,\nwhich loads a secret key with a specified `id`\nfrom the OpenSSL engine `name`.\n\nSince version 1.21.0, variables can be used in the `file` name.\n\nSyntax:\n`grpc_ssl_ciphers ciphers;`\n\nDefault:\n\n\n```\ngrpc_ssl_ciphers DEFAULT;\n```\n\nContext:\n`http`, `server`, `location`\n\nSpecifies the enabled ciphers for requests to a gRPC SSL server.\nThe ciphers are specified in the format understood by the OpenSSL library.\n\nThe full list can be viewed using the\n“ `openssl ciphers`” command.\n\nSyntax:\n`grpc_ssl_conf_command name value;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in version 1.19.4.\n\n\nSets arbitrary OpenSSL configuration\n[commands](https://www.openssl.org/docs/man1.1.1/man3/SSL_CONF_cmd.html)\nwhen establishing a connection with the gRPC SSL server.\n\n\u003E The directive is supported when using OpenSSL 1.0.2 or higher.\n\nSeveral `grpc_ssl_conf_command` directives\ncan be specified on the same level.\nThese directives are inherited from the previous configuration level\nif and only if there are\nno `grpc_ssl_conf_command` directives\ndefined on the current level.\n\n\u003E Note that configuring OpenSSL directly\n\u003E might result in unexpected behavior.\n\nSyntax:\n`grpc_ssl_crl file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, 
`server`, `location`\n\nSpecifies a `file` with revoked certificates (CRL)\nin the PEM format used to [verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_name name;`\n\nDefault:\n\n\n```\ngrpc_ssl_name host from grpc_pass;\n```\n\nContext:\n`http`, `server`, `location`\n\nAllows overriding the server name used to\n[verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server and to be\n[passed through SNI](#grpc_ssl_server_name)\nwhen establishing a connection with the gRPC SSL server.\n\nBy default, the host part from [grpc\\_pass](#grpc_pass) is used.\n\nSyntax:\n`grpc_ssl_password_file file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with passphrases for\n[secret keys](#grpc_ssl_certificate_key)\nwhere each passphrase is specified on a separate line.\nPassphrases are tried in turn when loading the key.\n\nSyntax:\n`grpc_ssl_protocols\n    [SSLv2]\n    [SSLv3]\n    [TLSv1]\n    [TLSv1.1]\n    [TLSv1.2]\n    [TLSv1.3];`\n\nDefault:\n\n\n```\ngrpc_ssl_protocols TLSv1 TLSv1.1 TLSv1.2 TLSv1.3;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables the specified protocols for requests to a gRPC SSL server.\n\n\u003E The `TLSv1.3` parameter is used by default\n\u003E since 1.23.4.\n\nSyntax:\n`grpc_ssl_server_name on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_server_name off;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables or disables passing of the server name through\n[TLS\\\nServer Name Indication extension](http://en.wikipedia.org/wiki/Server_Name_Indication) (SNI, RFC 6066)\nwhen establishing a connection with the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_session_reuse on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_session_reuse on;\n```\n\nContext:\n`http`, `server`, `location`\n\nDetermines whether SSL sessions can be reused when working with\nthe gRPC server.\nIf the errors\n“ `SSL3_GET_FINISHED:digest check failed`”\nappear in the logs, try disabling session reuse.\n\nSyntax:\n`grpc_ssl_trusted_certificate file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with trusted CA certificates in the PEM format\nused to [verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_verify on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_verify off;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables or disables verification of the gRPC SSL server certificate.\n\nSyntax:\n`grpc_ssl_verify_depth number;`\n\nDefault:\n\n\n```\ngrpc_ssl_verify_depth 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets the verification depth in the gRPC SSL server certificates chain.",
      "inputContext": null,
      "llmRequest": {
        "chatCompletion": {
          "messages": [
            {
              "content": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
              "role": "system"
            },
            {
              "content": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_grpc_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_grpc\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[grpc\\_bind](#grpc_bind)\n\n[grpc\\_buffer\\_size](#grpc_buffer_size)\n\n[grpc\\_connect\\_timeout](#grpc_connect_timeout)\n\n[grpc\\_hide\\_header](#grpc_hide_header)\n\n[grpc\\_ignore\\_headers](#grpc_ignore_headers)\n\n[grpc\\_intercept\\_errors](#grpc_intercept_errors)\n\n[grpc\\_next\\_upstream](#grpc_next_upstream)\n\n[grpc\\_next\\_upstream\\_timeout](#grpc_next_upstream_timeout)\n\n[grpc\\_next\\_upstream\\_tries](#grpc_next_upstream_tries)\n\n[grpc\\_pass](#grpc_pass)\n\n[grpc\\_pass\\_header](#grpc_pass_header)\n\n[grpc\\_read\\_timeout](#grpc_read_timeout)\n\n[grpc\\_send\\_timeout](#grpc_send_timeout)\n\n[grpc\\_set\\_header](#grpc_set_header)\n\n[grpc\\_socket\\_keepalive](#grpc_socket_keepalive)\n\n[grpc\\_ssl\\_certificate](#grpc_ssl_certificate)\n\n[grpc\\_ssl\\_certificate\\_key](#grpc_ssl_certificate_key)\n\n[grpc\\_ssl\\_ciphers](#grpc_ssl_ciphers)\n\n[grpc\\_ssl\\_conf\\_command](#grpc_ssl_conf_command)\n\n[grpc\\_ssl\\_crl](#grpc_ssl_crl)\n\n[grpc\\_ssl\\_name](#grpc_ssl_name)\n\n[grpc\\_ssl\\_password\\_file](#grpc_ssl_password_file)\n\n[grpc\\_ssl\\_protocols](#grpc_ssl_protocols)\n\n[grpc\\_ssl\\_server\\_name](#grpc_ssl_server_name)\n\n[grpc\\_ssl\\_session\\_reuse](#grpc_ssl_session_reuse)\n\n[grpc\\_ssl\\_trusted\\_certificate](#grpc_ssl_trusted_certificate)\n\n[grpc\\_ssl\\_verify](#grpc_ssl_verify)\n\n[grpc\\_ssl\\_verify\\_depth](#grpc_ssl_verify_depth)\n\nThe `ngx_http_grpc_module` module allows passing requests\nto a gRPC server (1.13.10).\nThe module requires the\n[ngx\\_http\\_v2\\_module](ngx_http_v2_module.html) module.\n\n#### Example Configuration\n\n\u003E ```\n\u003E server {\n\u003E     listen 9000;\n\u003E\n\u003E     http2 on;\n\u003E\n\u003E     location / {\n\u003E         grpc_pass 127.0.0.1:9000;\n\u003E     }\n\u003E }\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`grpc_bind\n    address\n    [transparent ] |\n    off;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nMakes outgoing connections to a gRPC server originate\nfrom the specified local IP address with an optional port.\nParameter value can contain variables.\nThe special value `off` cancels the effect\nof the `grpc_bind` directive\ninherited from the previous configuration level, which allows the\nsystem to auto-assign the local IP address and port.\n\nThe `transparent` parameter allows\noutgoing connections to a gRPC server originate\nfrom a non-local IP address,\nfor example, from a real IP address of a client:\n\n\u003E ```\n\u003E grpc_bind $remote_addr transparent;\n\u003E\n\u003E ```\n\n\nIn order 
for this parameter to work,\nit is usually necessary to run nginx worker processes with the\n[superuser](../ngx_core_module.html#user) privileges.\nOn Linux it is not required as if\nthe `transparent` parameter is specified, worker processes\ninherit the `CAP_NET_RAW` capability from the master process.\nIt is also necessary to configure kernel routing table\nto intercept network traffic from the gRPC server.\n\nSyntax:\n`grpc_buffer_size size;`\n\nDefault:\n\n\n```\ngrpc_buffer_size 4k|8k;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets the `size` of the buffer used for reading the response\nreceived from the gRPC server.\nThe response is passed to the client synchronously, as soon as it is received.\n\nSyntax:\n`grpc_connect_timeout time;`\n\nDefault:\n\n\n```\ngrpc_connect_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines a timeout for establishing a connection with a gRPC server.\nIt should be noted that this timeout cannot usually exceed 75 seconds.\n\nSyntax:\n`grpc_hide_header field;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nBy default,\nnginx does not pass the header fields “Date”,\n“Server”, and\n“X-Accel-...” from the response of a gRPC\nserver to a client.\nThe `grpc_hide_header` directive sets additional fields\nthat will not be passed.\nIf, on the contrary, the passing of fields needs to be permitted,\nthe [grpc\\_pass\\_header](#grpc_pass_header) directive can be used.\n\nSyntax:\n`grpc_ignore_headers field ...;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nDisables processing of certain response header fields from the gRPC server.\nThe following fields can be ignored: “X-Accel-Redirect”\nand “X-Accel-Charset”.\n\nIf not disabled, processing of these header fields has the following\neffect:\n\n- “X-Accel-Redirect” performs an\n[internal\\\nredirect](ngx_http_core_module.html#internal) to the specified URI;\n\n- “X-Accel-Charset” sets the desired\n[charset](ngx_http_charset_module.html#charset)\nof a response.\n\n\nSyntax:\n`grpc_intercept_errors on | off;`\n\nDefault:\n\n\n```\ngrpc_intercept_errors off;\n```\n\nContext:\n`http`, `server`, `location`\n\nDetermines whether gRPC server responses with codes greater than or equal\nto 300 should be passed to a client\nor be intercepted and redirected to nginx for processing\nwith the [error\\_page](ngx_http_core_module.html#error_page) directive.\n\nSyntax:\n`grpc_next_upstream\n    error |\n    timeout |\n    invalid_header |\n    http_500 |\n    http_502 |\n    http_503 |\n    http_504 |\n    http_403 |\n    http_404 |\n    http_429 |\n    non_idempotent |\n    off\n    ...;`\n\nDefault:\n\n\n```\ngrpc_next_upstream error timeout;\n```\n\nContext:\n`http`, `server`, `location`\n\nSpecifies in which cases a request should be passed to the next server:\n\n`error`an error occurred while establishing a connection with the\nserver, passing a request to it, or reading the response header;`timeout`a timeout has occurred while establishing a connection with the\nserver, passing a request to it, or reading the response header;`invalid_header`a server returned an empty or invalid response;`http_500`a server returned a response with the code 500;`http_502`a server returned a response with the code 502;`http_503`a server returned a response with the code 503;`http_504`a server returned a response with the code 504;`http_403`a server returned a response with the code 403;`http_404`a server returned a response with the code 404;`http_429`a server returned a response 
with the code 429;`non_idempotent`normally, requests with a\n[non-idempotent](https://datatracker.ietf.org/doc/html/rfc7231#section-4.2.2)\nmethod\n( `POST`, `LOCK`, `PATCH`)\nare not passed to the next server\nif a request has been sent to an upstream server;\nenabling this option explicitly allows retrying such requests;\n`off`disables passing a request to the next server.\n\nOne should bear in mind that passing a request to the next server is\nonly possible if nothing has been sent to a client yet.\nThat is, if an error or timeout occurs in the middle of the\ntransferring of a response, fixing this is impossible.\n\nThe directive also defines what is considered an\n[unsuccessful\\\nattempt](ngx_http_upstream_module.html#max_fails) of communication with a server.\nThe cases of `error`, `timeout` and\n`invalid_header` are always considered unsuccessful attempts,\neven if they are not specified in the directive.\nThe cases of `http_500`, `http_502`,\n`http_503`, `http_504`,\nand `http_429` are\nconsidered unsuccessful attempts only if they are specified in the directive.\nThe cases of `http_403` and `http_404`\nare never considered unsuccessful attempts.\n\nPassing a request to the next server can be limited by\n[the number of tries](#grpc_next_upstream_tries)\nand by [time](#grpc_next_upstream_timeout).\n\nSyntax:\n`grpc_next_upstream_timeout time;`\n\nDefault:\n\n\n```\ngrpc_next_upstream_timeout 0;\n```\n\nContext:\n`http`, `server`, `location`\n\nLimits the time during which a request can be passed to the\n[next server](#grpc_next_upstream).\nThe `0` value turns off this limitation.\n\nSyntax:\n`grpc_next_upstream_tries number;`\n\nDefault:\n\n\n```\ngrpc_next_upstream_tries 0;\n```\n\nContext:\n`http`, `server`, `location`\n\nLimits the number of possible tries for passing a request to the\n[next server](#grpc_next_upstream).\nThe `0` value turns off this limitation.\n\nSyntax:\n`grpc_pass address;`\n\nDefault:\n\n\n—\n\n\nContext:\n`location`, `if in location`\n\nSets the gRPC server address.\nThe address can be specified as a domain name or IP address,\nand a port:\n\n\u003E ```\n\u003E grpc_pass localhost:9000;\n\u003E\n\u003E ```\n\n\nor as a UNIX-domain socket path:\n\n\u003E ```\n\u003E grpc_pass unix:/tmp/grpc.socket;\n\u003E\n\u003E ```\n\n\nAlternatively, the “ `grpc://`” scheme can be used:\n\n\u003E ```\n\u003E grpc_pass grpc://127.0.0.1:9000;\n\u003E\n\u003E ```\n\n\nTo use gRPC over SSL, the “ `grpcs://`” scheme should be used:\n\n\u003E ```\n\u003E grpc_pass grpcs://127.0.0.1:443;\n\u003E\n\u003E ```\n\nIf a domain name resolves to several addresses, all of them will be\nused in a round-robin fashion.\nIn addition, an address can be specified as a\n[server group](ngx_http_upstream_module.html).\n\nParameter value can contain variables (1.17.8).\nIn this case, if an address is specified as a domain name,\nthe name is searched among the described\n[server groups](ngx_http_upstream_module.html),\nand, if not found, is determined using a\n[resolver](ngx_http_core_module.html#resolver).\n\nSyntax:\n`grpc_pass_header field;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nPermits passing [otherwise disabled](#grpc_hide_header) header\nfields from a gRPC server to a client.\n\nSyntax:\n`grpc_read_timeout time;`\n\nDefault:\n\n\n```\ngrpc_read_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines a timeout for reading a response from the gRPC server.\nThe timeout is set only between two successive read operations,\nnot for the transmission of 
the whole response.\nIf the gRPC server does not transmit anything within this time,\nthe connection is closed.\n\nSyntax:\n`grpc_send_timeout time;`\n\nDefault:\n\n\n```\ngrpc_send_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a timeout for transmitting a request to the gRPC server.\nThe timeout is set only between two successive write operations,\nnot for the transmission of the whole request.\nIf the gRPC server does not receive anything within this time,\nthe connection is closed.\n\nSyntax:\n`grpc_set_header field value;`\n\nDefault:\n\n\n```\ngrpc_set_header Content-Length $content_length;\n```\n\nContext:\n`http`, `server`, `location`\n\nAllows redefining or appending fields to the request header\n[passed](#grpc_pass_request_headers) to the gRPC server.\nThe `value` can contain text, variables, and their combinations.\nThese directives are inherited from the previous configuration level\nif and only if there are no `grpc_set_header` directives\ndefined on the current level.\n\nIf the value of a header field is an empty string then this\nfield will not be passed to a gRPC server:\n\n\u003E ```\n\u003E grpc_set_header Accept-Encoding \"\";\n\u003E\n\u003E ```\n\nSyntax:\n`grpc_socket_keepalive on | off;`\n\nDefault:\n\n\n```\ngrpc_socket_keepalive off;\n```\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in version 1.15.6.\n\n\nConfigures the “TCP keepalive” behavior\nfor outgoing connections to a gRPC server.\nBy default, the operating system’s settings are in effect for the socket.\nIf the directive is set to the value “ `on`”, the\n`SO_KEEPALIVE` socket option is turned on for the socket.\n\nSyntax:\n`grpc_ssl_certificate file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with the certificate in the PEM format\nused for authentication to a gRPC SSL server.\n\nSince version 1.21.0, variables can be used in the `file` name.\n\nSyntax:\n`grpc_ssl_certificate_key file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with the secret key in the PEM format\nused for authentication to a gRPC SSL server.\n\nThe value\n`engine`: `name`: `id`\ncan be specified instead of the `file`,\nwhich loads a secret key with a specified `id`\nfrom the OpenSSL engine `name`.\n\nSince version 1.21.0, variables can be used in the `file` name.\n\nSyntax:\n`grpc_ssl_ciphers ciphers;`\n\nDefault:\n\n\n```\ngrpc_ssl_ciphers DEFAULT;\n```\n\nContext:\n`http`, `server`, `location`\n\nSpecifies the enabled ciphers for requests to a gRPC SSL server.\nThe ciphers are specified in the format understood by the OpenSSL library.\n\nThe full list can be viewed using the\n“ `openssl ciphers`” command.\n\nSyntax:\n`grpc_ssl_conf_command name value;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in version 1.19.4.\n\n\nSets arbitrary OpenSSL configuration\n[commands](https://www.openssl.org/docs/man1.1.1/man3/SSL_CONF_cmd.html)\nwhen establishing a connection with the gRPC SSL server.\n\n\u003E The directive is supported when using OpenSSL 1.0.2 or higher.\n\nSeveral `grpc_ssl_conf_command` directives\ncan be specified on the same level.\nThese directives are inherited from the previous configuration level\nif and only if there are\nno `grpc_ssl_conf_command` directives\ndefined on the current level.\n\n\u003E Note that configuring OpenSSL directly\n\u003E might result in unexpected behavior.\n\nSyntax:\n`grpc_ssl_crl 
file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with revoked certificates (CRL)\nin the PEM format used to [verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_name name;`\n\nDefault:\n\n\n```\ngrpc_ssl_name host from grpc_pass;\n```\n\nContext:\n`http`, `server`, `location`\n\nAllows overriding the server name used to\n[verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server and to be\n[passed through SNI](#grpc_ssl_server_name)\nwhen establishing a connection with the gRPC SSL server.\n\nBy default, the host part from [grpc\\_pass](#grpc_pass) is used.\n\nSyntax:\n`grpc_ssl_password_file file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with passphrases for\n[secret keys](#grpc_ssl_certificate_key)\nwhere each passphrase is specified on a separate line.\nPassphrases are tried in turn when loading the key.\n\nSyntax:\n`grpc_ssl_protocols\n    [SSLv2]\n    [SSLv3]\n    [TLSv1]\n    [TLSv1.1]\n    [TLSv1.2]\n    [TLSv1.3];`\n\nDefault:\n\n\n```\ngrpc_ssl_protocols TLSv1 TLSv1.1 TLSv1.2 TLSv1.3;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables the specified protocols for requests to a gRPC SSL server.\n\n\u003E The `TLSv1.3` parameter is used by default\n\u003E since 1.23.4.\n\nSyntax:\n`grpc_ssl_server_name on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_server_name off;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables or disables passing of the server name through\n[TLS\\\nServer Name Indication extension](http://en.wikipedia.org/wiki/Server_Name_Indication) (SNI, RFC 6066)\nwhen establishing a connection with the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_session_reuse on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_session_reuse on;\n```\n\nContext:\n`http`, `server`, `location`\n\nDetermines whether SSL sessions can be reused when working with\nthe gRPC server.\nIf the errors\n“ `SSL3_GET_FINISHED:digest check failed`”\nappear in the logs, try disabling session reuse.\n\nSyntax:\n`grpc_ssl_trusted_certificate file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with trusted CA certificates in the PEM format\nused to [verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_verify on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_verify off;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables or disables verification of the gRPC SSL server certificate.\n\nSyntax:\n`grpc_ssl_verify_depth number;`\n\nDefault:\n\n\n```\ngrpc_ssl_verify_depth 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets the verification depth in the gRPC SSL server certificates chain.",
              "role": "user"
            }
          ],
          "model": "gpt-4o-mini",
          "temperature": 0
        },
        "toolMapping": {

        }
      },
      "llmResponse": null,
      "output": [
        {
          "content": "Waiting for model response...",
          "subCalls": null
        }
      ],
      "start": "2024-11-18T20:20:09.328526477Z",
      "tool": {
        "description": "Removes extra header, footer, and navigation content from the markdown version of webpages",
        "id": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner",
        "instructions": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
        "internalPrompt": null,
        "localTools": {
          "website markdown content cleaner": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner"
        },
        "modelName": "gpt-4o-mini",
        "name": "Website Markdown Content Cleaner",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/website-cleaner/tool.gpt"
        },
        "workingDir": "/otto8-tools/website-cleaner"
      },
      "toolResults": 0,
      "type": "callProgress",
      "usage": {

      }
    },
    "1731961918": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "Running sys.daemon",
      "end": "2024-11-18T20:20:09.328733939Z",
      "id": "1731961918",
      "input": "",
      "inputContext": null,
      "llmRequest": null,
      "llmResponse": null,
      "output": [
        {
          "content": "http://127.0.0.1:11184",
          "subCalls": null
        }
      ],
      "start": "2024-11-18T20:20:09.328698589Z",
      "tool": {
        "description": "Model provider for Otto8",
        "id": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8",
        "instructions": "#!sys.daemon /usr/bin/env python3 ${GPTSCRIPT_TOOL_DIR}/main.py",
        "internalPrompt": null,
        "localTools": {
          "otto8": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8"
        },
        "modelName": "gpt-4o",
        "modelProvider": true,
        "name": "Otto8",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/otto8-model-provider/tool.gpt"
        },
        "workingDir": "/otto8-tools/otto8-model-provider"
      },
      "toolCategory": "provider",
      "toolResults": 0,
      "type": "callFinish",
      "usage": {

      }
    }
  },
  "spec": {
    "synchronous": true,
    "threadName": "t1-ks1r25px",
    "input": "Celebrating [20 years](https://github.com/nginx/nginx/commit/0e8348c50)\nof nginx!\nRead about our journey and milestones in the\n[latest blog](https://blog.nginx.org/blog/celebrating-20-years-of-nginx).\n\n# [![NGINX](/img/nginx_logo.png)\\ ![NGINX](/img/nginx_logo_dark.png)](/)\n\n- english\n\n- [русский](../../../ru/docs/http/ngx_http_grpc_module.html)\n- [news](../../../news.html)\n- [about](../../../en/)\n- [download](../../../en/download.html)\n- [security](../../../en/security_advisories.html)\n- [documentation](../../../en/docs/)\n- [faq](../../../en/docs/faq.html)\n- [books](../../../en/books.html)\n- [community](../../../en/community.html)\n- [enterprise](../../../en/enterprise.html)\n- [x.com](https://x.com/nginxorg)\n- [blog](https://blog.nginx.org/)\n- [unit](https://unit.nginx.org/)\n- [njs](../../../en/docs/njs/)\n\n## Module ngx\\_http\\_grpc\\_module\n\n[Example Configuration](#example)\n\n[Directives](#directives)\n\n[grpc\\_bind](#grpc_bind)\n\n[grpc\\_buffer\\_size](#grpc_buffer_size)\n\n[grpc\\_connect\\_timeout](#grpc_connect_timeout)\n\n[grpc\\_hide\\_header](#grpc_hide_header)\n\n[grpc\\_ignore\\_headers](#grpc_ignore_headers)\n\n[grpc\\_intercept\\_errors](#grpc_intercept_errors)\n\n[grpc\\_next\\_upstream](#grpc_next_upstream)\n\n[grpc\\_next\\_upstream\\_timeout](#grpc_next_upstream_timeout)\n\n[grpc\\_next\\_upstream\\_tries](#grpc_next_upstream_tries)\n\n[grpc\\_pass](#grpc_pass)\n\n[grpc\\_pass\\_header](#grpc_pass_header)\n\n[grpc\\_read\\_timeout](#grpc_read_timeout)\n\n[grpc\\_send\\_timeout](#grpc_send_timeout)\n\n[grpc\\_set\\_header](#grpc_set_header)\n\n[grpc\\_socket\\_keepalive](#grpc_socket_keepalive)\n\n[grpc\\_ssl\\_certificate](#grpc_ssl_certificate)\n\n[grpc\\_ssl\\_certificate\\_key](#grpc_ssl_certificate_key)\n\n[grpc\\_ssl\\_ciphers](#grpc_ssl_ciphers)\n\n[grpc\\_ssl\\_conf\\_command](#grpc_ssl_conf_command)\n\n[grpc\\_ssl\\_crl](#grpc_ssl_crl)\n\n[grpc\\_ssl\\_name](#grpc_ssl_name)\n\n[grpc\\_ssl\\_password\\_file](#grpc_ssl_password_file)\n\n[grpc\\_ssl\\_protocols](#grpc_ssl_protocols)\n\n[grpc\\_ssl\\_server\\_name](#grpc_ssl_server_name)\n\n[grpc\\_ssl\\_session\\_reuse](#grpc_ssl_session_reuse)\n\n[grpc\\_ssl\\_trusted\\_certificate](#grpc_ssl_trusted_certificate)\n\n[grpc\\_ssl\\_verify](#grpc_ssl_verify)\n\n[grpc\\_ssl\\_verify\\_depth](#grpc_ssl_verify_depth)\n\nThe `ngx_http_grpc_module` module allows passing requests\nto a gRPC server (1.13.10).\nThe module requires the\n[ngx\\_http\\_v2\\_module](ngx_http_v2_module.html) module.\n\n#### Example Configuration\n\n\u003E ```\n\u003E server {\n\u003E     listen 9000;\n\u003E\n\u003E     http2 on;\n\u003E\n\u003E     location / {\n\u003E         grpc_pass 127.0.0.1:9000;\n\u003E     }\n\u003E }\n\u003E\n\u003E ```\n\n#### Directives\n\nSyntax:\n`grpc_bind\n    address\n    [transparent ] |\n    off;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nMakes outgoing connections to a gRPC server originate\nfrom the specified local IP address with an optional port.\nParameter value can contain variables.\nThe special value `off` cancels the effect\nof the `grpc_bind` directive\ninherited from the previous configuration level, which allows the\nsystem to auto-assign the local IP address and port.\n\nThe `transparent` parameter allows\noutgoing connections to a gRPC server originate\nfrom a non-local IP address,\nfor example, from a real IP address of a client:\n\n\u003E ```\n\u003E grpc_bind $remote_addr transparent;\n\u003E\n\u003E ```\n\n\nIn order for this 
parameter to work,\nit is usually necessary to run nginx worker processes with the\n[superuser](../ngx_core_module.html#user) privileges.\nOn Linux it is not required as if\nthe `transparent` parameter is specified, worker processes\ninherit the `CAP_NET_RAW` capability from the master process.\nIt is also necessary to configure kernel routing table\nto intercept network traffic from the gRPC server.\n\nSyntax:\n`grpc_buffer_size size;`\n\nDefault:\n\n\n```\ngrpc_buffer_size 4k|8k;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets the `size` of the buffer used for reading the response\nreceived from the gRPC server.\nThe response is passed to the client synchronously, as soon as it is received.\n\nSyntax:\n`grpc_connect_timeout time;`\n\nDefault:\n\n\n```\ngrpc_connect_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines a timeout for establishing a connection with a gRPC server.\nIt should be noted that this timeout cannot usually exceed 75 seconds.\n\nSyntax:\n`grpc_hide_header field;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nBy default,\nnginx does not pass the header fields “Date”,\n“Server”, and\n“X-Accel-...” from the response of a gRPC\nserver to a client.\nThe `grpc_hide_header` directive sets additional fields\nthat will not be passed.\nIf, on the contrary, the passing of fields needs to be permitted,\nthe [grpc\\_pass\\_header](#grpc_pass_header) directive can be used.\n\nSyntax:\n`grpc_ignore_headers field ...;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nDisables processing of certain response header fields from the gRPC server.\nThe following fields can be ignored: “X-Accel-Redirect”\nand “X-Accel-Charset”.\n\nIf not disabled, processing of these header fields has the following\neffect:\n\n- “X-Accel-Redirect” performs an\n[internal\\\nredirect](ngx_http_core_module.html#internal) to the specified URI;\n\n- “X-Accel-Charset” sets the desired\n[charset](ngx_http_charset_module.html#charset)\nof a response.\n\n\nSyntax:\n`grpc_intercept_errors on | off;`\n\nDefault:\n\n\n```\ngrpc_intercept_errors off;\n```\n\nContext:\n`http`, `server`, `location`\n\nDetermines whether gRPC server responses with codes greater than or equal\nto 300 should be passed to a client\nor be intercepted and redirected to nginx for processing\nwith the [error\\_page](ngx_http_core_module.html#error_page) directive.\n\nSyntax:\n`grpc_next_upstream\n    error |\n    timeout |\n    invalid_header |\n    http_500 |\n    http_502 |\n    http_503 |\n    http_504 |\n    http_403 |\n    http_404 |\n    http_429 |\n    non_idempotent |\n    off\n    ...;`\n\nDefault:\n\n\n```\ngrpc_next_upstream error timeout;\n```\n\nContext:\n`http`, `server`, `location`\n\nSpecifies in which cases a request should be passed to the next server:\n\n`error`an error occurred while establishing a connection with the\nserver, passing a request to it, or reading the response header;`timeout`a timeout has occurred while establishing a connection with the\nserver, passing a request to it, or reading the response header;`invalid_header`a server returned an empty or invalid response;`http_500`a server returned a response with the code 500;`http_502`a server returned a response with the code 502;`http_503`a server returned a response with the code 503;`http_504`a server returned a response with the code 504;`http_403`a server returned a response with the code 403;`http_404`a server returned a response with the code 404;`http_429`a server returned a response with the 
code 429;`non_idempotent`normally, requests with a\n[non-idempotent](https://datatracker.ietf.org/doc/html/rfc7231#section-4.2.2)\nmethod\n( `POST`, `LOCK`, `PATCH`)\nare not passed to the next server\nif a request has been sent to an upstream server;\nenabling this option explicitly allows retrying such requests;\n`off`disables passing a request to the next server.\n\nOne should bear in mind that passing a request to the next server is\nonly possible if nothing has been sent to a client yet.\nThat is, if an error or timeout occurs in the middle of the\ntransferring of a response, fixing this is impossible.\n\nThe directive also defines what is considered an\n[unsuccessful\\\nattempt](ngx_http_upstream_module.html#max_fails) of communication with a server.\nThe cases of `error`, `timeout` and\n`invalid_header` are always considered unsuccessful attempts,\neven if they are not specified in the directive.\nThe cases of `http_500`, `http_502`,\n`http_503`, `http_504`,\nand `http_429` are\nconsidered unsuccessful attempts only if they are specified in the directive.\nThe cases of `http_403` and `http_404`\nare never considered unsuccessful attempts.\n\nPassing a request to the next server can be limited by\n[the number of tries](#grpc_next_upstream_tries)\nand by [time](#grpc_next_upstream_timeout).\n\nSyntax:\n`grpc_next_upstream_timeout time;`\n\nDefault:\n\n\n```\ngrpc_next_upstream_timeout 0;\n```\n\nContext:\n`http`, `server`, `location`\n\nLimits the time during which a request can be passed to the\n[next server](#grpc_next_upstream).\nThe `0` value turns off this limitation.\n\nSyntax:\n`grpc_next_upstream_tries number;`\n\nDefault:\n\n\n```\ngrpc_next_upstream_tries 0;\n```\n\nContext:\n`http`, `server`, `location`\n\nLimits the number of possible tries for passing a request to the\n[next server](#grpc_next_upstream).\nThe `0` value turns off this limitation.\n\nSyntax:\n`grpc_pass address;`\n\nDefault:\n\n\n—\n\n\nContext:\n`location`, `if in location`\n\nSets the gRPC server address.\nThe address can be specified as a domain name or IP address,\nand a port:\n\n\u003E ```\n\u003E grpc_pass localhost:9000;\n\u003E\n\u003E ```\n\n\nor as a UNIX-domain socket path:\n\n\u003E ```\n\u003E grpc_pass unix:/tmp/grpc.socket;\n\u003E\n\u003E ```\n\n\nAlternatively, the “ `grpc://`” scheme can be used:\n\n\u003E ```\n\u003E grpc_pass grpc://127.0.0.1:9000;\n\u003E\n\u003E ```\n\n\nTo use gRPC over SSL, the “ `grpcs://`” scheme should be used:\n\n\u003E ```\n\u003E grpc_pass grpcs://127.0.0.1:443;\n\u003E\n\u003E ```\n\nIf a domain name resolves to several addresses, all of them will be\nused in a round-robin fashion.\nIn addition, an address can be specified as a\n[server group](ngx_http_upstream_module.html).\n\nParameter value can contain variables (1.17.8).\nIn this case, if an address is specified as a domain name,\nthe name is searched among the described\n[server groups](ngx_http_upstream_module.html),\nand, if not found, is determined using a\n[resolver](ngx_http_core_module.html#resolver).\n\nSyntax:\n`grpc_pass_header field;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nPermits passing [otherwise disabled](#grpc_hide_header) header\nfields from a gRPC server to a client.\n\nSyntax:\n`grpc_read_timeout time;`\n\nDefault:\n\n\n```\ngrpc_read_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nDefines a timeout for reading a response from the gRPC server.\nThe timeout is set only between two successive read operations,\nnot for the transmission of the whole 
response.\nIf the gRPC server does not transmit anything within this time,\nthe connection is closed.\n\nSyntax:\n`grpc_send_timeout time;`\n\nDefault:\n\n\n```\ngrpc_send_timeout 60s;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets a timeout for transmitting a request to the gRPC server.\nThe timeout is set only between two successive write operations,\nnot for the transmission of the whole request.\nIf the gRPC server does not receive anything within this time,\nthe connection is closed.\n\nSyntax:\n`grpc_set_header field value;`\n\nDefault:\n\n\n```\ngrpc_set_header Content-Length $content_length;\n```\n\nContext:\n`http`, `server`, `location`\n\nAllows redefining or appending fields to the request header\n[passed](#grpc_pass_request_headers) to the gRPC server.\nThe `value` can contain text, variables, and their combinations.\nThese directives are inherited from the previous configuration level\nif and only if there are no `grpc_set_header` directives\ndefined on the current level.\n\nIf the value of a header field is an empty string then this\nfield will not be passed to a gRPC server:\n\n\u003E ```\n\u003E grpc_set_header Accept-Encoding \"\";\n\u003E\n\u003E ```\n\nSyntax:\n`grpc_socket_keepalive on | off;`\n\nDefault:\n\n\n```\ngrpc_socket_keepalive off;\n```\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in version 1.15.6.\n\n\nConfigures the “TCP keepalive” behavior\nfor outgoing connections to a gRPC server.\nBy default, the operating system’s settings are in effect for the socket.\nIf the directive is set to the value “ `on`”, the\n`SO_KEEPALIVE` socket option is turned on for the socket.\n\nSyntax:\n`grpc_ssl_certificate file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with the certificate in the PEM format\nused for authentication to a gRPC SSL server.\n\nSince version 1.21.0, variables can be used in the `file` name.\n\nSyntax:\n`grpc_ssl_certificate_key file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with the secret key in the PEM format\nused for authentication to a gRPC SSL server.\n\nThe value\n`engine`: `name`: `id`\ncan be specified instead of the `file`,\nwhich loads a secret key with a specified `id`\nfrom the OpenSSL engine `name`.\n\nSince version 1.21.0, variables can be used in the `file` name.\n\nSyntax:\n`grpc_ssl_ciphers ciphers;`\n\nDefault:\n\n\n```\ngrpc_ssl_ciphers DEFAULT;\n```\n\nContext:\n`http`, `server`, `location`\n\nSpecifies the enabled ciphers for requests to a gRPC SSL server.\nThe ciphers are specified in the format understood by the OpenSSL library.\n\nThe full list can be viewed using the\n“ `openssl ciphers`” command.\n\nSyntax:\n`grpc_ssl_conf_command name value;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nThis directive appeared in version 1.19.4.\n\n\nSets arbitrary OpenSSL configuration\n[commands](https://www.openssl.org/docs/man1.1.1/man3/SSL_CONF_cmd.html)\nwhen establishing a connection with the gRPC SSL server.\n\n\u003E The directive is supported when using OpenSSL 1.0.2 or higher.\n\nSeveral `grpc_ssl_conf_command` directives\ncan be specified on the same level.\nThese directives are inherited from the previous configuration level\nif and only if there are\nno `grpc_ssl_conf_command` directives\ndefined on the current level.\n\n\u003E Note that configuring OpenSSL directly\n\u003E might result in unexpected behavior.\n\nSyntax:\n`grpc_ssl_crl file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, 
`server`, `location`\n\nSpecifies a `file` with revoked certificates (CRL)\nin the PEM format used to [verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_name name;`\n\nDefault:\n\n\n```\ngrpc_ssl_name host from grpc_pass;\n```\n\nContext:\n`http`, `server`, `location`\n\nAllows overriding the server name used to\n[verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server and to be\n[passed through SNI](#grpc_ssl_server_name)\nwhen establishing a connection with the gRPC SSL server.\n\nBy default, the host part from [grpc\\_pass](#grpc_pass) is used.\n\nSyntax:\n`grpc_ssl_password_file file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with passphrases for\n[secret keys](#grpc_ssl_certificate_key)\nwhere each passphrase is specified on a separate line.\nPassphrases are tried in turn when loading the key.\n\nSyntax:\n`grpc_ssl_protocols\n    [SSLv2]\n    [SSLv3]\n    [TLSv1]\n    [TLSv1.1]\n    [TLSv1.2]\n    [TLSv1.3];`\n\nDefault:\n\n\n```\ngrpc_ssl_protocols TLSv1 TLSv1.1 TLSv1.2 TLSv1.3;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables the specified protocols for requests to a gRPC SSL server.\n\n\u003E The `TLSv1.3` parameter is used by default\n\u003E since 1.23.4.\n\nSyntax:\n`grpc_ssl_server_name on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_server_name off;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables or disables passing of the server name through\n[TLS\\\nServer Name Indication extension](http://en.wikipedia.org/wiki/Server_Name_Indication) (SNI, RFC 6066)\nwhen establishing a connection with the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_session_reuse on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_session_reuse on;\n```\n\nContext:\n`http`, `server`, `location`\n\nDetermines whether SSL sessions can be reused when working with\nthe gRPC server.\nIf the errors\n“ `SSL3_GET_FINISHED:digest check failed`”\nappear in the logs, try disabling session reuse.\n\nSyntax:\n`grpc_ssl_trusted_certificate file;`\n\nDefault:\n\n\n—\n\n\nContext:\n`http`, `server`, `location`\n\nSpecifies a `file` with trusted CA certificates in the PEM format\nused to [verify](#grpc_ssl_verify)\nthe certificate of the gRPC SSL server.\n\nSyntax:\n`grpc_ssl_verify on | off;`\n\nDefault:\n\n\n```\ngrpc_ssl_verify off;\n```\n\nContext:\n`http`, `server`, `location`\n\nEnables or disables verification of the gRPC SSL server certificate.\n\nSyntax:\n`grpc_ssl_verify_depth number;`\n\nDefault:\n\n\n```\ngrpc_ssl_verify_depth 1;\n```\n\nContext:\n`http`, `server`, `location`\n\nSets the verification depth in the gRPC SSL server certificates chain.",
    "tool": "\"website-cleaner\""
  },
  "status": {
    "state": "error",
    "output": "",
    "endTime": "2024-11-18T20:20:09Z",
    "error": "run encountered an error: failed to read events: context canceled with error output: "
  }
}
thedadams commented 1 week ago

Thanks for retesting. When this error occurs, knowledge needs to retry. Thorsten and I talked to Darren about this.

iwilltry42 commented 1 week ago

For clarity: this has to be done either in the website-cleaner tool (https://github.com/otto8-ai/tools/tree/main/website-cleaner) or in the calling code at https://github.com/otto8-ai/otto8/blob/main/pkg/controller/handlers/knowledgefile/knowledgefile.go#L191

I think I'm going to introduce two steps here:

  1. Pre-process the HTML in the crawler/otto8/website-cleaner tool by stripping out obvious header/footer/navigation HTML tags, to minify the payload sent to the model (a smaller payload also increases the likelihood that the model can process it in the first place).
  2. Add retry logic in otto8 (to avoid overcomplicating the website-cleaner tool); a rough sketch of what that retry could look like is shown below.
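
A minimal sketch of the retry idea in step 2, purely illustrative and not taken from the otto8 codebase: `cleanWithRetry`, `cleanFn`, and the attempt/backoff values are all assumptions; the real integration point would be the website-cleaner call site in knowledgefile.go linked above.

```go
// Illustrative-only sketch: cleanFn stands in for whatever function invokes
// the website-cleaner tool; none of these names come from the actual otto8 code.
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// cleanWithRetry retries the website-cleaner call a few times with simple
// exponential backoff, so transient model errors ("unexpected EOF",
// provider 5xx responses) do not fail the whole ingestion run.
func cleanWithRetry(ctx context.Context, cleanFn func(context.Context, string) (string, error), content string) (string, error) {
	const maxAttempts = 3
	backoff := time.Second

	var lastErr error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		out, err := cleanFn(ctx, content)
		if err == nil {
			return out, nil
		}
		lastErr = err

		// Wait before the next attempt, but give up early if the run is canceled.
		select {
		case <-ctx.Done():
			return "", ctx.Err()
		case <-time.After(backoff):
			backoff *= 2
		}
	}
	return "", fmt.Errorf("failed to clean website content after %d attempts: %w", maxAttempts, lastErr)
}

func main() {
	// Fake cleaner that fails once and then succeeds, just to exercise the retry.
	calls := 0
	fake := func(ctx context.Context, in string) (string, error) {
		calls++
		if calls == 1 {
			return "", errors.New("failed calling model for completion: unexpected EOF")
		}
		return "cleaned: " + in, nil
	}

	out, err := cleanWithRetry(context.Background(), fake, "raw markdown")
	fmt.Println(out, err)
}
```

The number of attempts and the backoff schedule are placeholders; whatever the real implementation uses, the key point is that a transient completion-API failure should surface as a retried ingestion step rather than a permanently errored knowledge file.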
sangee2004 commented 3 hours ago

Encountered the following ingestion error when testing with the latest version - "github.com/otto8-ai/tools": "245e7fdc11e6fcaf69200953bb7bdc8af0e40fdd", "otto": "v0.0.0-dev-1bfd667c-dirty"

failed to clean website content: failed to run: failed calling model for completion: error, The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error.

Debug logs relating to this ingestion error:

{
  "frames": {
    "1732750508": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "",
      "end": "0001-01-01T00:00:00Z",
      "id": "1732750508",
      "input": "[![Acorn Labs](https://www.acorn.io/wp-content/uploads/2024/10/acorn_logo_h_5b9f9fcaf6.svg)](https://www.acorn.io/)\n\n![](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2020%2014'%3E%3C/svg%3E)\n\nMenu Close\n\n- [Resources](https://www.acorn.io/resources/)\n  \n  - [Resources](https://www.acorn.io/resources/)\n    \n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Tutorials](/resources/tutorials)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Blog](/resources/blog)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Tools](http://tools.gptscript.ai/)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      GitHub](https://github.com/gptscript-ai/gptscript)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Events](/events)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Docs](https://docs.gptscript.ai/)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Discord](https://discord.com/invite/9sSf4UyAMC)\n  - [Learning Center](/resources/learning-center)\n    \n    - [Models](https://www.acorn.io/resources/learning-center/category/models/)\n      \n      - [OpenAI GPT-4](https://www.acorn.io/resources/learning-center/openai/)\n      - [Anthropic Claude](https://www.acorn.io/resources/learning-center/anthropic-claude/)\n      - [Cohere AI](https://www.acorn.io/resources/learning-center/cohere-ai/)\n      - [Google Gemini](https://www.acorn.io/resources/learning-center/google-gemini/)\n      - [Meta LLaMa](https://www.acorn.io/resources/learning-center/meta-llama/)\n      - [Mistral](https://www.acorn.io/resources/learning-center/mistral-ai/)\n      - [Mistral 7B](https://www.acorn.io/resources/learning-center/mistral-7b/)\n    - [Tools and Topics](https://www.acorn.io/resources/learning-center/category/tools-and-topics/)\n      \n      - [Fine-Tuning LLMs](https://www.acorn.io/resources/learning-center/fine-tuning-llm/)\n      - [Generative AI](https://www.acorn.io/resources/learning-center/generative-ai-applications/)\n      - [AI Agents](https://www.acorn.io/resources/learning-center/ai-agents/)\n      - [Claude API](https://www.acorn.io/resources/learning-center/claude-api/)\n      - [Gemini API](https://www.acorn.io/resources/learning-center/google-gemini-api/)\n      - [LLM Application Development](https://www.acorn.io/resources/learning-center/llm-application-development/)\n      - [LLM Security](https://www.acorn.io/resources/learning-center/llm-security/)\n      - [Prompt Engineering](https://www.acorn.io/resources/learning-center/prompt-engineering/)\n    - [Use Cases](https://www.acorn.io/resources/learning-center/category/use-cases/)\n      \n      - [Retrieval Augmented Generation (RAG)](https://www.acorn.io/resources/learning-center/retrieval-augmented-generation/)\n      - [AI Copilots](https://www.acorn.io/resources/learning-center/ai-copilots/)\n      - [AI Image](https://www.acorn.io/resources/learning-center/ai-image-generation/)\n      - [AI Video 
Generators](https://www.acorn.io/resources/learning-center/ai-video-generators/)\n      - [AI Summarization](https://www.acorn.io/resources/learning-center/ai-summarization/)\n      - [Code Interpreter](https://www.acorn.io/resources/learning-center/code-interpreter/)\n    \n    [Explore All Articles](/resources/learning-center)\n- [Docs](https://docs.gptscript.ai/)\n- [Tools](http://tools.gptscript.ai/)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Company](https://www.acorn.io/about-us/)\n\n[Try GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\n[Request a Demo](/contact)\n\n[Learning Center](/resources/learning-center)\n\n# Fine-Tuning LLMs: Top 6 Methods, Challenges and Best Practices\n\n###### June 6, 2024 by acorn labs\n\n## What Does It Mean to Fine-Tune LLMs?\n\nFine-tuning Large Language Models (LLMs) involves adjusting pre-trained models on specific datasets to enhance performance for particular tasks. This process begins after general training ends. Users provide the model with a more focused dataset, which may include industry-specific terminology or task-focused interactions, with the objective of helping the model generate more relevant responses for a specific use case.\n\nFine-tuning allows the model to adapt its pre-existing weights and biases to fit specific problems better. This results in improved accuracy and relevance in outputs, making LLMs more effective in practical, specialized applications than their broadly trained counterparts. While fine-tuning can be highly computationally intensive, new techniques like Parameter-Efficient Fine-Tuning (PEFT) are making it much more efficient and possible to run even on consumer hardware.\n\nFine-tuning can be performed both on open source LLMs, such as Meta LLaMA and Mistral models, and on some commercial LLMs, if this capability is offered by the model’s developer. *(Learn more in our detailed guide to [fine tuning Llama 2](https://www.acorn.io/resources/learning-center/fine-tuning-llama-2))*. For example, OpenAI allows fine tuning for GPT-3.5 and GPT-4. This is part of an extensive series of guides about [machine learning](https://www.aporia.com/learn/machine-learning-model/machine-learning-models-use-cases-operations/).\n\n## Fine-Tuning vs. Embeddings vs. Prompt Engineering\n\n**Fine-tuning** is a method where a pre-trained model is further trained (or fine tuned) on a new dataset specific to a particular task. This technique involves adjusting the weights across all layers of the model, based on the new data. It allows the model to specifically cater to nuanced tasks and often results in higher performance for specialized applications.\n\n**Embeddings** refer to dense vector representations of words or phrases, which are typically obtained during the initial training of a model. Instead of adjusting the entire model, embeddings can be extracted and used as static input features for various downstream tasks. This approach does not modify the pre-trained model but leverages the learned representations. It’s generally faster and less resource-intensive than fine-tuning.\n\n**Prompt engineering** is another way to adjust LLMs to specific tasks. Adding more context, examples, or even entire documents and rich media, to LLM prompts can cause models to provide much more nuanced and relevant responses to specific tasks. 
Prompt engineering is considered more limited than fine-tuning, but is also much less technically complex and is not computationally intensive.\n\n***Learn more in our detailed guide to LLM fine tuning vs embedding (coming soon)***\n\n## When Does Your Business Need a Fine-Tuned Model?\n\nHere are a few primary use cases for fine-tuned LLMs:\n\n### Specificity and Relevance\n\nA fine-tuned model excels in providing highly specific and relevant outputs tailored to your business’s unique needs. Unlike general models, which offer broad responses, fine-tuning adapts the model to understand industry-specific terminology and nuances. This can be particularly beneficial for specialized industries like legal, medical, or technical fields where precise language and contextual understanding are crucial.\n\n### Improved Accuracy\n\nFine-tuning significantly enhances the accuracy of a language model by allowing it to adapt to the specific patterns and requirements of your business data. When a model is fine-tuned, it learns from a curated dataset that mirrors the particular tasks and language your business encounters. This focused learning process refines the model’s ability to generate precise and contextually appropriate responses, reducing errors and increasing the reliability of the outputs.\n\n### Data Privacy and Security\n\nIn many industries, maintaining data privacy and security is paramount. By fine-tuning a language model on proprietary or sensitive data, businesses can ensure that their unique datasets are not exposed to third-party risks associated with general model training environments. Fine-tuning can be conducted on-premises or within secure environments, keeping data control in-house.\n\n### Customized Interactions\n\nBusinesses that require highly personalized customer interactions can significantly benefit from fine-tuned models. These models can be trained to understand and respond to customer queries with a level of customization that aligns with the brand’s voice and customer service protocols. For instance, a fine-tuned model in a retail business can understand product-specific inquiries, offer personalized recommendations, understand company policies, and handle complex service issues more effectively than a general model.\n\n## Top 6 LLM Fine-Tuning Methods\n\nHere are some of the ways that large language models can be fine tuned.\n\n### 1. Instruction Fine-Tuning\n\nInstruction fine-tuning involves training a model using examples that demonstrate how it should respond to specific queries. For instance, to improve summarization skills, a dataset with instructions like \"summarize this text\" followed by the actual text is used.\n\nThis method helps the model learn to follow specific instructions and improve its performance in targeted tasks by understanding the expected outputs from given prompts. This approach is particularly useful for enhancing the model’s ability to handle various task-specific instructions effectively.\n\n### 2. Parameter-Efficient Fine-Tuning (PEFT)\n\nPEFT updates only a small subset of the model’s parameters during training, significantly reducing the memory and computational requirements compared to full fine-tuning. Techniques like LoRA (Low-Rank Adaptation) and QLoRA (Quantized Low-Rank Adaptation) can reduce the number of trainable parameters by thousands of times.\n\nThis method helps manage hardware limitations and prevents the phenomenon of ‘catastrophic forgetting’, maintaining the model’s original knowledge while adapting to new tasks. 
By focusing on specific components, PEFT makes the fine-tuning process more efficient and cost-effective, especially for large models.\n\n### 3. Task-Specific Fine-Tuning\n\nTask-specific fine-tuning focuses on adjusting a pre-trained model to excel in a particular task or domain using a dedicated dataset. This method typically requires more data and time than transfer learning but achieves higher performance in specific tasks, such as translation or sentiment analysis.\n\nDespite its effectiveness, it can lead to catastrophic forgetting, where the model loses proficiency in tasks it was previously trained on. However, by tailoring the model to specific requirements, task-specific fine-tuning ensures high accuracy and relevance for specialized applications.\n\n### 4. Transfer Learning\n\nTransfer learning leverages a model trained on a broad, general-purpose dataset and adapts it to specific tasks using task-specific data. This method is useful when data or resources are limited, as it builds on the knowledge already embedded in the pre-trained model, offering improved learning rates and accuracy with less training time compared to training from scratch.\n\nTransfer learning enables the efficient reuse of models like GPT or BERT for new applications, providing a strong foundation for further customization.\n\n***Learn more in our detailed guide to fine tuning vs transfer learning (coming soon)***\n\n### 5. Multi-Task Learning\n\nMulti-task learning trains a model on a dataset containing examples for multiple tasks, such as summarization, code translation, and entity recognition. This approach helps the model improve performance across different tasks simultaneously and avoids catastrophic forgetting.\n\nHowever, this method requires a large amount of diverse data, which can be challenging to assemble. The comprehensive training enables the model to handle various tasks proficiently, making it suitable for environments where versatile performance is necessary.\n\n### 6. Sequential Fine-Tuning\n\nSequential fine-tuning adapts a model to a series of related tasks in stages. For example, a general language model might first be fine-tuned for medical language and subsequently for pediatric cardiology. This method ensures the model retains its performance across various specialized domains, building on each successive fine-tuning step to refine its capabilities further.\n\nBy sequentially adapting to increasingly specific datasets, the model can achieve high proficiency in niche areas while maintaining a broad understanding of the general domain.\n\n## What Is Retrieval Augmented Generation (RAG)?\n\nRetrieval Augmented Generation (RAG) is a technique that combines natural language generation with information retrieval to enhance a model’s outputs with up-to-date and contextually relevant information. RAG integrates external knowledge sources, ensuring that the language model provides accurate and current responses. This method is particularly useful for tasks requiring precise, timely information, as it allows continuous updates and easy management of the knowledge base, avoiding the rigidity of traditional fine-tuning methods.\n\nRAG systems can dynamically retrieve information during generation, making them highly adaptable to changing data and capable of delivering more relevant and informed outputs. This technique is beneficial for applications where the accuracy and freshness of information are critical, such as customer support, content creation, and research. 
By leveraging RAG, businesses can ensure their language models remain current and provide high-quality responses that are well-grounded in the latest information available.\n\nNotable examples of the use of RAG are the [AI Overviews](https://blog.google/products/search/generative-ai-google-search-may-2024/) feature in Google search, and [Microsoft Copilot in Bing](https://www.bing.com/chat?form=CONVRD), both of which extract data from a live index of the Internet and use it as an input for LLM responses.\n\n## How to Choose a Pre-Trained Model for Fine-Tuning\n\nHere’s an overview of the process of identifying an existing LLM for fine-tuning.\n\n### Define the Task\n\nBefore fine-tuning, clearly define the model’s intended task. Understanding the task’s requirements helps in selecting a model whose pre-trained capabilities align closely with the end objectives. For example, it may involve classification, regression, or generative tasks.\n\nAn accurate task definition also aids in determining the necessary data scope for model fine-tuning. This can prevent potential performance degradation due to underfitting or overfitting during the fine-tuning phase.\n\n### Understand the Model Architecture\n\nGet familiar with different model architectures to select the most suitable one for your task. Each architecture has strengths and limitations based on its design principles, layers, and the type of data it was initially trained on.\n\nUnderstanding these characteristics can significantly impact the success of fine-tuning, as certain architectures might be more compatible with the nature of your specific tasks.\n\n### Assess Strengths and Weaknesses\n\nEvaluate the strengths and weaknesses of the model options. Some models may excel at handling text-based tasks while others may be optimized for voice or image recognition tasks. Standardized benchmarks, which you can find on LLM leaderboards, can help compare models on parameters relevant to your project.\n\nAdditionally, consider the model’s performance trade-offs such as accuracy, processing speed, and memory usage, which can affect the practical deployment of the fine tuned model in real-world applications.\n\n### Match with Task Requirements\n\nEnsure the pre-trained model’s capabilities match the demands of the task. This involves comparing the model’s training data, learning capabilities, and output formats with what’s needed for your use case. A close match between the model’s training conditions and your task’s requirements can enhance the effectiveness of the re-training process.\n\n***Related content: Read our guide to fine tuning LLM tutorial (coming soon)***\n\n## Challenges and Limitations of LLM Fine-Tuning\n\nHere are some of the challenges involved in fine-tuning large language models.\n\n### Overfitting\n\nOverfitting occurs when a model is trained so closely to the nuances of a specific dataset that it performs exceptionally well on that data but poorly on any data it hasn’t seen before. This is particularly problematic in fine-tuning because the datasets used are generally smaller and more specialized than those used in initial broad training phases.\n\nSuch datasets can include rare or unique examples that do not represent a broader population, causing the model to learn these as common features. 
Overfitting results in a model that lacks the ability to generalize, which is critical for practical applications where the input data may vary significantly from the training data.\n\n### Catastrophic Forgetting\n\nCatastrophic forgetting refers to a situation where a neural network, after being fine-tuned with new data, loses the information it had learned during its initial training. This challenge is especially significant in the fine-tuning of LLMs because the new, task-specific training can override the weights and biases that were useful across more general contexts.\n\nFor example, a model trained initially on a broad range of topics might lose its ability to comprehend certain general concepts if it is intensely retrained on a niche subject like legal documents or technical manuals.\n\n### Bias Amplification\n\nBias amplification is when inherent biases in the pre-trained data are intensified. During fine-tuning, a model may not only reflect but also exacerbate biases present in the new training dataset.\n\nFor example, if a dataset for fine-tuning an LLM on job application reviews contains biases against certain demographic groups, the model might amplify this bias, leading to discriminatory behavior in automated screening processes. This underscores the need for careful selection of datasets to avoid reinforcing harmful stereotypes or unfair practices in model outputs.\n\n### Hyperparameter Tuning Complexity\n\nHyperparameters, such as learning rate, batch size, and the number of epochs during which the model is trained, have a major impact on the model’s performance. These parameters need to be carefully adjusted to strike a balance between learning efficiently and avoiding overfitting. The optimal settings for hyperparameters vary between different tasks and datasets.\n\nThe process of identifying the right hyperparameter settings is time-consuming and computationally expensive, requiring extensive use of resources to run numerous training cycles. However, standardized methods, frameworks, and tools for LLM tuning are emerging, which aim to make this process easier.\n\n## LLM Fine-Tuning Best Practices\n\nHere are some of the measures you can take to ensure an effective LLM fine-tuning process.\n\n### Start with a Small Model\n\nBeginning with a smaller model can simplify the fine-tuning process. Smaller models require less computational power and memory, allowing for faster experimentation and iteration. This approach is particularly beneficial when resources are limited. Once the process is optimized on a smaller scale, the insights gained can be applied to fine-tune larger models.\n\n### Experiment with Different Data Formats\n\nExperimenting with various data formats can significantly enhance the effectiveness of fine-tuning. By including diverse input types—such as structured data, unstructured text, images, or even tabular data—models can learn to handle a broader range of real-world scenarios. This helps build versatility in the model’s responses, ensuring it performs well across different contexts and input variations.\n\n### Start with Subsets of Data\n\nStarting with fine-tuning on smaller subsets of the dataset allows for quicker iterations and helps identify potential issues early in the training process. 
By gradually scaling up to the full dataset, you can fine-tune hyperparameters and make necessary adjustments without expending excessive resources.\n\n### Ensure the Dataset Is High-Quality\n\nThe dataset should be representative of the specific task and domain to ensure the model learns the relevant patterns and nuances. High-quality data minimizes noise and errors, allowing the model to generate more accurate and reliable outputs. Investing time in curating and cleaning the dataset ensures improved model performance and generalization capabilities.\n\n### Use Hyperparameters to Optimize Performance\n\nHyperparameter tuning is vital for optimizing the performance of fine-tuned models. Key parameters like learning rate, batch size, and the number of epochs must be adjusted to balance learning efficiency and overfitting prevention. Systematic experimentation with different hyperparameter values can reveal the optimal settings, leading to improvements in model accuracy and reliability.\n\n## Building LLM Applications with Acorn\n\nVisit [https://gptscript.ai](https://gptscript.ai) to download GPTScript and start building today. As we expand on the capabilities with GPTScript, we are also expanding our list of tools. With these tools, you can create any application imaginable: check out [tools.gptscript.ai](https://tools.gptscript.ai/) to get started.\n\n## See Additional Guides on Key Machine Learning Topics\n\nTogether with our content partners, we have authored in-depth guides on several other topics that can also be useful as you explore the world of [machine learning](https://www.aporia.com/learn/machine-learning-model/machine-learning-models-use-cases-operations/).\n\n### [Advanced Threat Protection](https://www.cynet.com/advanced-threat-protection/advanced-threat-protection-a-real-time-threat-killer-machine/)\n\n*Authored by Cynet*\n\n- [Advanced Threat Protection: A Real-Time Threat Killer Machine](https://www.cynet.com/advanced-threat-protection/advanced-threat-protection-a-real-time-threat-killer-machine/)\n- [Advanced Threat Detection: Catch & Eliminate Sneak Attacks](https://www.cynet.com/advanced-threat-protection/advanced-threat-detection-stopping-advanced-attacks-in-their-tracks/)\n- [What is Network Analytics? 
From Detection to Active Prevention](https://www.cynet.com/advanced-threat-protection/network-analytics-from-detection-to-active-prevention/)\n\n### [Multi GPU](https://www.run.ai/guides/multi-gpu)\n\n*Authored by Run.AI*\n\n- [Multi GPU: An In-Depth Look](https://www.run.ai/guides/multi-gpu)\n- [Keras Multi GPU: A Practical Guide](https://www.run.ai/guides/multi-gpu/keras-multi-gpu-a-practical-guide)\n- [How to Build Your GPU Cluster: Process and Hardware Options](https://www.run.ai/guides/multi-gpu/gpu-clusters)\n\n### [Best LLM](https://www.acorn.io/resources/learning-center/best-llm)\n\n*Authored by Acorn*\n\n- [Best LLM: Benchmarks, Leaderboards, & the World’s 8 Smartest LLMs](https://www.acorn.io/resources/learning-center/best-llm)\n- [Leaderboard of LLM Leaderboards: Top 7 LLM Listings & Their Criteria](https://www.acorn.io/resources/learning-center/llm-leaderboards)\n- [Open LLM Leaderboard: Benchmarks, Model Types & Filters Explained](https://www.acorn.io/resources/learning-center/open-llm-leaderboard#other-filter-options-in-the-open-llm-leaderboard)\n\n## Related Articles\n\n- [AI Copilots: Enterprise Use Cases and Key Considerations](https://www.acorn.io/resources/learning-center/ai-copilots/)\n- [Parameter-Efficient Fine-Tuning (PEFT): The Basics and a Quick Tutorial](https://www.acorn.io/resources/learning-center/parameter-efficient-fine-tuning-peft/)\n- [Aisera: Overview of Platform, Solutions, Pros and Cons](https://www.acorn.io/resources/learning-center/aisera/)\n- [Fine-Tuning Llama 2 with Hugging Face PEFT Library](https://www.acorn.io/resources/learning-center/fine-tuning-llama-2/)\n- [Prompt Engineering in ChatGPT: 9 Proven Techniques](https://www.acorn.io/resources/learning-center/prompt-engineering-in-chatgpt/)\n- [LLM Application Development: Tutorial & 7 Steps to Production Apps](https://www.acorn.io/resources/learning-center/llm-application-development/)\n- [Open LLM Leaderboard: Benchmarks, Model Types & Filters Explained](https://www.acorn.io/resources/learning-center/open-llm-leaderboard/)\n- [Best LLM: Benchmarks, Leaderboards, & the World’s 8 Smartest LLMs](https://www.acorn.io/resources/learning-center/best-llm/)\n\nMenu\n\nMenu Close\n\n- [About Us](https://www.acorn.io/about-us/)\n- [Contact Us](https://www.acorn.io/contact/)\n- [Tutorials](/resources/tutorials)\n- [Blog](/resources/blog)\n- [Events](/events)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [Open Source](https://www.acorn.io/resources/blog/open-source/)\n\nTo unsubscribe at any time please see our [Privacy Policy](https://www.acorn.io/privacy-policy).\n\n#### Get Started with GPTScript\n\n[Install GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\nCopyright © 2024. All rights reserved. Acorn Labs, Inc.\n\n[Terms of Service](https://www.acorn.io/terms-of-use)\n\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Twitter](https://x.com/acornlabs)\n- [YouTube](https://www.youtube.com/c/AcornLabs)\n- [LinkedIn](https://www.linkedin.com/company/acorn-io/)",
      "inputContext": null,
      "llmRequest": {
        "chatCompletion": {
          "messages": [
            {
              "content": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
              "role": "system"
            },
            {
              "content": "[![Acorn Labs](https://www.acorn.io/wp-content/uploads/2024/10/acorn_logo_h_5b9f9fcaf6.svg)](https://www.acorn.io/)\n\n![](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2020%2014'%3E%3C/svg%3E)\n\nMenu Close\n\n- [Resources](https://www.acorn.io/resources/)\n  \n  - [Resources](https://www.acorn.io/resources/)\n    \n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Tutorials](/resources/tutorials)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Blog](/resources/blog)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Tools](http://tools.gptscript.ai/)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      GitHub](https://github.com/gptscript-ai/gptscript)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Events](/events)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Docs](https://docs.gptscript.ai/)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Discord](https://discord.com/invite/9sSf4UyAMC)\n  - [Learning Center](/resources/learning-center)\n    \n    - [Models](https://www.acorn.io/resources/learning-center/category/models/)\n      \n      - [OpenAI GPT-4](https://www.acorn.io/resources/learning-center/openai/)\n      - [Anthropic Claude](https://www.acorn.io/resources/learning-center/anthropic-claude/)\n      - [Cohere AI](https://www.acorn.io/resources/learning-center/cohere-ai/)\n      - [Google Gemini](https://www.acorn.io/resources/learning-center/google-gemini/)\n      - [Meta LLaMa](https://www.acorn.io/resources/learning-center/meta-llama/)\n      - [Mistral](https://www.acorn.io/resources/learning-center/mistral-ai/)\n      - [Mistral 7B](https://www.acorn.io/resources/learning-center/mistral-7b/)\n    - [Tools and Topics](https://www.acorn.io/resources/learning-center/category/tools-and-topics/)\n      \n      - [Fine-Tuning LLMs](https://www.acorn.io/resources/learning-center/fine-tuning-llm/)\n      - [Generative AI](https://www.acorn.io/resources/learning-center/generative-ai-applications/)\n      - [AI Agents](https://www.acorn.io/resources/learning-center/ai-agents/)\n      - [Claude API](https://www.acorn.io/resources/learning-center/claude-api/)\n      - [Gemini API](https://www.acorn.io/resources/learning-center/google-gemini-api/)\n      - [LLM Application Development](https://www.acorn.io/resources/learning-center/llm-application-development/)\n      - [LLM Security](https://www.acorn.io/resources/learning-center/llm-security/)\n      - [Prompt Engineering](https://www.acorn.io/resources/learning-center/prompt-engineering/)\n    - [Use Cases](https://www.acorn.io/resources/learning-center/category/use-cases/)\n      \n      - [Retrieval Augmented Generation (RAG)](https://www.acorn.io/resources/learning-center/retrieval-augmented-generation/)\n      - [AI Copilots](https://www.acorn.io/resources/learning-center/ai-copilots/)\n      - [AI Image](https://www.acorn.io/resources/learning-center/ai-image-generation/)\n      - [AI 
Video Generators](https://www.acorn.io/resources/learning-center/ai-video-generators/)\n      - [AI Summarization](https://www.acorn.io/resources/learning-center/ai-summarization/)\n      - [Code Interpreter](https://www.acorn.io/resources/learning-center/code-interpreter/)\n    \n    [Explore All Articles](/resources/learning-center)\n- [Docs](https://docs.gptscript.ai/)\n- [Tools](http://tools.gptscript.ai/)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Company](https://www.acorn.io/about-us/)\n\n[Try GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\n[Request a Demo](/contact)\n\n[Learning Center](/resources/learning-center)\n\n# Fine-Tuning LLMs: Top 6 Methods, Challenges and Best Practices\n\n###### June 6, 2024 by acorn labs\n\n## What Does It Mean to Fine-Tune LLMs?\n\nFine-tuning Large Language Models (LLMs) involves adjusting pre-trained models on specific datasets to enhance performance for particular tasks. This process begins after general training ends. Users provide the model with a more focused dataset, which may include industry-specific terminology or task-focused interactions, with the objective of helping the model generate more relevant responses for a specific use case.\n\nFine-tuning allows the model to adapt its pre-existing weights and biases to fit specific problems better. This results in improved accuracy and relevance in outputs, making LLMs more effective in practical, specialized applications than their broadly trained counterparts. While fine-tuning can be highly computationally intensive, new techniques like Parameter-Efficient Fine-Tuning (PEFT) are making it much more efficient and possible to run even on consumer hardware.\n\nFine-tuning can be performed both on open source LLMs, such as Meta LLaMA and Mistral models, and on some commercial LLMs, if this capability is offered by the model’s developer. *(Learn more in our detailed guide to [fine tuning Llama 2](https://www.acorn.io/resources/learning-center/fine-tuning-llama-2))*. For example, OpenAI allows fine tuning for GPT-3.5 and GPT-4. This is part of an extensive series of guides about [machine learning](https://www.aporia.com/learn/machine-learning-model/machine-learning-models-use-cases-operations/).\n\n## Fine-Tuning vs. Embeddings vs. Prompt Engineering\n\n**Fine-tuning** is a method where a pre-trained model is further trained (or fine tuned) on a new dataset specific to a particular task. This technique involves adjusting the weights across all layers of the model, based on the new data. It allows the model to specifically cater to nuanced tasks and often results in higher performance for specialized applications.\n\n**Embeddings** refer to dense vector representations of words or phrases, which are typically obtained during the initial training of a model. Instead of adjusting the entire model, embeddings can be extracted and used as static input features for various downstream tasks. This approach does not modify the pre-trained model but leverages the learned representations. It’s generally faster and less resource-intensive than fine-tuning.\n\n**Prompt engineering** is another way to adjust LLMs to specific tasks. Adding more context, examples, or even entire documents and rich media, to LLM prompts can cause models to provide much more nuanced and relevant responses to specific tasks. 
Prompt engineering is considered more limited than fine-tuning, but is also much less technically complex and is not computationally intensive.\n\n***Learn more in our detailed guide to LLM fine tuning vs embedding (coming soon)***\n\n## When Does Your Business Need a Fine-Tuned Model?\n\nHere are a few primary use cases for fine-tuned LLMs:\n\n### Specificity and Relevance\n\nA fine-tuned model excels in providing highly specific and relevant outputs tailored to your business’s unique needs. Unlike general models, which offer broad responses, fine-tuning adapts the model to understand industry-specific terminology and nuances. This can be particularly beneficial for specialized industries like legal, medical, or technical fields where precise language and contextual understanding are crucial.\n\n### Improved Accuracy\n\nFine-tuning significantly enhances the accuracy of a language model by allowing it to adapt to the specific patterns and requirements of your business data. When a model is fine-tuned, it learns from a curated dataset that mirrors the particular tasks and language your business encounters. This focused learning process refines the model’s ability to generate precise and contextually appropriate responses, reducing errors and increasing the reliability of the outputs.\n\n### Data Privacy and Security\n\nIn many industries, maintaining data privacy and security is paramount. By fine-tuning a language model on proprietary or sensitive data, businesses can ensure that their unique datasets are not exposed to third-party risks associated with general model training environments. Fine-tuning can be conducted on-premises or within secure environments, keeping data control in-house.\n\n### Customized Interactions\n\nBusinesses that require highly personalized customer interactions can significantly benefit from fine-tuned models. These models can be trained to understand and respond to customer queries with a level of customization that aligns with the brand’s voice and customer service protocols. For instance, a fine-tuned model in a retail business can understand product-specific inquiries, offer personalized recommendations, understand company policies, and handle complex service issues more effectively than a general model.\n\n## Top 6 LLM Fine-Tuning Methods\n\nHere are some of the ways that large language models can be fine tuned.\n\n### 1. Instruction Fine-Tuning\n\nInstruction fine-tuning involves training a model using examples that demonstrate how it should respond to specific queries. For instance, to improve summarization skills, a dataset with instructions like \"summarize this text\" followed by the actual text is used.\n\nThis method helps the model learn to follow specific instructions and improve its performance in targeted tasks by understanding the expected outputs from given prompts. This approach is particularly useful for enhancing the model’s ability to handle various task-specific instructions effectively.\n\n### 2. Parameter-Efficient Fine-Tuning (PEFT)\n\nPEFT updates only a small subset of the model’s parameters during training, significantly reducing the memory and computational requirements compared to full fine-tuning. Techniques like LoRA (Low-Rank Adaptation) and QLoRA (Quantized Low-Rank Adaptation) can reduce the number of trainable parameters by thousands of times.\n\nThis method helps manage hardware limitations and prevents the phenomenon of ‘catastrophic forgetting’, maintaining the model’s original knowledge while adapting to new tasks. 
By focusing on specific components, PEFT makes the fine-tuning process more efficient and cost-effective, especially for large models.\n\n### 3. Task-Specific Fine-Tuning\n\nTask-specific fine-tuning focuses on adjusting a pre-trained model to excel in a particular task or domain using a dedicated dataset. This method typically requires more data and time than transfer learning but achieves higher performance in specific tasks, such as translation or sentiment analysis.\n\nDespite its effectiveness, it can lead to catastrophic forgetting, where the model loses proficiency in tasks it was previously trained on. However, by tailoring the model to specific requirements, task-specific fine-tuning ensures high accuracy and relevance for specialized applications.\n\n### 4. Transfer Learning\n\nTransfer learning leverages a model trained on a broad, general-purpose dataset and adapts it to specific tasks using task-specific data. This method is useful when data or resources are limited, as it builds on the knowledge already embedded in the pre-trained model, offering improved learning rates and accuracy with less training time compared to training from scratch.\n\nTransfer learning enables the efficient reuse of models like GPT or BERT for new applications, providing a strong foundation for further customization.\n\n***Learn more in our detailed guide to fine tuning vs transfer learning (coming soon)***\n\n### 5. Multi-Task Learning\n\nMulti-task learning trains a model on a dataset containing examples for multiple tasks, such as summarization, code translation, and entity recognition. This approach helps the model improve performance across different tasks simultaneously and avoids catastrophic forgetting.\n\nHowever, this method requires a large amount of diverse data, which can be challenging to assemble. The comprehensive training enables the model to handle various tasks proficiently, making it suitable for environments where versatile performance is necessary.\n\n### 6. Sequential Fine-Tuning\n\nSequential fine-tuning adapts a model to a series of related tasks in stages. For example, a general language model might first be fine-tuned for medical language and subsequently for pediatric cardiology. This method ensures the model retains its performance across various specialized domains, building on each successive fine-tuning step to refine its capabilities further.\n\nBy sequentially adapting to increasingly specific datasets, the model can achieve high proficiency in niche areas while maintaining a broad understanding of the general domain.\n\n## What Is Retrieval Augmented Generation (RAG)?\n\nRetrieval Augmented Generation (RAG) is a technique that combines natural language generation with information retrieval to enhance a model’s outputs with up-to-date and contextually relevant information. RAG integrates external knowledge sources, ensuring that the language model provides accurate and current responses. This method is particularly useful for tasks requiring precise, timely information, as it allows continuous updates and easy management of the knowledge base, avoiding the rigidity of traditional fine-tuning methods.\n\nRAG systems can dynamically retrieve information during generation, making them highly adaptable to changing data and capable of delivering more relevant and informed outputs. This technique is beneficial for applications where the accuracy and freshness of information are critical, such as customer support, content creation, and research. 
By leveraging RAG, businesses can ensure their language models remain current and provide high-quality responses that are well-grounded in the latest information available.\n\nNotable examples of the use of RAG are the [AI Overviews](https://blog.google/products/search/generative-ai-google-search-may-2024/) feature in Google search, and [Microsoft Copilot in Bing](https://www.bing.com/chat?form=CONVRD), both of which extract data from a live index of the Internet and use it as an input for LLM responses.\n\n## How to Choose a Pre-Trained Model for Fine-Tuning\n\nHere’s an overview of the process of identifying an existing LLM for fine-tuning.\n\n### Define the Task\n\nBefore fine-tuning, clearly define the model’s intended task. Understanding the task’s requirements helps in selecting a model whose pre-trained capabilities align closely with the end objectives. For example, it may involve classification, regression, or generative tasks.\n\nAn accurate task definition also aids in determining the necessary data scope for model fine-tuning. This can prevent potential performance degradation due to underfitting or overfitting during the fine-tuning phase.\n\n### Understand the Model Architecture\n\nGet familiar with different model architectures to select the most suitable one for your task. Each architecture has strengths and limitations based on its design principles, layers, and the type of data it was initially trained on.\n\nUnderstanding these characteristics can significantly impact the success of fine-tuning, as certain architectures might be more compatible with the nature of your specific tasks.\n\n### Assess Strengths and Weaknesses\n\nEvaluate the strengths and weaknesses of the model options. Some models may excel at handling text-based tasks while others may be optimized for voice or image recognition tasks. Standardized benchmarks, which you can find on LLM leaderboards, can help compare models on parameters relevant to your project.\n\nAdditionally, consider the model’s performance trade-offs such as accuracy, processing speed, and memory usage, which can affect the practical deployment of the fine tuned model in real-world applications.\n\n### Match with Task Requirements\n\nEnsure the pre-trained model’s capabilities match the demands of the task. This involves comparing the model’s training data, learning capabilities, and output formats with what’s needed for your use case. A close match between the model’s training conditions and your task’s requirements can enhance the effectiveness of the re-training process.\n\n***Related content: Read our guide to fine tuning LLM tutorial (coming soon)***\n\n## Challenges and Limitations of LLM Fine-Tuning\n\nHere are some of the challenges involved in fine-tuning large language models.\n\n### Overfitting\n\nOverfitting occurs when a model is trained so closely to the nuances of a specific dataset that it performs exceptionally well on that data but poorly on any data it hasn’t seen before. This is particularly problematic in fine-tuning because the datasets used are generally smaller and more specialized than those used in initial broad training phases.\n\nSuch datasets can include rare or unique examples that do not represent a broader population, causing the model to learn these as common features. 
Overfitting results in a model that lacks the ability to generalize, which is critical for practical applications where the input data may vary significantly from the training data.\n\n### Catastrophic Forgetting\n\nCatastrophic forgetting refers to a situation where a neural network, after being fine-tuned with new data, loses the information it had learned during its initial training. This challenge is especially significant in the fine-tuning of LLMs because the new, task-specific training can override the weights and biases that were useful across more general contexts.\n\nFor example, a model trained initially on a broad range of topics might lose its ability to comprehend certain general concepts if it is intensely retrained on a niche subject like legal documents or technical manuals.\n\n### Bias Amplification\n\nBias amplification is when inherent biases in the pre-trained data are intensified. During fine-tuning, a model may not only reflect but also exacerbate biases present in the new training dataset.\n\nFor example, if a dataset for fine-tuning an LLM on job application reviews contains biases against certain demographic groups, the model might amplify this bias, leading to discriminatory behavior in automated screening processes. This underscores the need for careful selection of datasets to avoid reinforcing harmful stereotypes or unfair practices in model outputs.\n\n### Hyperparameter Tuning Complexity\n\nHyperparameters, such as learning rate, batch size, and the number of epochs during which the model is trained, have a major impact on the model’s performance. These parameters need to be carefully adjusted to strike a balance between learning efficiently and avoiding overfitting. The optimal settings for hyperparameters vary between different tasks and datasets.\n\nThe process of identifying the right hyperparameter settings is time-consuming and computationally expensive, requiring extensive use of resources to run numerous training cycles. However, standardized methods, frameworks, and tools for LLM tuning are emerging, which aim to make this process easier.\n\n## LLM Fine-Tuning Best Practices\n\nHere are some of the measures you can take to ensure an effective LLM fine-tuning process.\n\n### Start with a Small Model\n\nBeginning with a smaller model can simplify the fine-tuning process. Smaller models require less computational power and memory, allowing for faster experimentation and iteration. This approach is particularly beneficial when resources are limited. Once the process is optimized on a smaller scale, the insights gained can be applied to fine-tune larger models.\n\n### Experiment with Different Data Formats\n\nExperimenting with various data formats can significantly enhance the effectiveness of fine-tuning. By including diverse input types—such as structured data, unstructured text, images, or even tabulardata—models can learn to handle a broader range of real-world scenarios. This helps build versatility in the model’s responses, ensuring it performs well across different contexts and input variations.\n\n### Start with Subsets of Data\n\nStarting with fine-tuning on smaller subsets of the dataset allows for quicker iterations and helps identify potential issues early in the training process. 
By gradually scaling up to the full dataset, you can fine-tune hyperparameters and make necessary adjustments without expending excessive resources.\n\n### Ensure the Dataset Is High-Quality\n\nThe dataset should be representative of the specific task and domain to ensure the model learns the relevant patterns and nuances. High-quality data minimizes noise and errors, allowing the model to generate more accurate and reliable outputs. Investing time in curating and cleaning the dataset ensures improved model performance and generalization capabilities.\n\n### Use Hyperparameters to Optimize Performance\n\nHyperparameter tuning is vital for optimizing the performance of fine-tuned models. Key parameters like learning rate, batch size, and the number of epochs must be adjusted to balance learning efficiency and overfitting prevention. Systematic experimentation with different hyperparameter values can reveal the optimal settings, leading to improvements in model accuracy and reliability.\n\n## Building LLM Applications with Acorn\n\nVisit [https://gptscript.ai](https://gptscript.ai) to download GPTScript and start building today. As we expand on the capabilities with GPTScript, we are also expanding our list of tools. With these tools, you can create any application imaginable: check out [tools.gptscript.ai](https://tools.gptscript.ai/) to get started.\n\n## See Additional Guides on Key Machine Learning Topics\n\nTogether with our content partners, we have authored in-depth guides on several other topics that can also be useful as you explore the world of [machine learning](https://www.aporia.com/learn/machine-learning-model/machine-learning-models-use-cases-operations/).\n\n### [Advanced Threat Protection](https://www.cynet.com/advanced-threat-protection/advanced-threat-protection-a-real-time-threat-killer-machine/)\n\n*Authored by Cynet*\n\n- [Advanced Threat Protection: A Real-Time Threat Killer Machine](https://www.cynet.com/advanced-threat-protection/advanced-threat-protection-a-real-time-threat-killer-machine/)\n- [Advanced Threat Detection: Catch & Eliminate Sneak Attacks](https://www.cynet.com/advanced-threat-protection/advanced-threat-detection-stopping-advanced-attacks-in-their-tracks/)\n- [What is Network Analytics? 
From Detection to Active Prevention](https://www.cynet.com/advanced-threat-protection/network-analytics-from-detection-to-active-prevention/)\n\n### [Multi GPU](https://www.run.ai/guides/multi-gpu)\n\n*Authored by Run.AI*\n\n- [Multi GPU: An In-Depth Look](https://www.run.ai/guides/multi-gpu)\n- [Keras Multi GPU: A Practical Guide](https://www.run.ai/guides/multi-gpu/keras-multi-gpu-a-practical-guide)\n- [How to Build Your GPU Cluster: Process and Hardware Options](https://www.run.ai/guides/multi-gpu/gpu-clusters)\n\n### [Best LLM](https://www.acorn.io/resources/learning-center/best-llm)\n\n*Authored by Acorn*\n\n- [Best LLM: Benchmarks, Leaderboards, & the World’s 8 Smartest LLMs](https://www.acorn.io/resources/learning-center/best-llm)\n- [Leaderboard of LLM Leaderboards: Top 7 LLM Listings & Their Criteria](https://www.acorn.io/resources/learning-center/llm-leaderboards)\n- [Open LLM Leaderboard: Benchmarks, Model Types & Filters Explained](https://www.acorn.io/resources/learning-center/open-llm-leaderboard#other-filter-options-in-the-open-llm-leaderboard)\n\n## Related Articles\n\n- [AI Copilots: Enterprise Use Cases and Key Considerations](https://www.acorn.io/resources/learning-center/ai-copilots/)\n- [Parameter-Efficient Fine-Tuning (PEFT): The Basics and a Quick Tutorial](https://www.acorn.io/resources/learning-center/parameter-efficient-fine-tuning-peft/)\n- [Aisera: Overview of Platform, Solutions, Pros and Cons](https://www.acorn.io/resources/learning-center/aisera/)\n- [Fine-Tuning Llama 2 with Hugging Face PEFT Library](https://www.acorn.io/resources/learning-center/fine-tuning-llama-2/)\n- [Prompt Engineering in ChatGPT: 9 Proven Techniques](https://www.acorn.io/resources/learning-center/prompt-engineering-in-chatgpt/)\n- [LLM Application Development: Tutorial & 7 Steps to Production Apps](https://www.acorn.io/resources/learning-center/llm-application-development/)\n- [Open LLM Leaderboard: Benchmarks, Model Types & Filters Explained](https://www.acorn.io/resources/learning-center/open-llm-leaderboard/)\n- [Best LLM: Benchmarks, Leaderboards, & the World’s 8 Smartest LLMs](https://www.acorn.io/resources/learning-center/best-llm/)\n\nMenu\n\nMenu Close\n\n- [About Us](https://www.acorn.io/about-us/)\n- [Contact Us](https://www.acorn.io/contact/)\n- [Tutorials](/resources/tutorials)\n- [Blog](/resources/blog)\n- [Events](/events)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [Open Source](https://www.acorn.io/resources/blog/open-source/)\n\nTo unsubscribe at any time please see our [Privacy Policy](https://www.acorn.io/privacy-policy).\n\n#### Get Started with GPTScript\n\n[Install GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\nCopyright © 2024. All rights reserved. Acorn Labs, Inc.\n\n[Terms of Service](https://www.acorn.io/terms-of-use)\n\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Twitter](https://x.com/acornlabs)\n- [YouTube](https://www.youtube.com/c/AcornLabs)\n- [LinkedIn](https://www.linkedin.com/company/acorn-io/)",
              "role": "user"
            }
          ],
          "model": "llm-mini",
          "temperature": 0
        },
        "toolMapping": {

        }
      },
      "llmResponse": null,
      "output": [
        {
          "content": "# Fine-Tuning LLMs: Top 6 Methods, Challenges and Best Practices\n\n###### June 6, 2024 by acorn labs\n\n## What Does It Mean to Fine-Tune LLMs?\n\nFine-tuning Large Language Models (LLMs) involves adjusting pre-trained models on specific datasets to enhance performance for particular tasks. This process begins after general training ends. Users provide the model with a more focused dataset, which may include industry-specific terminology or task-focused interactions, with the objective of helping the model generate more relevant responses for a specific use case.\n\nFine-tuning allows the model to adapt its pre-existing weights and biases to fit specific problems better. This results in improved accuracy and relevance in outputs, making LLMs more effective in practical, specialized applications than their broadly trained counterparts. While fine-tuning can be highly computationally intensive, new techniques like Parameter-Efficient Fine-Tuning (PEFT) are making it much more efficient and possible to run even on consumer hardware.\n\nFine-tuning can be performed both on open source LLMs, such as Meta LLaMA and Mistral models, and on some commercial LLMs, if this capability is offered by the model’s developer. *(Learn more in our detailed guide to [fine tuning Llama 2](https://www.acorn.io/resources/learning-center/fine-tuning-llama-2))*. For example, OpenAI allows fine tuning for GPT-3.5 and GPT-4. This is part of an extensive series of guides about [machine learning](https://www.aporia.com/learn/machine-learning-model/machine-learning-models-use-cases-operations/).\n\n## Fine-Tuning vs. Embeddings vs. Prompt Engineering\n\n**Fine-tuning** is a method where a pre-trained model is further trained (or fine-tuned) on a new dataset specific to a particular task. This technique involves adjusting the weights across all layers of the model, based on the new data. It allows the model to specifically cater to nuanced tasks and often results in higher performance for specialized applications.\n\n**Embeddings** refer to dense vector representations of words or phrases, which are typically obtained during the initial training of a model. Instead of adjusting the entire model, embeddings can be extracted and used as static input features for various downstream tasks. This approach does not modify the pre-trained model but leverages the learned representations. It’s generally faster and less resource-intensive than fine-tuning.\n\n**Prompt engineering** is another way to adjust LLMs to specific tasks. Adding more context, examples, or even entire documents and rich media, to LLM prompts can cause models to provide much more nuanced and relevant responses to specific tasks. Prompt engineering is considered more limited than fine-tuning, but is also much less technically complex and is not computationally intensive.\n\n***Learn more in our detailed guide to LLM fine tuning vs embedding (coming soon)***\n\n## When Does Your Business Need a Fine-Tuned Model?\n\nHere are a few primary use cases for fine-tuned LLMs:\n\n### Specificity and Relevance\n\nA fine-tuned model excels in providing highly specific and relevant outputs tailored to your business’s unique needs. Unlike general models, which offer broad responses, fine-tuning adapts the model to understand industry-specific terminology and nuances. 
This can be particularly beneficial for specialized industries like legal, medical, or technical fields where precise language and contextual understanding are crucial.\n\n### Improved Accuracy\n\nFine-tuning significantly enhances the accuracy of a language model by allowing it to adapt to the specific patterns and requirements of your business data. When a model is fine-tuned, it learns from a curated dataset that mirrors the particular tasks and language your business encounters. This focused learning process refines the model’s ability to generate precise and contextually appropriate responses, reducing errors and increasing the reliability of the outputs.\n\n### Data Privacy and Security\n\nIn many industries, maintaining data privacy and security is paramount. By fine-tuning a language model on proprietary or sensitive data, businesses can ensure that their unique datasets are not exposed to third-party risks associated with general model training environments. Fine-tuning can be conducted on-premises or within secure environments, keeping data control in-house.\n\n### Customized Interactions\n\nBusinesses that require highly personalized customer interactions can significantly benefit from fine-tuned models. These models can be trained to understand and respond to customer queries with a level of customization that aligns with the brand’s voice and customer service protocols. For instance, a fine-tuned model in a retail business can understand product-specific inquiries, offer personalized recommendations, understand company policies, and handle complex service issues more effectively than a general model.\n\n## Top 6 LLM Fine-Tuning Methods\n\nHere are some of the ways that large language models can be fine-tuned.\n\n### 1. Instruction Fine-Tuning\n\nInstruction fine-tuning involves training a model using examples that demonstrate how it should respond to specific queries. For instance, to improve summarization skills, a dataset with instructions like \"summarize this text\" followed by the actual text is used.\n\nThis method helps the model learn to follow specific instructions and improve its performance in targeted tasks by understanding the expected outputs from given prompts. This approach is particularly useful for enhancing the model’s ability to handle various task-specific instructions effectively.\n\n### 2. Parameter-Efficient Fine-Tuning (PEFT)\n\nPEFT updates only a small subset of the model’s parameters during training, significantly reducing the memory and computational requirements compared to full fine-tuning. Techniques like LoRA (Low-Rank Adaptation) and QLoRA (Quantized Low-Rank Adaptation) can reduce the number of trainable parameters by thousands of times.\n\nThis method helps manage hardware limitations and prevents the phenomenon of ‘catastrophic forgetting’, maintaining the model’s original knowledge while adapting to new tasks. By focusing on specific components, PEFT makes the fine-tuning process more efficient and cost-effective, especially for large models.\n\n### 3. Task-Specific Fine-Tuning\n\nTask-specific fine-tuning focuses on adjusting a pre-trained model to excel in a particular task or domain using a dedicated dataset. This method typically requires more data and time than transfer learning but achieves higher performance in specific tasks, such as translation or sentiment analysis.\n\nDespite its effectiveness, it can lead to catastrophic forgetting, where the model loses proficiency in tasks it was previously trained on. 
However, by tailoring the model to specific requirements, task-specific fine-tuning ensures high accuracy and relevance for specialized applications.\n\n### 4. Transfer Learning\n\nTransfer learning leverages a model trained on a broad, general-purpose dataset and adapts it to specific tasks using task-specific data. This method is useful when data or resources are limited, as it builds on the knowledge already embedded in the pre-trained model, offering improved learning rates and accuracy with less training time compared to training from scratch.\n\nTransfer learning enables the efficient reuse of models like GPT or BERT for new applications, providing a strong foundation for further customization.\n\n***Learn more in our detailed guide to fine tuning vs transfer learning (coming soon)***\n\n### 5. Multi-Task Learning\n\nMulti-task learning trains a model on a dataset containing examples for multiple tasks, such as summarization, code translation, and entity recognition. This approach helps the model improve performance across different tasks simultaneously and avoids catastrophic forgetting.\n\nHowever, this method requires a large amount of diverse data, which can be challenging to assemble. The comprehensive training enables the model to handle various tasks proficiently, making it suitable for environments where versatile performance is necessary.\n\n### 6. Sequential Fine-Tuning\n\nSequential fine-tuning adapts a model to a series of related tasks in stages. For example, a general language model might first be fine-tuned for medical language and subsequently for pediatric cardiology. This method ensures the model retains its performance across various specialized domains, building on each successive fine-tuning step to refine its capabilities further.\n\nBy sequentially adapting to increasingly specific datasets, the model can achieve high proficiency in niche areas while maintaining a broad understanding of the general domain.\n\n## What Is Retrieval Augmented Generation (RAG)?\n\nRetrieval Augmented Generation (RAG) is a technique that combines natural language generation with information retrieval to enhance a model’s outputs with up-to-date and contextually relevant information. RAG integrates external knowledge sources, ensuring that the language model provides accurate and current responses. This method is particularly useful for tasks requiring precise, timely information, as it allows continuous updates and easy management of the knowledge base, avoiding the rigidity of traditional fine-tuning methods.\n\nRAG systems can dynamically retrieve information during generation, making them highly adaptable to changing data and capable of delivering more relevant and informed outputs. This technique is beneficial for applications where the accuracy and freshness of information are critical, such as customer support, content creation, and research. 
By leveraging RAG, businesses can ensure their language models remain current and provide high-quality responses that are well-grounded in the latest information available.\n\nNotable examples of the use of RAG are the [AI Overviews](https://blog.google/products/search/generative-ai-google-search-may-2024/) feature in Google search, and [Microsoft Copilot in Bing](https://www.bing.com/chat?form=CONVRD), both of which extract data from a live index of the Internet and use it as an input for LLM responses.\n\n## How to Choose a Pre-Trained Model for Fine-Tuning\n\nHere’s an overview of the process of identifying an existing LLM for fine-tuning.\n\n### Define the Task\n\nBefore fine-tuning, clearly define the model’s intended task. Understanding the task’s requirements helps in selecting a model whose pre-trained capabilities align closely with the end objectives. For example, it may involve classification, regression, or generative tasks.\n\nAn accurate task definition also aids in determining the necessary data scope for model fine-tuning. This can prevent potential performance degradation due to underfitting or overfitting during the fine-tuning phase.\n\n### Understand the Model Architecture\n\nGet familiar with different model architectures to select the most suitable one for your task. Each architecture has strengths and limitations based on its design principles, layers, and the type of data it was initially trained on.\n\nUnderstanding these characteristics can significantly impact the success of fine-tuning, as certain architectures might be more compatible with the nature of your specific tasks.\n\n### Assess Strengths and Weaknesses\n\nEvaluate the strengths and weaknesses of the model options. Some models may excel at handling text-based tasks while others may be optimized for voice or image recognition tasks. Standardized benchmarks, which you can find on LLM leaderboards, can help compare models on parameters relevant to your project.\n\nAdditionally, consider the model’s performance trade-offs such as accuracy, processing speed, and memory usage, which can affect the practical deployment of the fine-tuned model in real-world applications.\n\n### Match with Task Requirements\n\nEnsure the pre-trained model’s capabilities match the demands of the task. This involves comparing the model’s training data, learning capabilities, and output formats with what’s needed for your use case. A close match between the model’s training conditions and your task’s requirements can enhance the effectiveness of the re-training process.\n\n***Related content: Read our guide to fine tuning LLM tutorial (coming soon)***\n\n## Challenges and Limitations of LLM Fine-Tuning\n\nHere are some of the challenges involved in fine-tuning large language models.\n\n### Overfitting\n\nOverfitting occurs when a model is trained so closely to the nuances of a specific dataset that it performs exceptionally well on that data but poorly on any data it hasn’t seen before. This is particularly problematic in fine-tuning because the datasets used are generally smaller and more specialized than those used in initial broad training phases.\n\nSuch datasets can include rare or unique examples that do not represent a broader population, causing the model to learn these as common features. 
Overfitting results in a model that lacks the ability to generalize, which is critical for practical applications where the input data may vary significantly from the training data.\n\n### Catastrophic Forgetting\n\nCatastrophic forgetting refers to a situation where a neural network, after being fine-tuned with new data, loses the information it had learned during its initial training. This challenge is especially significant in the fine-tuning of LLMs because the new, task-specific training can override the weights and biases that were useful across more general contexts.\n\nFor example, a model trained initially on a broad range of topics might lose its ability to comprehend certain general concepts if it is intensely retrained on a niche subject like legal documents or technical manuals.\n\n### Bias Amplification\n\nBias amplification is when inherent biases in the pre-trained data are intensified. During fine-tuning, a model may not only reflect but also exacerbate biases present in the new training dataset.\n\nFor example, if a dataset for fine-tuning an LLM on job application reviews contains biases against certain demographic groups, the model might amplify this bias, leading to discriminatory behavior in automated screening processes. This underscores the need for careful selection of datasets to avoid reinforcing harmful stereotypes or unfair practices in model outputs.\n\n### Hyperparameter Tuning Complexity\n\nHyperparameters, such as learning rate, batch size, and the number of epochs during which the model is trained, have a major impact on the model’s performance. These parameters need to be carefully adjusted to strike a balance between learning efficiently and avoiding overfitting. The optimal settings for hyperparameters vary between different tasks and datasets.\n\nThe process of identifying the right hyperparameter settings is time-consuming and computationally expensive, requiring extensive use of resources to run numerous training cycles. However, standardized methods, frameworks, and tools for LLM tuning are emerging, which aim to make this process easier.\n\n## LLM Fine-Tuning Best Practices\n\nHere are some of the measures you can take to ensure an effective LLM fine-tuning process.\n\n### Start with a Small Model\n\nBeginning with a smaller model can simplify the fine-tuning process. Smaller models require less computational power and memory, allowing for faster experimentation and iteration. This approach is particularly beneficial when resources are limited. Once the process is optimized on a smaller scale, the insights gained can be applied to fine-tune larger models.\n\n### Experiment with Different Data Formats\n\nExperimenting with various data formats can significantly enhance the effectiveness of fine-tuning. By including diverse input types—such as structured data, unstructured text, images, or even tabular data—models can learn to handle a broader range of real-world scenarios. This helps build versatility in the model’s responses, ensuring it performs well across different contexts and input variations.\n\n### Start with Subsets of Data\n\nStarting with fine-tuning on smaller subsets of the dataset allows for quicker iterations and helps identify potential issues early in the training process. 
By gradually scaling up to the full dataset, you can fine-tune hyperparameters and make necessary adjustments without expending excessive resources.\n\n### Ensure the Dataset Is High-Quality\n\nThe dataset should be representative of the specific task and domain to ensure the model learns the relevant patterns and nuances. High-quality data minimizes noise and errors, allowing the model to generate more accurate and reliable outputs. Investing time in curating and cleaning the dataset ensures improved model performance and generalization capabilities.\n\n### Use Hyperparameters to Optimize Performance\n\nHyperparameter tuning is vital for optimizing the performance of fine-tuned models. Key parameters like learning rate, batch size, and the number of epochs must be adjusted to balance learning efficiency and overfitting prevention. Systematic experimentation with different hyperparameter values can reveal the optimal settings, leading to improvements in model accuracy and reliability.\n\n## Building LLM Applications with Acorn\n\nVisit [https://gptscript.ai](https://gptscript.ai) to download GPTScript and start building today. As we expand on",
          "subCalls": null
        }
      ],
      "start": "2024-11-28T00:20:58.725753347Z",
      "tool": {
        "description": "Removes extra header, footer, and navigation content from the markdown version of webpages",
        "id": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner",
        "instructions": "The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.",
        "internalPrompt": null,
        "localTools": {
          "website markdown content cleaner": "/otto8-tools/website-cleaner/tool.gpt:Website Markdown Content Cleaner"
        },
        "modelName": "llm-mini",
        "name": "Website Markdown Content Cleaner",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/website-cleaner/tool.gpt"
        },
        "workingDir": "/otto8-tools/website-cleaner"
      },
      "toolResults": 0,
      "type": "callProgress",
      "usage": {

      }
    },
    "1732750509": {
      "chatResponseCached": false,
      "currentAgent": {

      },
      "displayText": "Running sys.daemon",
      "end": "2024-11-28T00:20:58.726110638Z",
      "id": "1732750509",
      "input": "",
      "inputContext": null,
      "llmRequest": null,
      "llmResponse": null,
      "output": [
        {
          "content": "http://127.0.0.1:10510",
          "subCalls": null
        }
      ],
      "start": "2024-11-28T00:20:58.72589543Z",
      "tool": {
        "description": "Model provider for Otto8",
        "id": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8",
        "instructions": "#!sys.daemon /usr/bin/env python3 ${GPTSCRIPT_TOOL_DIR}/main.py",
        "internalPrompt": null,
        "localTools": {
          "otto8": "/otto8-tools/otto8-model-provider/tool.gpt:Otto8"
        },
        "modelName": "gpt-4o",
        "modelProvider": true,
        "name": "Otto8",
        "source": {
          "lineNo": 1,
          "location": "/otto8-tools/otto8-model-provider/tool.gpt"
        },
        "workingDir": "/otto8-tools/otto8-model-provider"
      },
      "toolCategory": "provider",
      "toolResults": 0,
      "type": "callFinish",
      "usage": {

      }
    }
  },
  "spec": {
    "synchronous": true,
    "threadName": "t1-ks1fjxp4",
    "input": "[![Acorn Labs](https://www.acorn.io/wp-content/uploads/2024/10/acorn_logo_h_5b9f9fcaf6.svg)](https://www.acorn.io/)\n\n![](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2020%2014'%3E%3C/svg%3E)\n\nMenu Close\n\n- [Resources](https://www.acorn.io/resources/)\n  \n  - [Resources](https://www.acorn.io/resources/)\n    \n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Tutorials](/resources/tutorials)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Blog](/resources/blog)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Tools](http://tools.gptscript.ai/)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      GitHub](https://github.com/gptscript-ai/gptscript)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Events](/events)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Docs](https://docs.gptscript.ai/)\n    - [![ ](data:image/svg+xml,%3Csvg%20xmlns='http://www.w3.org/2000/svg'%20viewBox='0%200%2022%200'%3E%3C/svg%3E)\n      \\\n      Discord](https://discord.com/invite/9sSf4UyAMC)\n  - [Learning Center](/resources/learning-center)\n    \n    - [Models](https://www.acorn.io/resources/learning-center/category/models/)\n      \n      - [OpenAI GPT-4](https://www.acorn.io/resources/learning-center/openai/)\n      - [Anthropic Claude](https://www.acorn.io/resources/learning-center/anthropic-claude/)\n      - [Cohere AI](https://www.acorn.io/resources/learning-center/cohere-ai/)\n      - [Google Gemini](https://www.acorn.io/resources/learning-center/google-gemini/)\n      - [Meta LLaMa](https://www.acorn.io/resources/learning-center/meta-llama/)\n      - [Mistral](https://www.acorn.io/resources/learning-center/mistral-ai/)\n      - [Mistral 7B](https://www.acorn.io/resources/learning-center/mistral-7b/)\n    - [Tools and Topics](https://www.acorn.io/resources/learning-center/category/tools-and-topics/)\n      \n      - [Fine-Tuning LLMs](https://www.acorn.io/resources/learning-center/fine-tuning-llm/)\n      - [Generative AI](https://www.acorn.io/resources/learning-center/generative-ai-applications/)\n      - [AI Agents](https://www.acorn.io/resources/learning-center/ai-agents/)\n      - [Claude API](https://www.acorn.io/resources/learning-center/claude-api/)\n      - [Gemini API](https://www.acorn.io/resources/learning-center/google-gemini-api/)\n      - [LLM Application Development](https://www.acorn.io/resources/learning-center/llm-application-development/)\n      - [LLM Security](https://www.acorn.io/resources/learning-center/llm-security/)\n      - [Prompt Engineering](https://www.acorn.io/resources/learning-center/prompt-engineering/)\n    - [Use Cases](https://www.acorn.io/resources/learning-center/category/use-cases/)\n      \n      - [Retrieval Augmented Generation (RAG)](https://www.acorn.io/resources/learning-center/retrieval-augmented-generation/)\n      - [AI Copilots](https://www.acorn.io/resources/learning-center/ai-copilots/)\n      - [AI Image](https://www.acorn.io/resources/learning-center/ai-image-generation/)\n      - [AI Video 
Generators](https://www.acorn.io/resources/learning-center/ai-video-generators/)\n      - [AI Summarization](https://www.acorn.io/resources/learning-center/ai-summarization/)\n      - [Code Interpreter](https://www.acorn.io/resources/learning-center/code-interpreter/)\n    \n    [Explore All Articles](/resources/learning-center)\n- [Docs](https://docs.gptscript.ai/)\n- [Tools](http://tools.gptscript.ai/)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Company](https://www.acorn.io/about-us/)\n\n[Try GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\n[Request a Demo](/contact)\n\n[Learning Center](/resources/learning-center)\n\n# Fine-Tuning LLMs: Top 6 Methods, Challenges and Best Practices\n\n###### June 6, 2024 by acorn labs\n\n## What Does It Mean to Fine-Tune LLMs?\n\nFine-tuning Large Language Models (LLMs) involves adjusting pre-trained models on specific datasets to enhance performance for particular tasks. This process begins after general training ends. Users provide the model with a more focused dataset, which may include industry-specific terminology or task-focused interactions, with the objective of helping the model generate more relevant responses for a specific use case.\n\nFine-tuning allows the model to adapt its pre-existing weights and biases to fit specific problems better. This results in improved accuracy and relevance in outputs, making LLMs more effective in practical, specialized applications than their broadly trained counterparts. While fine-tuning can be highly computationally intensive, new techniques like Parameter-Efficient Fine-Tuning (PEFT) are making it much more efficient and possible to run even on consumer hardware.\n\nFine-tuning can be performed both on open source LLMs, such as Meta LLaMA and Mistral models, and on some commercial LLMs, if this capability is offered by the model’s developer. *(Learn more in our detailed guide to [fine tuning Llama 2](https://www.acorn.io/resources/learning-center/fine-tuning-llama-2))*. For example, OpenAI allows fine tuning for GPT-3.5 and GPT-4. This is part of an extensive series of guides about [machine learning](https://www.aporia.com/learn/machine-learning-model/machine-learning-models-use-cases-operations/).\n\n## Fine-Tuning vs. Embeddings vs. Prompt Engineering\n\n**Fine-tuning** is a method where a pre-trained model is further trained (or fine tuned) on a new dataset specific to a particular task. This technique involves adjusting the weights across all layers of the model, based on the new data. It allows the model to specifically cater to nuanced tasks and often results in higher performance for specialized applications.\n\n**Embeddings** refer to dense vector representations of words or phrases, which are typically obtained during the initial training of a model. Instead of adjusting the entire model, embeddings can be extracted and used as static input features for various downstream tasks. This approach does not modify the pre-trained model but leverages the learned representations. It’s generally faster and less resource-intensive than fine-tuning.\n\n**Prompt engineering** is another way to adjust LLMs to specific tasks. Adding more context, examples, or even entire documents and rich media, to LLM prompts can cause models to provide much more nuanced and relevant responses to specific tasks. 
Prompt engineering is considered more limited than fine-tuning, but is also much less technically complex and is not computationally intensive.\n\n***Learn more in our detailed guide to LLM fine tuning vs embedding (coming soon)***\n\n## When Does Your Business Need a Fine-Tuned Model?\n\nHere are a few primary use cases for fine-tuned LLMs:\n\n### Specificity and Relevance\n\nA fine-tuned model excels in providing highly specific and relevant outputs tailored to your business’s unique needs. Unlike general models, which offer broad responses, fine-tuning adapts the model to understand industry-specific terminology and nuances. This can be particularly beneficial for specialized industries like legal, medical, or technical fields where precise language and contextual understanding are crucial.\n\n### Improved Accuracy\n\nFine-tuning significantly enhances the accuracy of a language model by allowing it to adapt to the specific patterns and requirements of your business data. When a model is fine-tuned, it learns from a curated dataset that mirrors the particular tasks and language your business encounters. This focused learning process refines the model’s ability to generate precise and contextually appropriate responses, reducing errors and increasing the reliability of the outputs.\n\n### Data Privacy and Security\n\nIn many industries, maintaining data privacy and security is paramount. By fine-tuning a language model on proprietary or sensitive data, businesses can ensure that their unique datasets are not exposed to third-party risks associated with general model training environments. Fine-tuning can be conducted on-premises or within secure environments, keeping data control in-house.\n\n### Customized Interactions\n\nBusinesses that require highly personalized customer interactions can significantly benefit from fine-tuned models. These models can be trained to understand and respond to customer queries with a level of customization that aligns with the brand’s voice and customer service protocols. For instance, a fine-tuned model in a retail business can understand product-specific inquiries, offer personalized recommendations, understand company policies, and handle complex service issues more effectively than a general model.\n\n## Top 6 LLM Fine-Tuning Methods\n\nHere are some of the ways that large language models can be fine tuned.\n\n### 1. Instruction Fine-Tuning\n\nInstruction fine-tuning involves training a model using examples that demonstrate how it should respond to specific queries. For instance, to improve summarization skills, a dataset with instructions like \"summarize this text\" followed by the actual text is used.\n\nThis method helps the model learn to follow specific instructions and improve its performance in targeted tasks by understanding the expected outputs from given prompts. This approach is particularly useful for enhancing the model’s ability to handle various task-specific instructions effectively.\n\n### 2. Parameter-Efficient Fine-Tuning (PEFT)\n\nPEFT updates only a small subset of the model’s parameters during training, significantly reducing the memory and computational requirements compared to full fine-tuning. Techniques like LoRA (Low-Rank Adaptation) and QLoRA (Quantized Low-Rank Adaptation) can reduce the number of trainable parameters by thousands of times.\n\nThis method helps manage hardware limitations and prevents the phenomenon of ‘catastrophic forgetting’, maintaining the model’s original knowledge while adapting to new tasks. 
By focusing on specific components, PEFT makes the fine-tuning process more efficient and cost-effective, especially for large models.\n\n### 3. Task-Specific Fine-Tuning\n\nTask-specific fine-tuning focuses on adjusting a pre-trained model to excel in a particular task or domain using a dedicated dataset. This method typically requires more data and time than transfer learning but achieves higher performance in specific tasks, such as translation or sentiment analysis.\n\nDespite its effectiveness, it can lead to catastrophic forgetting, where the model loses proficiency in tasks it was previously trained on. However, by tailoring the model to specific requirements, task-specific fine-tuning ensures high accuracy and relevance for specialized applications.\n\n### 4. Transfer Learning\n\nTransfer learning leverages a model trained on a broad, general-purpose dataset and adapts it to specific tasks using task-specific data. This method is useful when data or resources are limited, as it builds on the knowledge already embedded in the pre-trained model, offering improved learning rates and accuracy with less training time compared to training from scratch.\n\nTransfer learning enables the efficient reuse of models like GPT or BERT for new applications, providing a strong foundation for further customization.\n\n***Learn more in our detailed guide to fine tuning vs transfer learning (coming soon)***\n\n### 5. Multi-Task Learning\n\nMulti-task learning trains a model on a dataset containing examples for multiple tasks, such as summarization, code translation, and entity recognition. This approach helps the model improve performance across different tasks simultaneously and avoids catastrophic forgetting.\n\nHowever, this method requires a large amount of diverse data, which can be challenging to assemble. The comprehensive training enables the model to handle various tasks proficiently, making it suitable for environments where versatile performance is necessary.\n\n### 6. Sequential Fine-Tuning\n\nSequential fine-tuning adapts a model to a series of related tasks in stages. For example, a general language model might first be fine-tuned for medical language and subsequently for pediatric cardiology. This method ensures the model retains its performance across various specialized domains, building on each successive fine-tuning step to refine its capabilities further.\n\nBy sequentially adapting to increasingly specific datasets, the model can achieve high proficiency in niche areas while maintaining a broad understanding of the general domain.\n\n## What Is Retrieval Augmented Generation (RAG)?\n\nRetrieval Augmented Generation (RAG) is a technique that combines natural language generation with information retrieval to enhance a model’s outputs with up-to-date and contextually relevant information. RAG integrates external knowledge sources, ensuring that the language model provides accurate and current responses. This method is particularly useful for tasks requiring precise, timely information, as it allows continuous updates and easy management of the knowledge base, avoiding the rigidity of traditional fine-tuning methods.\n\nRAG systems can dynamically retrieve information during generation, making them highly adaptable to changing data and capable of delivering more relevant and informed outputs. This technique is beneficial for applications where the accuracy and freshness of information are critical, such as customer support, content creation, and research. 
By leveraging RAG, businesses can ensure their language models remain current and provide high-quality responses that are well-grounded in the latest information available.\n\nNotable examples of the use of RAG are the [AI Overviews](https://blog.google/products/search/generative-ai-google-search-may-2024/) feature in Google search, and [Microsoft Copilot in Bing](https://www.bing.com/chat?form=CONVRD), both of which extract data from a live index of the Internet and use it as an input for LLM responses.\n\n## How to Choose a Pre-Trained Model for Fine-Tuning\n\nHere’s an overview of the process of identifying an existing LLM for fine-tuning.\n\n### Define the Task\n\nBefore fine-tuning, clearly define the model’s intended task. Understanding the task’s requirements helps in selecting a model whose pre-trained capabilities align closely with the end objectives. For example, it may involve classification, regression, or generative tasks.\n\nAn accurate task definition also aids in determining the necessary data scope for model fine-tuning. This can prevent potential performance degradation due to underfitting or overfitting during the fine-tuning phase.\n\n### Understand the Model Architecture\n\nGet familiar with different model architectures to select the most suitable one for your task. Each architecture has strengths and limitations based on its design principles, layers, and the type of data it was initially trained on.\n\nUnderstanding these characteristics can significantly impact the success of fine-tuning, as certain architectures might be more compatible with the nature of your specific tasks.\n\n### Assess Strengths and Weaknesses\n\nEvaluate the strengths and weaknesses of the model options. Some models may excel at handling text-based tasks while others may be optimized for voice or image recognition tasks. Standardized benchmarks, which you can find on LLM leaderboards, can help compare models on parameters relevant to your project.\n\nAdditionally, consider the model’s performance trade-offs such as accuracy, processing speed, and memory usage, which can affect the practical deployment of the fine tuned model in real-world applications.\n\n### Match with Task Requirements\n\nEnsure the pre-trained model’s capabilities match the demands of the task. This involves comparing the model’s training data, learning capabilities, and output formats with what’s needed for your use case. A close match between the model’s training conditions and your task’s requirements can enhance the effectiveness of the re-training process.\n\n***Related content: Read our guide to fine tuning LLM tutorial (coming soon)***\n\n## Challenges and Limitations of LLM Fine-Tuning\n\nHere are some of the challenges involved in fine-tuning large language models.\n\n### Overfitting\n\nOverfitting occurs when a model is trained so closely to the nuances of a specific dataset that it performs exceptionally well on that data but poorly on any data it hasn’t seen before. This is particularly problematic in fine-tuning because the datasets used are generally smaller and more specialized than those used in initial broad training phases.\n\nSuch datasets can include rare or unique examples that do not represent a broader population, causing the model to learn these as common features. 
Overfitting results in a model that lacks the ability to generalize, which is critical for practical applications where the input data may vary significantly from the training data.\n\n### Catastrophic Forgetting\n\nCatastrophic forgetting refers to a situation where a neural network, after being fine-tuned with new data, loses the information it had learned during its initial training. This challenge is especially significant in the fine-tuning of LLMs because the new, task-specific training can override the weights and biases that were useful across more general contexts.\n\nFor example, a model trained initially on a broad range of topics might lose its ability to comprehend certain general concepts if it is intensely retrained on a niche subject like legal documents or technical manuals.\n\n### Bias Amplification\n\nBias amplification is when inherent biases in the pre-trained data are intensified. During fine-tuning, a model may not only reflect but also exacerbate biases present in the new training dataset.\n\nFor example, if a dataset for fine-tuning an LLM on job application reviews contains biases against certain demographic groups, the model might amplify this bias, leading to discriminatory behavior in automated screening processes. This underscores the need for careful selection of datasets to avoid reinforcing harmful stereotypes or unfair practices in model outputs.\n\n### Hyperparameter Tuning Complexity\n\nHyperparameters, such as learning rate, batch size, and the number of epochs during which the model is trained, have a major impact on the model’s performance. These parameters need to be carefully adjusted to strike a balance between learning efficiently and avoiding overfitting. The optimal settings for hyperparameters vary between different tasks and datasets.\n\nThe process of identifying the right hyperparameter settings is time-consuming and computationally expensive, requiring extensive use of resources to run numerous training cycles. However, standardized methods, frameworks, and tools for LLM tuning are emerging, which aim to make this process easier.\n\n## LLM Fine-Tuning Best Practices\n\nHere are some of the measures you can take to ensure an effective LLM fine-tuning process.\n\n### Start with a Small Model\n\nBeginning with a smaller model can simplify the fine-tuning process. Smaller models require less computational power and memory, allowing for faster experimentation and iteration. This approach is particularly beneficial when resources are limited. Once the process is optimized on a smaller scale, the insights gained can be applied to fine-tune larger models.\n\n### Experiment with Different Data Formats\n\nExperimenting with various data formats can significantly enhance the effectiveness of fine-tuning. By including diverse input types—such as structured data, unstructured text, images, or even tabular data—models can learn to handle a broader range of real-world scenarios. This helps build versatility in the model’s responses, ensuring it performs well across different contexts and input variations.\n\n### Start with Subsets of Data\n\nStarting with fine-tuning on smaller subsets of the dataset allows for quicker iterations and helps identify potential issues early in the training process. 
By gradually scaling up to the full dataset, you can fine-tune hyperparameters and make necessary adjustments without expending excessive resources.\n\n### Ensure the Dataset Is High-Quality\n\nThe dataset should be representative of the specific task and domain to ensure the model learns the relevant patterns and nuances. High-quality data minimizes noise and errors, allowing the model to generate more accurate and reliable outputs. Investing time in curating and cleaning the dataset ensures improved model performance and generalization capabilities.\n\n### Use Hyperparameters to Optimize Performance\n\nHyperparameter tuning is vital for optimizing the performance of fine-tuned models. Key parameters like learning rate, batch size, and the number of epochs must be adjusted to balance learning efficiency and overfitting prevention. Systematic experimentation with different hyperparameter values can reveal the optimal settings, leading to improvements in model accuracy and reliability.\n\n## Building LLM Applications with Acorn\n\nVisit [https://gptscript.ai](https://gptscript.ai) to download GPTScript and start building today. As we expand on the capabilities with GPTScript, we are also expanding our list of tools. With these tools, you can create any application imaginable: check out [tools.gptscript.ai](https://tools.gptscript.ai/) to get started.\n\n## See Additional Guides on Key Machine Learning Topics\n\nTogether with our content partners, we have authored in-depth guides on several other topics that can also be useful as you explore the world of [machine learning](https://www.aporia.com/learn/machine-learning-model/machine-learning-models-use-cases-operations/).\n\n### [Advanced Threat Protection](https://www.cynet.com/advanced-threat-protection/advanced-threat-protection-a-real-time-threat-killer-machine/)\n\n*Authored by Cynet*\n\n- [Advanced Threat Protection: A Real-Time Threat Killer Machine](https://www.cynet.com/advanced-threat-protection/advanced-threat-protection-a-real-time-threat-killer-machine/)\n- [Advanced Threat Detection: Catch & Eliminate Sneak Attacks](https://www.cynet.com/advanced-threat-protection/advanced-threat-detection-stopping-advanced-attacks-in-their-tracks/)\n- [What is Network Analytics? 
From Detection to Active Prevention](https://www.cynet.com/advanced-threat-protection/network-analytics-from-detection-to-active-prevention/)\n\n### [Multi GPU](https://www.run.ai/guides/multi-gpu)\n\n*Authored by Run.AI*\n\n- [Multi GPU: An In-Depth Look](https://www.run.ai/guides/multi-gpu)\n- [Keras Multi GPU: A Practical Guide](https://www.run.ai/guides/multi-gpu/keras-multi-gpu-a-practical-guide)\n- [How to Build Your GPU Cluster: Process and Hardware Options](https://www.run.ai/guides/multi-gpu/gpu-clusters)\n\n### [Best LLM](https://www.acorn.io/resources/learning-center/best-llm)\n\n*Authored by Acorn*\n\n- [Best LLM: Benchmarks, Leaderboards, & the World’s 8 Smartest LLMs](https://www.acorn.io/resources/learning-center/best-llm)\n- [Leaderboard of LLM Leaderboards: Top 7 LLM Listings & Their Criteria](https://www.acorn.io/resources/learning-center/llm-leaderboards)\n- [Open LLM Leaderboard: Benchmarks, Model Types & Filters Explained](https://www.acorn.io/resources/learning-center/open-llm-leaderboard#other-filter-options-in-the-open-llm-leaderboard)\n\n## Related Articles\n\n- [AI Copilots: Enterprise Use Cases and Key Considerations](https://www.acorn.io/resources/learning-center/ai-copilots/)\n- [Parameter-Efficient Fine-Tuning (PEFT): The Basics and a Quick Tutorial](https://www.acorn.io/resources/learning-center/parameter-efficient-fine-tuning-peft/)\n- [Aisera: Overview of Platform, Solutions, Pros and Cons](https://www.acorn.io/resources/learning-center/aisera/)\n- [Fine-Tuning Llama 2 with Hugging Face PEFT Library](https://www.acorn.io/resources/learning-center/fine-tuning-llama-2/)\n- [Prompt Engineering in ChatGPT: 9 Proven Techniques](https://www.acorn.io/resources/learning-center/prompt-engineering-in-chatgpt/)\n- [LLM Application Development: Tutorial & 7 Steps to Production Apps](https://www.acorn.io/resources/learning-center/llm-application-development/)\n- [Open LLM Leaderboard: Benchmarks, Model Types & Filters Explained](https://www.acorn.io/resources/learning-center/open-llm-leaderboard/)\n- [Best LLM: Benchmarks, Leaderboards, & the World’s 8 Smartest LLMs](https://www.acorn.io/resources/learning-center/best-llm/)\n\nMenu\n\nMenu Close\n\n- [About Us](https://www.acorn.io/about-us/)\n- [Contact Us](https://www.acorn.io/contact/)\n- [Tutorials](/resources/tutorials)\n- [Blog](/resources/blog)\n- [Events](/events)\n- [Discord](https://discord.com/invite/9sSf4UyAMC)\n- [Open Source](https://www.acorn.io/resources/blog/open-source/)\n\nTo unsubscribe at any time please see our [Privacy Policy](https://www.acorn.io/privacy-policy).\n\n#### Get Started with GPTScript\n\n[Install GPTScript](https://github.com/gptscript-ai/gptscript?tab=readme-ov-file#1-install-the-latest-release)\n\nCopyright © 2024. All rights reserved. Acorn Labs, Inc.\n\n[Terms of Service](https://www.acorn.io/terms-of-use)\n\n- [GitHub](https://github.com/gptscript-ai/gptscript)\n- [Twitter](https://x.com/acornlabs)\n- [YouTube](https://www.youtube.com/c/AcornLabs)\n- [LinkedIn](https://www.linkedin.com/company/acorn-io/)",
    "tool": "\"website-cleaner\"",
    "defaultModel": "llm"
  },
  "status": {
    "state": "error",
    "output": "",
    "endTime": "2024-11-28T00:23:32Z",
    "error": "failed to run: failed calling model for completion: error, The server had an error processing your request. Sorry about that! You can retry your request, or contact us through our help center at help.openai.com if you keep seeing this error."
  }
}
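
For reference, the tool that was running when this failed is the website cleaner at `/otto8-tools/website-cleaner/tool.gpt` (see the `tool` block in the frame above). Reconstructed purely from the logged fields — the exact directive spelling in the repo may differ — the definition is roughly:

    Name: Website Markdown Content Cleaner
    Description: Removes extra header, footer, and navigation content from the markdown version of webpages
    Model Name: llm-mini

    The following content is a scraped webpage converted to markdown. Please remove any content that came from the website header, footer, or navigation. The output should focus on just the main content body of the page. Maintain the markdown format, including any links or images.

For what it's worth, the `status.error` on this run shows the failure coming from the model call made by that cleaning step (`failed calling model for completion: ... The server had an error processing your request`), not from the scraped markdown itself.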