@hronoas I've done some testing with your exact setup and was definitely seeing that GPU usage stays high for some time after pressing stop, like you said. I wondered whether this could be due to us generating summaries of the message for a) the session title and b) chat context. We have a setting in ContinueConfig, disable_summaries. I tried disable_summaries=True, and the GPU usage then dropped immediately after pressing stop.
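For reference, a minimal sketch of enabling this in ~/.continue/config.py (the import path is an assumption and may differ between versions of the extension):

```python
# Sketch only: the exact module path for ContinueConfig is assumed here.
from continuedev.core.config import ContinueConfig

config = ContinueConfig(
    # Skip generating session titles and per-response summaries.
    disable_summaries=True,
)
```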
Are you able to verify similar behavior?
(Also, the "stop" method is a bit unrelated to streaming of responses. It is a method for cleaning up the model altogether (and probably ought to be named accordingly), for example closing any network connections or other resources.)
@sestinj Everything is as you said
Tested with the disable_summaries=False option:

- Stopped the first response: does not interrupt the load on the GPU
- Stopped any of the following messages: interrupts the load on the GPU

Tested with the disable_summaries=True option: when you click the stop button, the load on the GPU stops in both cases.
I don't understand why a second title is being generated... 😨 Maybe a bug? Or an incorrect config?
First generation (by continuedev\plugins\steps\chat.py): "Please write a short title summarizing the message quoted above. Use no more than 10 words:"

Second generation (by continuedev\core\autopilot.py): "Give a short title to describe the above chat session. Do not put quotes around the title. Do not use more than 6 words. The title is:"
Ok, great. This means there is no bug, but since the summaries are no longer being used frequently it might be best to disable them by default.
There are two titles because one of them is the main session title, and the other is a summary that is generated for every LLM response.
@sestinj To my mind it's a bug... For efficient use of resources, pressing the stop button should interrupt generation on the LLM side, and perhaps the chat title should not be created from the first response at all. On low-performance hardware, generating a long response can continue for minutes and cannot be interrupted.
Please check the example output from text-generation-webui for an interrupted first response in my message above. None of the title generations use the full response from the LLM; they use only the part of the response generated before the stop button was pressed.
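For illustration, here is a minimal sketch of how a client can interrupt server-side generation with an OpenAI-compatible streaming endpoint: dropping the HTTP connection mid-stream is typically how the server learns to abort. The URL, payload, and function below are assumptions for illustration, not Continue's actual code:

```python
import asyncio
import aiohttp

async def stream_completion(prompt: str, cancel: asyncio.Event) -> str:
    """Stream tokens; drop the connection as soon as `cancel` is set."""
    text = ""
    async with aiohttp.ClientSession() as session:
        async with session.post(
            "http://localhost:5000/v1/completions",  # assumed endpoint
            json={"prompt": prompt, "stream": True},
        ) as resp:
            async for chunk in resp.content.iter_any():
                if cancel.is_set():
                    # Leaving the block closes the connection; the server
                    # can detect the disconnect and stop generating.
                    break
                text += chunk.decode("utf-8", errors="ignore")
    return text
```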
@hronoas I see what you're saying now. I just made the fix so that if you press stop the title + summary will not be generated.
Once this change has been shipped in a new version I'll let you know!
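In spirit, the fix amounts to something like the following hypothetical sketch (the class, method names, and llm.complete call are illustrative, not the actual continuedev code):

```python
class Autopilot:
    """Hypothetical sketch: skip title/summary work after a cancel."""

    def __init__(self) -> None:
        self.stopped = False

    def on_stop_pressed(self) -> None:
        # Called when the user presses the stop button in the chat UI.
        self.stopped = True

    async def maybe_generate_title(self, llm, chat_text: str):
        if self.stopped:
            # User cancelled: don't spend more GPU time on a title.
            return None
        prompt = (
            "Give a short title to describe the above chat session. "
            "Do not put quotes around the title. "
            "Do not use more than 6 words. The title is: "
        )
        return await llm.complete(chat_text + "\n\n" + prompt)
```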
@hronoas This change has been shipped so going to close the issue. Let me know if you're seeing anything else problematic, or feel free to re-open!
Relevant environment info
Description
I use a local text-generation-webui server with the openai plugin. If you click the stop button in the Continue chat after response generation has begun, text stops appearing in the chat, but text-generation-webui continues generating the response.
When trying to add a custom stop method to a model class in ~/.continue/config.py (sketched below), the logger.debug("STOPPING!") call inside it is never reached: not when the stop button is pressed, not when changing the model, and not when closing VSCode.
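A minimal sketch of the kind of override attempted (the base class and import paths are assumptions and may differ between versions of the Continue Python server):

```python
# Sketch only: module paths below are assumptions.
from continuedev.libs.llm.openai import OpenAI
from continuedev.libs.util.logging import logger

class MyOpenAI(OpenAI):
    async def stop(self):
        # Expected to run when the stop button is pressed (or on
        # model change / VSCode close), but it is never called.
        logger.debug("STOPPING!")
        await super().stop()
```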
To reproduce