feat: latest comfyui; fix: better GPU utilization for SD15

tazlin commented 2 months ago

Changes/fixes:

- Significant improvements to crash recovery.
  - The worker will no longer crash when there are no jobs for long periods of time.
  - The main process is much more capable of recovering from a sub-process crash.
  - The worker will now detect more deadlocks (which are ordinarily impossible but may arise due to difficult-to-reproduce edge cases) and attempt to recover.
- Additional log messages and warnings in certain situations, along with some recommendations for resolving them.
  - More info is printed by default in the periodic status message, with more clarity as to its meaning.
  - If the worker pauses popping jobs (such as when too many fail consecutively), a warning that this is happening will appear in every status message.
  - If the worker goes more than several minutes with no jobs, it warns that offering more models can potentially prevent this.
- Flux support
  - Add "Flux.1-Schnell fp8 (Compact)" to your `models_to_load` to offer it.
- Updated the README.md with some additional information about worker configuration.
- Updated the bridgeData_template.yaml for clarity and new configuration options.
- Added configuration options `extra_slow_worker`, `limit_max_steps`, `unload_from_vram_often`, and `high_memory_mode`.
  - See the updated template and README.md "Suggested settings" section for more information.
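
For orientation, here is a rough sketch of how the keys named above might look in a bridgeData.yaml. It assumes the four new options are simple on/off booleans, which this description does not state, and the values shown are placeholders rather than recommendations. The `to_yaml_lines` helper is hypothetical and exists only to print the fragment.

```python
# Hypothetical in-memory view of a few bridgeData.yaml keys named in this PR.
# ASSUMPTION: the four new options are booleans; the values are placeholders.
settings = {
    "models_to_load": ["Flux.1-Schnell fp8 (Compact)"],
    "extra_slow_worker": False,
    "limit_max_steps": False,
    "unload_from_vram_often": False,
    "high_memory_mode": False,
}


def to_yaml_lines(cfg):
    """Render the flat mapping above as simple YAML (lists and booleans only)."""
    lines = []
    for key, value in cfg.items():
        if isinstance(value, list):
            # A list becomes a key followed by one quoted item per line.
            lines.append(f"{key}:")
            lines.extend(f'  - "{item}"' for item in value)
        else:
            # Booleans are written in lowercase, as YAML expects.
            lines.append(f"{key}: {str(value).lower()}")
    return lines


print("\n".join(to_yaml_lines(settings)))
```

Treat the output as a reminder of the key names only; the bridgeData_template.yaml remains the authoritative reference for what each option does and which values it accepts.
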
Relies on:
- https://github.com/Haidra-Org/horde-sdk/pull/240
- AI-Horde
  - https://github.com/Haidra-Org/AI-Horde/pull/450
  - https://github.com/Haidra-Org/AI-Horde/pull/451
- hordelib changes
  - https://github.com/Haidra-Org/hordelib/pull/308
  - https://github.com/Haidra-Org/hordelib/pull/310
  - https://github.com/Haidra-Org/hordelib/pull/311
  - https://github.com/Haidra-Org/hordelib/pull/318
  - https://github.com/Haidra-Org/hordelib/pull/319
  - https://github.com/Haidra-Org/hordelib/pull/325
  - https://github.com/Haidra-Org/hordelib/pull/328
  - https://github.com/Haidra-Org/hordelib/pull/330
  - https://github.com/Haidra-Org/hordelib/pull/335

tazlin commented 1 month ago

@CodiumAI-Agent /review

CodiumAI-Agent commented 1 month ago

PR Reviewer Guide 🔍

(Review updated until commit https://github.com/Haidra-Org/horde-worker-reGen/commit/f89812c17a691e606061ca5eb2d603b164b5df8a)

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪

🧪 No relevant tests

🔒 No security concerns identified

⚡ Key issues to review

Performance Concern
The method `remove_maintenance` is added to remove maintenance mode from a worker. However, the method makes synchronous network calls (`simple_client.worker_details_by_name` and `simple_client.worker_modify`) within an asynchronous context. This could block the event loop and affect the performance of the application. Consider refactoring these calls to be asynchronous or running them in a separate thread.

Redundant Code
The method `_receive_and_handle_control_message` contains a condition to check if `message.control_flag` is `HordeControlFlag.START_INFERENCE` and then preloads a model if not already active. However, the method `preload_model` is called again inside the condition, which seems redundant and could lead to unnecessary preloading of the model. This could be optimized to avoid potential performance issues.

Configuration Overlap
The `start_inference_process` function has parameters `low_memory_mode`, `high_memory_mode`, and `very_high_memory_mode` that could potentially overlap in functionality. This might lead to confusing behavior depending on how these flags are set. It's recommended to clarify the precedence and interaction of these modes in the documentation or refactor the approach to handle memory management settings more clearly.
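
As an illustration of the "separate thread" suggestion in the Performance Concern, the following is a minimal sketch using `asyncio.to_thread` (Python 3.9+). `FakeBlockingClient` and its method signatures are hypothetical stand-ins and do not reflect the actual horde-sdk client API; only the method names are borrowed from the review text. The heartbeat task is there to show that the event loop keeps running while the blocking calls are in flight.

```python
import asyncio
import time


class FakeBlockingClient:
    """Hypothetical stand-in for a synchronous API client (not the horde-sdk API)."""

    def worker_details_by_name(self, name: str) -> dict:
        time.sleep(0.05)  # simulate a blocking network round trip
        return {"name": name, "maintenance_mode": True}

    def worker_modify(self, details: dict, maintenance: bool) -> dict:
        time.sleep(0.05)  # simulate a second blocking round trip
        return {**details, "maintenance_mode": maintenance}


async def remove_maintenance(client: FakeBlockingClient, worker_name: str) -> dict:
    # asyncio.to_thread runs each blocking call in a worker thread,
    # so the event loop stays free to run other tasks in the meantime.
    details = await asyncio.to_thread(client.worker_details_by_name, worker_name)
    return await asyncio.to_thread(client.worker_modify, details, False)


async def main() -> tuple[dict, int]:
    ticks = 0

    async def heartbeat() -> None:
        # Counts how often the event loop gets to run while the calls are pending.
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    beat = asyncio.create_task(heartbeat())
    result = await remove_maintenance(FakeBlockingClient(), "my-worker")
    beat.cancel()
    return result, ticks


result, ticks = asyncio.run(main())
print(result, ticks)
```

If the two client calls were invoked directly inside the coroutine instead, the heartbeat would not advance until both had returned, which is the stall the review warns about.
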

CodiumAI-Agent commented 1 month ago

Persistent review updated to latest commit https://github.com/Haidra-Org/horde-worker-reGen/commit/f89812c17a691e606061ca5eb2d603b164b5df8a
