mrheinen / lophiid

A distributed honeypot for monitoring large scale web attacks
GNU General Public License v2.0

Allow concurrent LLM lookups #70

Closed mrheinen closed 2 days ago

mrheinen commented 2 days ago

PR Type

Enhancement


Description


Changes walkthrough

Relevant files

Enhancement (5 files)

backend_main.go
Add concurrent requests parameter to LLMManager

cmd/backend/backend_main.go
  • Updated the NewLLMManager call to include the new parameter cfg.Responder.LLMConcurrentRequests
  +1/-1
main.go
Add concurrent requests to LLMManager in main

cmd/llm/main.go
  • Modified the NewLLMManager call to include a new parameter for concurrent requests (set to 5)
  +1/-1
llm_prompts.go
Refine command injection prompt

pkg/backend/responder/llm_prompts.go
  • Updated the command injection prompt to be more specific about command-line commands
  • Removed instructions about the $? substring and echo behavior
  +2/-5
llm_responder.go
Implement concurrent LLM lookups for command injection

pkg/backend/responder/llm_responder.go
  • Implemented concurrent LLM lookups for command injection responses
  • Added error handling for cases with no commands
  • Used the CompleteMultiple method from LLMManager for parallel processing
  +22/-8
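Before fanning lookups out, the responder splits the attacker-supplied payload into individual commands. The repo's actual helper is `util.SplitCommandsOnSemi`; the sketch below is a hypothetical, stdlib-only approximation of such a splitter for illustration, not the project's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// splitCommandsOnSemi is a hypothetical stand-in for the repo's
// util.SplitCommandsOnSemi helper: it splits a shell payload on
// semicolons, trims whitespace, and drops empty fragments.
func splitCommandsOnSemi(input string) []string {
	var commands []string
	for _, part := range strings.Split(input, ";") {
		if trimmed := strings.TrimSpace(part); trimmed != "" {
			commands = append(commands, trimmed)
		}
	}
	return commands
}

func main() {
	fmt.Println(splitCommandsOnSemi("id; uname -a;  ; whoami"))
	// prints: [id uname -a whoami]
}
```

Each resulting command can then be turned into its own prompt and completed independently, which is what makes the concurrent lookup worthwhile.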
llm_manager.go
Implement concurrent LLM completions in LLMManager

pkg/llm/llm_manager.go
  • Added a multiplePoolSize field to the LLMManager struct
  • Implemented the CompleteMultiple method for parallel prompt completions
  • Updated NewLLMManager to include a pool size parameter
  +30/-2
Configuration changes (4 files)

config.go
Add LLMConcurrentRequests to Responder config

pkg/backend/config.go
  • Added the new field LLMConcurrentRequests to the Responder struct
  • Set the default value for LLMConcurrentRequests to 5
  +7/-6
backend-config.yaml
Add concurrent requests config option

config/backend-config.yaml
  • Added the new configuration option llm_concurrent_requests with a default value of 5
  +2/-0
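A hedged sketch of what the added lines in config/backend-config.yaml likely look like. Only the option name `llm_concurrent_requests` and its default of 5 come from this PR; the enclosing `responder` key is an assumption based on the Responder struct named above.

```yaml
responder:
  # Maximum number of LLM completions to run in parallel (added in this PR).
  llm_concurrent_requests: 5
```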
BUILD.bazel
Update Bazel build for responder tests

pkg/backend/responder/BUILD.bazel
  • Added new dependencies for responder tests
  +6/-1

BUILD.bazel
Update Bazel build for LLM package

pkg/llm/BUILD.bazel
  • Added the new dependency @com_github_sourcegraph_conc//pool
  +1/-0
Tests (2 files)

llm_responder_test.go
Add tests for command injection responder

pkg/backend/responder/llm_responder_test.go
  • Added new test cases for the command injection responder
  • Implemented tests for single and multiple command completions
  +55/-0

llm_manager_test.go
Add tests for concurrent LLM completions

pkg/llm/llm_manager_test.go
  • Updated existing tests to include the new pool size parameter
  • Added a new test case for the CompleteMultiple method
  +29/-2
Dependencies (3 files)

deps.bzl
Add concurrent library dependency

deps.bzl
  • Added the new dependency com_github_sourcegraph_conc
  • Updated the version of org_uber_go_multierr
  +9/-2

go.mod
Add concurrent library to go.mod

go.mod
  • Added the new dependency github.com/sourcegraph/conc v0.3.0
  +1/-0

go.sum
Update go.sum with new dependency

go.sum
  • Added checksums for the new dependency github.com/sourcegraph/conc v0.3.0
  +2/-0

πŸ’‘ PR-Agent usage: Comment `/help "your question"` on any pull request to receive relevant information

    codiumai-pr-agent-pro[bot] commented 2 days ago

    PR Reviewer Guide πŸ”

    Here are some key observations to aid the review process:

    ⏱️ Estimated effort to review: 4 πŸ”΅πŸ”΅πŸ”΅πŸ”΅βšͺ
    πŸ§ͺ PR contains tests
    πŸ”’ Security concerns

    Command Injection:
    The changes to the command injection prompt in 'pkg/backend/responder/llm_prompts.go' might make it less restrictive. This could potentially allow for command injection if not properly sanitized or validated. Careful review of the prompt and its usage is necessary to ensure it doesn't introduce security vulnerabilities.
    ⚑ Recommended focus areas for review

Potential Race Condition
The concurrent execution of LLM requests might lead to race conditions when writing to the shared 'res' variable. Consider using a thread-safe approach for combining results.

Error Handling
The CompleteMultiple function returns early on the first error encountered. Consider collecting all errors and returning them along with successful results.

Prompt Security
The updated command injection prompt might be less restrictive. Ensure it doesn't allow for potential security vulnerabilities through command execution.
    github-actions[bot] commented 2 days ago

    Failed to generate code suggestions for PR

    codiumai-pr-agent-pro[bot] commented 2 days ago

    PR Code Suggestions ✨

    Explore these optional code suggestions:

    Enhancement
**Add a timeout to the parallel completion process**

Consider adding a timeout or context with deadline to the CompleteMultiple call to ensure it doesn't run indefinitely if one of the completions takes too long.

[pkg/backend/responder/llm_responder.go [49-70]](https://github.com/mrheinen/lophiid/pull/70/files#diff-c3e93e5adb1e4d4cdb55ce0b82481329eb36fe0db2b23559380252ff22c5969dR49-R70)

```diff
 func (l *LLMManager) CompleteMultiple(prompts []string) (map[string]string, error) {
+	ctx, cancel := context.WithTimeout(context.Background(), l.completionTimeout)
+	defer cancel()
+
 	result := map[string]string{}
-	p := pool.New().WithErrors().WithMaxGoroutines(l.multiplePoolSize)
+	p := pool.New().WithErrors().WithMaxGoroutines(l.multiplePoolSize).WithContext(ctx)
 	for _, prompt := range prompts {
 		p.Go(func() error {
 			localPrompt := prompt
 			ret, err := l.Complete(localPrompt)
 			if err != nil {
 				return err
 			}
 			result[localPrompt] = ret
 			return nil
 		})
 	}
 	err := p.Wait()
 	return result, err
 }
```

- [ ] **Apply this suggestion**

Suggestion importance[1-10]: 8

Why: Adding a timeout to the CompleteMultiple function is crucial for preventing potential deadlocks or long-running operations. This significantly improves the reliability and responsiveness of the system.
**Improve error handling for cases with no commands**

Consider handling the case where no commands are found more gracefully. Instead of returning early with a fallback string, you could log a warning and continue processing with an empty result.

[pkg/backend/responder/llm_responder.go [43-47]](https://github.com/mrheinen/lophiid/pull/70/files#diff-c3e93e5adb1e4d4cdb55ce0b82481329eb36fe0db2b23559380252ff22c5969dR43-R47)

```diff
 commands := util.SplitCommandsOnSemi(promptInput)
 if len(commands) == 0 {
-	slog.Debug("no commands found", slog.String("input", promptInput))
-	return strings.Replace(template, LLMReplacementTag, LLMReplacementFallbackString, 1), nil
+	slog.Warn("no commands found, continuing with empty result", slog.String("input", promptInput))
+	commands = []string{""}
 }
```

- [ ] **Apply this suggestion**

Suggestion importance[1-10]: 6

Why: The suggestion improves error handling by continuing execution with an empty command instead of returning early. This enhances robustness and provides more flexibility in handling edge cases.
    Possible issue
βœ… **Use a thread-safe map for concurrent result storage**

Consider using a sync.Map instead of a regular map for the result in CompleteMultiple to avoid potential race conditions when writing results concurrently.

[pkg/llm/llm_manager.go [49-70]](https://github.com/mrheinen/lophiid/pull/70/files#diff-079422eef64450f8a3d97a65b202d2a89a3850d6dd9282c9a8f930bd3731c26aR49-R70)

```diff
 func (l *LLMManager) CompleteMultiple(prompts []string) (map[string]string, error) {
-	result := map[string]string{}
+	var result sync.Map
 	p := pool.New().WithErrors().WithMaxGoroutines(l.multiplePoolSize)
 	for _, prompt := range prompts {
 		p.Go(func() error {
 			localPrompt := prompt
 			ret, err := l.Complete(localPrompt)
 			if err != nil {
 				return err
 			}
-			result[localPrompt] = ret
+			result.Store(localPrompt, ret)
 			return nil
 		})
 	}
 	err := p.Wait()
-	return result, err
+	finalResult := make(map[string]string)
+	result.Range(func(key, value interface{}) bool {
+		finalResult[key.(string)] = value.(string)
+		return true
+	})
+
+	return finalResult, err
 }
```

`[Suggestion has been applied]`

Suggestion importance[1-10]: 7

Why: Using a sync.Map instead of a regular map eliminates potential race conditions in concurrent operations. This enhances the thread safety and reliability of the code, especially in a multi-threaded environment.
**Improve error handling for empty or missing LLM responses**

Consider adding error handling for the case where resMap[prompt] is empty or invalid. This could help prevent potential issues if the LLM returns unexpected results.

[pkg/backend/responder/llm_responder.go [60-65]](https://github.com/mrheinen/lophiid/pull/70/files#diff-c3e93e5adb1e4d4cdb55ce0b82481329eb36fe0db2b23559380252ff22c5969dR60-R65)

```diff
 for _, prompt := range promptInputs {
 	val, ok := resMap[prompt]
-	if ok {
+	if ok && val != "" {
 		res += val
+	} else {
+		slog.Warn("Empty or missing response for prompt", slog.String("prompt", prompt))
 	}
 }
```

- [ ] **Apply this suggestion**

Suggestion importance[1-10]: 5

Why: The suggestion adds a check for empty responses and logs a warning, which improves error handling and debugging capabilities. While useful, it's a minor enhancement compared to the other suggestions.

    πŸ’‘ Need additional feedback ? start a PR chat

    codiumai-pr-agent-pro[bot] commented 2 days ago

    CI Failure Feedback 🧐

    **Action:** build
    **Failed stage:** [Build](https://github.com/mrheinen/lophiid/actions/runs/11581028370/job/32240847919) [❌]
    **Failure summary:** The action failed due to a compilation error in the Go code:
  • The error occurred in the file pkg/llm/llm_manager.go at line 50, column 16.
  • The specific error is "undefined: sync", indicating that the sync package is being used but not
    imported.
  • This compilation error prevented the successful building of the pkg/llm/llm_test.internal.a package.
  • The error occurred during the GoCompilePkg step of the build process.
  • Relevant error logs:

```
1: ##[group]Operating System
2: Ubuntu
...
574: [145 / 338] GoCompilePkg external/com_github_jackc_pgx_v5/pgtype/pgtype.a; 3s linux-sandbox ... (4 actions, 3 running)
575: [157 / 338] GoCompilePkg external/com_github_jackc_pgx_v5/pgtype/pgtype.a; 4s linux-sandbox ... (4 actions, 3 running)
576: [186 / 338] [Prepa] GoCompilePkg pkg/database/database_test~testmain.a ... (4 actions, 3 running)
577: [212 / 338] GoCompilePkg pkg/database/database.a; 0s linux-sandbox ... (4 actions, 3 running)
578: [226 / 338] GoCompilePkg external/org_golang_google_protobuf/internal/impl/impl.a; 0s linux-sandbox ... (4 actions running)
579: [234 / 338] GoCompilePkg external/org_golang_x_text/collate/collate.a; 1s linux-sandbox ... (4 actions running)
580: [257 / 338] GoCompilePkg external/org_golang_x_text/collate/collate.a; 2s linux-sandbox ... (4 actions running)
581: [277 / 338] GoCompilePkg external/org_golang_google_grpc/internal/transport/transport.a; 0s linux-sandbox ... (4 actions, 3 running)
582: ERROR: /home/runner/work/lophiid/lophiid/pkg/llm/BUILD.bazel:21:8: GoCompilePkg pkg/llm/llm_test.internal.a failed: (Exit 1): builder failed: error executing GoCompilePkg command (from target //pkg/llm:llm_test) bazel-out/k8-opt-exec-ST-d57f47055a04/bin/external/go_sdk/builder_reset/builder compilepkg -sdk external/go_sdk -installsuffix linux_amd64 -src pkg/llm/llm.go -src pkg/llm/llm_manager.go -src ... (remaining 37 arguments skipped)
583: Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
584: ##[error]pkg/llm/llm_manager.go:50:16: undefined: sync
585: compilepkg: error running subcommand external/go_sdk/pkg/tool/linux_amd64/compile: exit status 2
586: Use --verbose_failures to see the command lines of failed build steps.
587: INFO: Elapsed time: 65.047s, Critical Path: 45.59s
588: INFO: 285 processes: 59 internal, 226 linux-sandbox.
589: ERROR: Build did NOT complete successfully
590: ##[error]Process completed with exit code 1.
```
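The `undefined: sync` error indicates the applied sync.Map suggestion referenced the sync package without importing it, so the fix is simply adding `"sync"` to the import block. A minimal self-contained sketch of the sync.Map pattern with the import in place (the prompts and `upperAll` helper here are illustrative, not the repo's code):

```go
package main

import (
	"fmt"
	"strings"
	"sync" // the import the failing build was missing
)

// upperAll fans prompts out to goroutines that write into a sync.Map
// (safe for concurrent writes without extra locking), then converts back
// to a plain map the way the applied suggestion's Range loop does.
func upperAll(prompts []string) map[string]string {
	var result sync.Map
	var wg sync.WaitGroup
	for _, prompt := range prompts {
		wg.Add(1)
		go func(p string) {
			defer wg.Done()
			result.Store(p, strings.ToUpper(p)) // stand-in for an LLM completion
		}(prompt)
	}
	wg.Wait()
	final := make(map[string]string)
	result.Range(func(key, value interface{}) bool {
		final[key.(string)] = value.(string)
		return true
	})
	return final
}

func main() {
	fmt.Println(upperAll([]string{"id", "whoami"})["id"])
	// prints: ID
}
```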

✨ CI feedback usage guide:

The CI feedback tool (`/checks`) automatically triggers when a PR has a failed check. The tool analyzes the failed checks and provides several feedbacks:

- Failed stage
- Failed test name
- Failure summary
- Relevant error logs

In addition to being automatically triggered, the tool can also be invoked manually by commenting on a PR:

```
/checks "https://github.com/{repo_name}/actions/runs/{run_number}/job/{job_number}"
```

where `{repo_name}` is the name of the repository, `{run_number}` is the run number of the failed check, and `{job_number}` is the job number of the failed check.

#### Configuration options

- `enable_auto_checks_feedback` - if set to true, the tool will automatically provide feedback when a check is failed. Default is true.
- `excluded_checks_list` - a list of checks to exclude from the feedback, for example: ["check1", "check2"]. Default is an empty list.
- `enable_help_text` - if set to true, the tool will provide a help message with the feedback. Default is true.
- `persistent_comment` - if set to true, the tool will overwrite a previous checks comment with the new feedback. Default is true.
- `final_update_message` - if `persistent_comment` is true and updating a previous checks message, the tool will also create a new message: "Persistent checks updated to latest commit". Default is true.

See more information about the `checks` tool in the [docs](https://pr-agent-docs.codium.ai/tools/ci_feedback/).