mrheinen / lophiid

A distributed honeypot for monitoring large scale web attacks
GNU General Public License v2.0

Improve logging and metrics of LLM responder #62

Closed: mrheinen closed this PR 1 week ago

mrheinen commented 1 week ago

PR Type

Enhancement, Documentation


Description

- Added more verbose error logging to the LLM responder
- Added a Prometheus counter metric for failed LLM completions, with an accompanying test and Bazel build update
- Documented the scripting logging utilities in SCRIPTING.md
Changes walkthrough 📝

Relevant files:

**Error handling**
- `pkg/backend/responder/llm_responder.go`: Enhance error logging in LLM responder (+4/-0)
  - Added more verbose error logging using slog.Error()
  - Improved error handling for input-length and responder-type validation
  - Enhanced error reporting for LLM completion failures

**Enhancement**
- `pkg/llm/llm_manager.go`: Add error count metric for LLM completions (+1/-0)
  - Added a metric increment for failed LLM completions
- `pkg/llm/metrics.go`: Implement LLM error count metric (+7/-0)
  - Added the new llmErrorCount Prometheus Counter metric
  - Registered the new metric in the CreateLLMMetrics function (see the sketch after this list)

**Tests**
- `pkg/llm/llm_manager_test.go`: Add test for LLM completion error counting (+23/-0)
  - Added the new test case TestCompleteErrorCounted to verify error counting
  - Imported the errors and testutil packages

**Documentation**
- `SCRIPTING.md`: Document logging utilities in SCRIPTING.md (+26/-0)
  - Added documentation for the logging utilities
  - Described the available log levels and methods

**Configuration changes**
- `pkg/llm/BUILD.bazel`: Update Bazel build file for new test dependency (+1/-0)
  - Added a dependency on `prometheus/testutil` for testing
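Taken together, the metric and logging changes above follow a standard Go pattern. The sketch below is a reconstruction, not the PR's exact code: the identifiers llmErrorCount, llmQueryResponseTime, CreateLLMMetrics, the Inc() call, and the use of slog.Error() come from this PR's diffs, while the metric name strings, the LLMManager shape, and the log message are assumptions.

```go
package llm

import (
	"log/slog"

	"github.com/prometheus/client_golang/prometheus"
)

// LLMMetrics mirrors the struct touched in pkg/llm/metrics.go; only the
// fields this PR names are shown here.
type LLMMetrics struct {
	llmQueryResponseTime prometheus.Histogram
	llmErrorCount        prometheus.Counter
}

// CreateLLMMetrics builds the metrics and registers them on the given
// registry. The Name/Help strings are placeholders, not the PR's values.
func CreateLLMMetrics(reg prometheus.Registerer) *LLMMetrics {
	m := &LLMMetrics{
		llmQueryResponseTime: prometheus.NewHistogram(prometheus.HistogramOpts{
			Name: "llm_query_response_time", // placeholder
			Help: "Response time of LLM queries.",
		}),
		llmErrorCount: prometheus.NewCounter(prometheus.CounterOpts{
			Name: "llm_error_count", // placeholder
			Help: "Number of failed LLM completions.",
		}),
	}
	reg.MustRegister(m.llmQueryResponseTime, m.llmErrorCount)
	return m
}

// LLMClient and LLMManager are reduced stand-ins for the real types.
type LLMClient interface {
	Complete(prompt string) (string, error)
}

type LLMManager struct {
	client  LLMClient
	metrics *LLMMetrics
}

// Complete shows where the one-line llm_manager.go change lands: the
// counter is incremented on the error path before returning.
func (l *LLMManager) Complete(prompt string) (string, error) {
	out, err := l.client.Complete(prompt)
	if err != nil {
		l.metrics.llmErrorCount.Inc()
		slog.Error("LLM completion failed", slog.String("error", err.Error()))
		return "", err
	}
	return out, nil
}
```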

💡 PR-Agent usage: Comment /help "your question" on any pull request to receive relevant information

codiumai-pr-agent-pro[bot] commented 1 week ago

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 3 🔵🔵🔵⚪⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

**Error Handling**
Verify that the new error logging doesn't expose sensitive information in the logs, especially in the promptInput.

**Metric Increment**
Ensure that the new metric increment for failed LLM completions is placed correctly and doesn't interfere with other error handling or logging.

**Test Coverage**
Check whether the new test case `TestCompleteErrorCounted` covers all scenarios for error counting, including edge cases; a reconstruction of the new test is sketched below.
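For the Test Coverage point, the new test can be reconstructed from the diff shown in the code suggestions further down. This is a sketch, not a verbatim copy: it assumes the package's existing MockLLMClient, NewLLMManager, and util.NewStringMapCache helpers shown in that diff, and the util import path is a guess.

```go
package llm

import (
	"errors"
	"testing"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/testutil"

	"lophiid/pkg/util" // import path is an assumption
)

// TestCompleteErrorCounted verifies that a failed completion bumps
// llmErrorCount exactly once (reconstructed from the PR diff).
func TestCompleteErrorCounted(t *testing.T) {
	client := MockLLMClient{CompletionToReturn: "completion", ErrorToReturn: errors.New("beh")}
	pCache := util.NewStringMapCache[string]("", time.Second)
	pReg := prometheus.NewRegistry()
	metrics := CreateLLMMetrics(pReg)
	lm := NewLLMManager(&client, pCache, metrics, time.Hour)

	if _, err := lm.Complete("aaaa"); err == nil {
		t.Errorf("expected error")
	}

	if count := testutil.ToFloat64(metrics.llmErrorCount); count != 1 {
		t.Errorf("expected 1 error, got %f", count)
	}
}
```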
github-actions[bot] commented 1 week ago

Failed to generate code suggestions for PR

codiumai-pr-agent-pro[bot] commented 1 week ago

PR Code Suggestions ✨

Explore these optional code suggestions:
**Enhancement**

Track successful completions to provide a more comprehensive view of LLM performance

Consider adding a metric to track successful LLM completions alongside the error count. This will provide a more comprehensive view of the LLM's performance.

[pkg/llm/llm_manager.go [58]](https://github.com/mrheinen/lophiid/pull/62/files#diff-079422eef64450f8a3d97a65b202d2a89a3850d6dd9282c9a8f930bd3731c26aR58-R58)

```diff
 l.metrics.llmErrorCount.Inc()
+l.metrics.llmSuccessCount.Inc()
```

- [ ] **Apply this suggestion**

Suggestion importance [1-10]: 8

Why: This suggestion is highly valuable as it introduces a metric for successful completions, which complements the existing error count. This provides a more comprehensive view of the LLM's performance and can aid in monitoring and debugging.
Add a success counter metric to provide a more complete picture of LLM performance

Consider adding a success counter metric to complement the error counter, providing a more complete picture of LLM performance.

[pkg/llm/metrics.go [25-28]](https://github.com/mrheinen/lophiid/pull/62/files#diff-2a2c8d4e9325b1f0ca7bc312f1034e9252b79b36983a74583086b898c9f42879R25-R28)

```diff
 type LLMMetrics struct {
 	llmQueryResponseTime prometheus.Histogram
 	llmErrorCount        prometheus.Counter
+	llmSuccessCount      prometheus.Counter
 }
```

- [ ] **Apply this suggestion**

Suggestion importance [1-10]: 8

Why: This suggestion aligns well with the previous one and is crucial for implementing comprehensive LLM performance monitoring. Adding a success counter metric provides a balanced view of both successful and failed LLM operations.
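If both suggestions are applied, the new field also has to be created and registered inside CreateLLMMetrics, which neither diff shows. A minimal sketch, mirroring how llmErrorCount is registered in this PR; the helper, metric name, and help text are assumptions:

```go
package llm

import "github.com/prometheus/client_golang/prometheus"

// newSuccessCounter is a hypothetical helper showing the registration step
// that would accompany the llmSuccessCount field; the metric name is a
// placeholder, not from the PR.
func newSuccessCounter(reg prometheus.Registerer) prometheus.Counter {
	c := prometheus.NewCounter(prometheus.CounterOpts{
		Name: "llm_success_count",
		Help: "Number of successful LLM completions.",
	})
	reg.MustRegister(c)
	return c
}
```

Note that the success path in Complete would then call l.metrics.llmSuccessCount.Inc() after a completion returns without error, rather than next to the error counter as the raw diff in the first suggestion might imply.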
Add a test case for successful LLM completion to ensure comprehensive test coverage

Consider adding a test case for successful LLM completion so that both success and error scenarios are covered.

[pkg/llm/llm_manager_test.go [49-68]](https://github.com/mrheinen/lophiid/pull/62/files#diff-e03782aea4d40f7c5571a29769371132bccee06d8d77503a54ae4374e6543489R49-R68)

```diff
 func TestCompleteErrorCounted(t *testing.T) {
+	// ... (existing test case)
+}
+
+func TestCompleteSuccessCounted(t *testing.T) {
 	testCompletionString := "completion"
-	client := MockLLMClient{CompletionToReturn: testCompletionString, ErrorToReturn: errors.New("beh")}
+	client := MockLLMClient{CompletionToReturn: testCompletionString, ErrorToReturn: nil}
 	pCache := util.NewStringMapCache[string]("", time.Second)
 	pReg := prometheus.NewRegistry()
 	metrics := CreateLLMMetrics(pReg)
 	lm := NewLLMManager(&client, pCache, metrics, time.Hour)

 	_, err := lm.Complete("aaaa")
-	if err == nil {
-		t.Errorf("expected error")
+	if err != nil {
+		t.Errorf("unexpected error: %v", err)
 	}

-	count := testutil.ToFloat64(metrics.llmErrorCount)
+	count := testutil.ToFloat64(metrics.llmSuccessCount)
 	if count != 1 {
-		t.Errorf("expected 1 error, got %f", count)
+		t.Errorf("expected 1 success, got %f", count)
 	}
 }
```

- [ ] **Apply this suggestion**

Suggestion importance [1-10]: 7

Why: Adding a test case for successful LLM completion is good practice to ensure comprehensive test coverage. It complements the existing error-case test and helps validate the correct behavior of the LLM manager in both success and failure scenarios.
**Best practice**

Use structured logging for better log analysis and consistency

Consider using structured logging with fields instead of string formatting for the error message. This will make it easier to parse and analyze logs.

[pkg/backend/responder/llm_responder.go [27]](https://github.com/mrheinen/lophiid/pull/62/files#diff-c3e93e5adb1e4d4cdb55ce0b82481329eb36fe0db2b23559380252ff22c5969dR27-R27)

```diff
-return "", fmt.Errorf("input too long (size: %d)", len(promptInput))
+return "", fmt.Errorf("input too long")
```

- [ ] **Apply this suggestion**

Suggestion importance [1-10]: 5

Why: The suggestion to use structured logging is valid and can improve log analysis. However, the proposed change removes useful information (the input size) from the error message, which is not ideal.
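For reference, a variant that keeps the size but moves it into structured fields, using the standard log/slog package this PR already uses in the responder. The surrounding helper, constant, and limit value are hypothetical; only promptInput comes from the PR.

```go
package responder

import (
	"errors"
	"log/slog"
)

// maxPromptInput is a placeholder; the real limit lives elsewhere in the responder.
const maxPromptInput = 4096

// validatePromptInput sketches the structured-logging variant: the size is
// attached as a slog attribute, so the error string stays stable while the
// log line remains machine-parseable.
func validatePromptInput(promptInput string) error {
	if len(promptInput) > maxPromptInput {
		slog.Error("input too long", slog.Int("size", len(promptInput)))
		return errors.New("input too long")
	}
	return nil
}
```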

💡 Need additional feedback? Start a PR chat