mrheinen / lophiid

A distributed honeypot for monitoring large scale web attacks
GNU General Public License v2.0

Refactor session management and store sessions in cache and database #73

Closed mrheinen closed 1 day ago

mrheinen commented 1 day ago

User description

Until now, sessions weren't a real entity in lophiid; they were inferred from database records. This change makes sessions a real thing: they are stored in the database and in a memory cache. For performance reasons, a session is written to the database only on creation and updated only when the session completes (on timeout).

Maybe in the future we can actually update the sessions in the database periodically and add some session information to the UI as well.
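The write pattern described above (persist on creation, serve from memory while the session is active, persist again on completion) can be sketched as follows. This is a minimal illustration, not lophiid's implementation: the `Store` interface, `countingStore`, `Manager`, and the fields of `Session` are all hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// Session is a simplified, hypothetical session record (not lophiid's struct).
type Session struct {
	ID        int64
	IP        string
	Active    bool
	StartedAt time.Time
	EndedAt   time.Time
}

// Store stands in for the database layer; the interface is an assumption.
type Store interface {
	Insert(s *Session) error
	Update(s *Session) error
}

// countingStore is a fake Store that only counts writes.
type countingStore struct {
	nextID           int64
	inserts, updates int
}

func (c *countingStore) Insert(s *Session) error {
	c.nextID++
	s.ID = c.nextID
	c.inserts++
	return nil
}

func (c *countingStore) Update(s *Session) error {
	c.updates++
	return nil
}

// Manager keeps active sessions in memory, keyed by source IP.
type Manager struct {
	mu     sync.Mutex
	active map[string]*Session
	store  Store
}

func NewManager(store Store) *Manager {
	return &Manager{active: make(map[string]*Session), store: store}
}

// StartSession persists the new session once and then caches it.
func (m *Manager) StartSession(ip string) (*Session, error) {
	s := &Session{IP: ip, Active: true, StartedAt: time.Now().UTC()}
	if err := m.store.Insert(s); err != nil {
		return nil, fmt.Errorf("inserting session for %s: %w", ip, err)
	}
	m.mu.Lock()
	m.active[ip] = s
	m.mu.Unlock()
	return s, nil
}

// GetCachedSession reads only the in-memory cache, never the store.
func (m *Manager) GetCachedSession(ip string) (*Session, bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	s, ok := m.active[ip]
	return s, ok
}

// EndSession evicts the session and writes its final state back once.
func (m *Manager) EndSession(ip string) error {
	m.mu.Lock()
	s, ok := m.active[ip]
	delete(m.active, ip)
	m.mu.Unlock()
	if !ok {
		return fmt.Errorf("no active session for %s", ip)
	}
	s.Active = false
	s.EndedAt = time.Now().UTC()
	return m.store.Update(s)
}

func main() {
	store := &countingStore{}
	mgr := NewManager(store)
	if _, err := mgr.StartSession("192.0.2.10"); err != nil {
		panic(err)
	}
	_, cached := mgr.GetCachedSession("192.0.2.10")
	if err := mgr.EndSession("192.0.2.10"); err != nil {
		panic(err)
	}
	fmt.Println(cached, store.inserts, store.updates) // true 1 1
}
```

The trade-off in this shape is that anything recorded between the two writes lives only in memory, so a crash loses it; that is presumably why the PR also cleans up stale sessions at startup.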


PR Type

Enhancement, Bug fix


Description


Changes walkthrough 📝

Relevant files

Enhancement (7 files)

backend_main.go: Integrate session manager in backend main
cmd/backend/backend_main.go (+21/-1)
  • Added session manager initialization and cleanup of stale sessions
  • Updated backend server creation to include session manager

backend.go: Refactor backend to use session manager
pkg/backend/backend.go (+26/-41)
  • Removed ruleVsCache and related methods
  • Added sessionMgr to BackendServer struct
  • Updated GetMatchedRule to use session manager
  • Added UpdateSessionWithRule method

rule_cache.go: Remove rule cache implementation
pkg/backend/rule_cache.go (+0/-76)
  • Entire file removed

session.go: Implement session management
pkg/backend/session/session.go (+139/-0)
  • Implemented SessionManager interface
  • Added DatabaseSessionManager struct and methods

database.go: Add session support to database
pkg/database/database.go (+41/-6)
  • Added Session struct and related methods
  • Updated DatabaseClient interface

string_map_cache.go: Enhance string map cache with callback
pkg/util/string_map_cache.go (+16/-0)
  • Added `StartWithCallback` method to `StringMapCache`

database.sql: Add session table to database schema
config/database.sql (+10/-0)
  • Added `session` table definition

Tests (2 files)

backend_test.go: Update backend tests for session management
pkg/backend/backend_test.go (+26/-18)
  • Updated test cases to include session manager
  • Removed tests related to removed ruleVsCache

session_test.go: Add tests for session manager
pkg/backend/session/session_test.go (+94/-0)
  • Added tests for session manager functionality

Configuration changes (4 files)

config.go: Remove attack tracking duration from config
pkg/backend/config.go (+0/-1)
  • Removed `AttackTrackingDuration` from config

backend-config.yaml: Update backend configuration
config/backend-config.yaml (+0/-2)
  • Removed `attack_tracking_duration` configuration

BUILD.bazel: Update Bazel build for backend changes
pkg/backend/BUILD.bazel (+2/-2)
  • Removed rule_cache.go and rule_cache_test.go
  • Added session dependency

BUILD.bazel: Add Bazel build for session package
pkg/backend/session/BUILD.bazel (+19/-0)
  • Added new Bazel build file for session package


codiumai-pr-agent-pro[bot] commented 1 day ago

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
🧪 PR contains tests
🔒 No security concerns identified
⚡ Recommended focus areas for review

Potential Race Condition
The `UpdateSessionWithRule` method updates the session state without apparent synchronization. This could lead to race conditions if multiple goroutines access the same session concurrently.

Error Handling
The `CleanupStaleSessions` method continues processing even if there's an error ending a session. This might lead to inconsistent state or missed cleanup for some sessions.

Potential Memory Leak
The `RuleIDsServed` map in the `Session` struct is initialized but never cleaned up. This could potentially lead to memory growth over time if sessions accumulate a large number of served rules.
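
One way to address the race-condition concern is to guard the per-session map with a mutex. The sketch below is an assumption-laden illustration: the real `Session` struct and the value type of `RuleIDsServed` are not shown in this thread, and `MarkRuleServed`, `HasServedRule`, `ServedCount`, and `markConcurrently` are hypothetical names.

```go
package main

import (
	"fmt"
	"sync"
)

// Session is a hypothetical stand-in; the int64 key and value types are assumed.
type Session struct {
	mu            sync.RWMutex
	RuleIDsServed map[int64]int64
}

func NewSession() *Session {
	return &Session{RuleIDsServed: make(map[int64]int64)}
}

// MarkRuleServed takes the write lock because it mutates the map.
func (s *Session) MarkRuleServed(ruleID int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	s.RuleIDsServed[ruleID]++
}

// HasServedRule only reads, so a shared read lock is enough.
func (s *Session) HasServedRule(ruleID int64) bool {
	s.mu.RLock()
	defer s.mu.RUnlock()
	_, ok := s.RuleIDsServed[ruleID]
	return ok
}

// ServedCount returns how often a rule was served in this session.
func (s *Session) ServedCount(ruleID int64) int64 {
	s.mu.RLock()
	defer s.mu.RUnlock()
	return s.RuleIDsServed[ruleID]
}

// markConcurrently marks the same rule from n goroutines and waits for all of them.
func markConcurrently(s *Session, ruleID int64, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.MarkRuleServed(ruleID)
		}()
	}
	wg.Wait()
}

func main() {
	s := NewSession()
	markConcurrently(s, 42, 100)
	fmt.Println(s.ServedCount(42)) // 100
}
```

Without the lock, concurrent writes to a Go map are a data race and can abort the program at runtime; running the tests with `go test -race` is the usual way to confirm whether the real code is affected.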

github-actions[bot] commented 1 day ago

Failed to generate code suggestions for PR

codiumai-pr-agent-pro[bot] commented 1 day ago

PR Code Suggestions ✨

Explore these optional code suggestions:

Category / Suggestion / Score

Possible issue
Add error handling for cache storage operation in session management

**Consider adding error handling for the d.activeSessions.Store(ip, retSession) call in the StartSession method. If this operation fails, it could lead to inconsistencies between the database and the cache.**

[pkg/backend/session/session.go [137-139]](https://github.com/mrheinen/lophiid/pull/73/files#diff-f6cfd2239c48d34a79a86f35c0074d4672fe9ef6ad474cc61104ab339c4192d5R137-R139)

```diff
 retSession := dm.(*database.Session)
-d.activeSessions.Store(ip, retSession)
+if err := d.activeSessions.Store(ip, retSession); err != nil {
+	return nil, fmt.Errorf("error storing session in cache: %w", err)
+}
 return retSession, nil
```

Suggestion importance[1-10]: 8
Why: This suggestion addresses a potential bug by adding error handling for a critical operation. It prevents silent failures and improves reliability.
Score: 8

Best practice

✅ Remove trailing comma in SQL table definition for better compatibility

Suggestion Impact: The trailing comma was removed from the last column definition in the 'session' table creation statement, as suggested. However, an additional column 'ended_at' was also added.

code diff:

```diff
-  updated_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now())),
+  ended_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now())),
+  updated_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now()))
```

**Remove the trailing comma after the last column definition in the 'session' table creation statement to ensure compatibility with all SQL databases.**

[config/database.sql [146-153]](https://github.com/mrheinen/lophiid/pull/73/files#diff-002629893f0a5eb6f8e9a6d036047c92784e918c0afc132f6992827aae6121bcR146-R153)

```diff
 CREATE TABLE session (
   id SERIAL PRIMARY KEY,
   active BOOL DEFAULT FALSE,
   ip VARCHAR(52) NOT NULL,
   started_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now())),
   created_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now())),
-  updated_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now())),
+  updated_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now()))
 );
```

Suggestion importance[1-10]: 8
Why: Removing the trailing comma after the last column definition is crucial for ensuring compatibility across all SQL databases. This small change can prevent potential syntax errors and improve the portability of the database schema.
Score: 8

Use context for graceful shutdown in cache management goroutine

**Consider using a context for graceful shutdown instead of a separate channel. This would allow for more consistent cancellation patterns across the application.**

[pkg/util/string_map_cache.go [171-184]](https://github.com/mrheinen/lophiid/pull/73/files#diff-0ab41f6168af62cb8c68239c7c78cd0e807513c6264c91521b0f218500cbdb17R171-R184)

```diff
-func (r *StringMapCache[T]) StartWithCallback(callback func(T) bool) {
+func (r *StringMapCache[T]) StartWithCallback(ctx context.Context, callback func(T) bool) {
 	ticker := time.NewTicker(time.Minute * 1)
 	go func() {
+		defer ticker.Stop()
 		for {
 			select {
-			case <-r.bgChan:
-				ticker.Stop()
+			case <-ctx.Done():
 				return
 			case <-ticker.C:
 				r.CleanExpiredWithCallback(callback)
 			}
 		}
 	}()
 }
```

Suggestion importance[1-10]: 7
Why: This suggestion improves the shutdown mechanism and aligns with Go best practices. It's a valuable improvement for resource management and consistency.
Score: 7
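
The context-based shutdown pattern from this suggestion can be tried in isolation. The following is a self-contained sketch rather than the `StringMapCache` code itself: `startCleaner` and `runFor` are hypothetical helpers, and the returned `done` channel is an addition so the caller can wait for the goroutine to exit.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
	"time"
)

// startCleaner calls clean on every tick until ctx is cancelled.
// The returned channel is closed once the goroutine has exited.
func startCleaner(ctx context.Context, interval time.Duration, clean func()) <-chan struct{} {
	done := make(chan struct{})
	ticker := time.NewTicker(interval)
	go func() {
		defer close(done)
		defer ticker.Stop()
		for {
			select {
			case <-ctx.Done():
				return
			case <-ticker.C:
				clean()
			}
		}
	}()
	return done
}

// runFor runs the cleaner for totalMs milliseconds with the given tick
// interval, cancels it, waits for shutdown, and reports how often clean ran.
func runFor(totalMs, intervalMs int) int64 {
	var calls int64
	ctx, cancel := context.WithCancel(context.Background())
	done := startCleaner(ctx, time.Duration(intervalMs)*time.Millisecond, func() {
		atomic.AddInt64(&calls, 1)
	})
	time.Sleep(time.Duration(totalMs) * time.Millisecond)
	cancel()
	<-done // the goroutine has stopped; no further clean calls can happen
	return atomic.LoadInt64(&calls)
}

func main() {
	fmt.Println("clean calls:", runFor(60, 5))
}
```

The exact number of calls depends on scheduling, so the example prints it rather than asserting a fixed value.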

Performance

✅ Add an index to improve query performance on frequently accessed columns

Suggestion Impact: The commit implemented an index on the 'session' table that includes the 'ip' column, along with 'started_at' and 'active' columns.

code diff:

```diff
+CREATE INDEX session_ip ON session (
+  started_at desc,
+  active,
+  ip
+);
```

**Consider adding an index on the 'ip' column of the 'session' table to improve query performance when searching or filtering by IP address.**

[config/database.sql [146-153]](https://github.com/mrheinen/lophiid/pull/73/files#diff-002629893f0a5eb6f8e9a6d036047c92784e918c0afc132f6992827aae6121bcR146-R153)

```diff
 CREATE TABLE session (
   id SERIAL PRIMARY KEY,
   active BOOL DEFAULT FALSE,
   ip VARCHAR(52) NOT NULL,
   started_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now())),
   created_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now())),
   updated_at TIMESTAMP NOT NULL DEFAULT (timezone('utc', now())),
 );
 
+CREATE INDEX idx_session_ip ON session(ip);
+
```

Suggestion importance[1-10]: 7
Why: Adding an index on the 'ip' column can significantly improve query performance for operations involving IP addresses, which are likely to be frequent in a session table. This is a valuable optimization for database efficiency.
Score: 7

Maintainability

Refactor test setup to use a helper function for creating BackendServer instances

**Consider using a test helper function to create the BackendServer instance with common parameters. This would reduce code duplication and make the tests more maintainable.**

[pkg/backend/backend_test.go [200-203]](https://github.com/mrheinen/lophiid/pull/73/files#diff-841290eccbd676c43597489825cc8f1628359283c634321e496e76e31f6fed9cR200-R203)

```diff
 fSessionMgr := session.NewDatabaseSessionManager(fdbc, time.Hour)
 fIpMgr := analysis.FakeIpEventManager{}
 fakeRes := &responder.FakeResponder{}
-b := NewBackendServer(fdbc, bMetrics, &fakeJrunner, alertManager, &vt.FakeVTManager{}, &whoisManager, &queryRunner, &fakeLimiter, &fIpMgr, fakeRes, fSessionMgr, GetDefaultBackendConfig())
+b := createTestBackendServer(fdbc, fSessionMgr, &fIpMgr, fakeRes)
 
+// Helper function (defined elsewhere in the test file):
+// func createTestBackendServer(fdbc *database.FakeDatabaseClient, fSessionMgr session.SessionManager, fIpMgr *analysis.FakeIpEventManager, fakeRes *responder.FakeResponder) *BackendServer {
+//   return NewBackendServer(fdbc, bMetrics, &fakeJrunner, alertManager, &vt.FakeVTManager{}, &whoisManager, &queryRunner, &fakeLimiter, fIpMgr, fakeRes, fSessionMgr, GetDefaultBackendConfig())
+// }
+
```

Suggestion importance[1-10]: 6
Why: This suggestion improves code maintainability by reducing duplication in test setup. It's a good practice but not critical for functionality.
Score: 6

Extract session cleanup logic into a separate function for improved code organization

**Consider extracting the session cleanup logic into a separate function for better readability and maintainability. This would also make it easier to test the cleanup process independently.**

[cmd/backend/backend_main.go [201-217]](https://github.com/mrheinen/lophiid/pull/73/files#diff-aea9d0ad7326cd6083f662c484917c42ab9ff66183cbe981873c671cedc76b7cR201-R217)

```diff
-slog.Info("Cleaning up any stale sessions")
-totalSessionsCleaned := 0
-
-for {
-	cnt, err := sessionMgr.CleanupStaleSessions(50)
-	if err != nil {
-		slog.Error("Error cleaning up sessions: %s", err)
-		return
+func cleanupStaleSessions(sessionMgr *session.DatabaseSessionManager) (int, error) {
+	totalSessionsCleaned := 0
+	for {
+		cnt, err := sessionMgr.CleanupStaleSessions(50)
+		if err != nil {
+			return totalSessionsCleaned, fmt.Errorf("error cleaning up sessions: %w", err)
+		}
+		totalSessionsCleaned += cnt
+		if cnt < 50 {
+			break
+		}
 	}
-
-	totalSessionsCleaned += cnt
-	if cnt < 50 {
-		break
-	}
+	return totalSessionsCleaned, nil
 }
 
+// In main():
+slog.Info("Cleaning up any stale sessions")
+totalSessionsCleaned, err := cleanupStaleSessions(sessionMgr)
+if err != nil {
+	slog.Error("Error during session cleanup: %s", err)
+	return
+}
 slog.Info("Cleaned up %d stale sessions", totalSessionsCleaned)
```

Suggestion importance[1-10]: 5
Why: This suggestion enhances code readability and maintainability. While beneficial, it's not addressing a critical issue or bug.
Score: 5
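
A side note on the logging calls in the diff above: the `log/slog` functions take a message followed by key-value attributes, not a printf-style format string, so verbs such as `%s` and `%d` in the message are not expanded. A hedged sketch of the attribute form follows; `render` and `hasField` are hypothetical helpers that exist only so the output can be inspected, and the attribute keys are arbitrary.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"log/slog"
	"strings"
)

// render logs one cleanup outcome to an in-memory text handler and
// returns what was written. An empty errMsg means the cleanup succeeded.
func render(total int, errMsg string) string {
	var buf bytes.Buffer
	logger := slog.New(slog.NewTextHandler(&buf, nil))
	if errMsg != "" {
		err := errors.New(errMsg)
		// Attach the error as an attribute instead of formatting it into the message.
		logger.Error("error cleaning up sessions", slog.Any("error", err))
	} else {
		logger.Info("cleaned up stale sessions", slog.Int("count", total))
	}
	return buf.String()
}

// hasField reports whether the rendered log line contains the given fragment.
func hasField(out, field string) bool {
	return strings.Contains(out, field)
}

func main() {
	fmt.Print(render(3, ""))
	fmt.Print(render(0, "connection refused"))
}
```

Each printed line starts with a timestamp, followed by the level, the message, and the attributes.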


mrheinen commented 1 day ago

/add_docs

codiumai-pr-agent-pro[bot] commented 1 day ago

The docstring command only supports the following languages: python, java, cpp, javascript, typescript, jsx, tsx, csharp

github-actions[bot] commented 1 day ago

Generating Documentation...

mrheinen commented 1 day ago

/ask "Make suggestions based on compliance with the Golang style guide at https://golang.org/doc/effective_go.html"

codiumai-pr-agent-pro[bot] commented 1 day ago

Ask❓

Make suggestions based on compliance with the Golang style guide at https://golang.org/doc/effective_go.html

Answer:

Based on the Golang style guide and the provided PR diff, here are some suggestions for improvement:

1. Naming Conventions:
  • The SessionManager interface and DatabaseSessionManager struct follow the recommended naming conventions for interfaces and types.
  • Function names like CleanupStaleSessions, GetCachedSession, and UpdateCachedSession follow the MixedCaps naming convention as recommended.

2. Error Handling:
  • The error handling in the new code follows the Go convention of returning errors as the last return value.
  • Consider using the errors.Wrap() function from the github.com/pkg/errors package to add more context to errors, especially in the DatabaseSessionManager methods.

3. Comments:
  • The package-level comment for the session package is good, but consider adding more detailed comments for exported functions and types.
  • Add comments to explain the purpose of each method in the SessionManager interface.

4. Concurrency:
  • The use of StringMapCache for managing active sessions is a good approach for concurrent access.
  • Consider using sync.RWMutex for any operations that don't modify the session data to allow multiple concurrent reads.

5. Error Wrapping:
  • In functions like GetCachedSession and StartSession, consider wrapping errors with more context, e.g., fmt.Errorf("failed to fetch session: %w", err).

6. Consistency:
  • Ensure consistent use of error message formatting. For example, in CleanupStaleSessions, use fmt.Errorf("error fetching session: %w", err) instead of string concatenation.

7. Logging:
  • The use of slog for logging is good. Consider adding more log levels (e.g., Debug, Info) for better log management in production.

8. Testing:
  • The addition of session_test.go is great. Ensure comprehensive test coverage for all new methods in the DatabaseSessionManager.

9. Configuration:
  • Consider making the session timeout configurable through a configuration file or environment variable, rather than hardcoding it in the NewDatabaseSessionManager function.

10. Documentation:
  • Add godoc comments to the SessionManager interface and DatabaseSessionManager struct to explain their purpose and usage.

Overall, the new session management code follows many good Go practices. Implementing these suggestions will further improve its compliance with the Go style guide and make the code more maintainable and robust.
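
The error-wrapping advice in items 5 and 6 can be illustrated with the standard library alone. In this sketch, `ErrSessionNotFound`, `lookup`, `getSession`, and `isNotFound` are hypothetical names, and a plain map stands in for the session cache; none of them are taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrSessionNotFound is a hypothetical sentinel error for this example.
var ErrSessionNotFound = errors.New("session not found")

// lookup is the low-level call that fails with the sentinel error.
func lookup(cache map[string]string, ip string) (string, error) {
	s, ok := cache[ip]
	if !ok {
		return "", ErrSessionNotFound
	}
	return s, nil
}

// getSession adds context with %w, which keeps the original error in the chain.
func getSession(cache map[string]string, ip string) (string, error) {
	s, err := lookup(cache, ip)
	if err != nil {
		return "", fmt.Errorf("failed to fetch session for %s: %w", ip, err)
	}
	return s, nil
}

// isNotFound shows that callers can still match the wrapped sentinel.
func isNotFound(err error) bool {
	return errors.Is(err, ErrSessionNotFound)
}

func main() {
	_, err := getSession(map[string]string{}, "10.0.0.1")
	fmt.Println(err)             // failed to fetch session for 10.0.0.1: session not found
	fmt.Println(isNotFound(err)) // true
}
```

Wrapping with `%w` rather than `%v` is what allows `errors.Is` to see through the added context.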

github-actions[bot] commented 1 day ago

Preparing answer...

codiumai-pr-agent-pro[bot] commented 1 day ago

CI Failure Feedback 🧐

**Action:** build
**Failed stage:** [Build](https://github.com/mrheinen/lophiid/actions/runs/11596668123/job/32288264490) [❌]
**Failure summary:** The action failed due to a Go compilation error in the file pkg/util/string_map_cache.go. Specifically:
  • The error occurred on lines 41 and 42 of the file.
  • The compiler reported that "T does not satisfy comparable" in both instances.
  • This suggests that a generic type T is being used in a context where it needs to be comparable (e.g., as a map key), but the type constraint doesn't ensure that T is comparable.
  • The error prevented the successful compilation of the pkg/util:util_test target.

Relevant error logs:

```yaml
1: ##[group]Operating System
2: Ubuntu
...
569: Analyzing: 39 targets (249 packages loaded, 11087 targets configured)
570: [8 / 30] GoToolchainBinaryBuild external/go_sdk/builder [for tool]; 19s linux-sandbox
571: Analyzing: 39 targets (286 packages loaded, 11604 targets configured)
572: [38 / 198] GoToolchainBinaryBuild external/go_sdk/builder [for tool]; 20s linux-sandbox
573: INFO: Analyzed 39 targets (299 packages loaded, 11750 targets configured).
574: [76 / 347] GoStdlib external/io_bazel_rules_go/stdlib_/pkg; 0s linux-sandbox
575: [76 / 347] GoStdlib external/io_bazel_rules_go/stdlib_/pkg; 10s linux-sandbox
576: [77 / 347] [Prepa] GoCompilePkg pkg/util/constants/constants.a ... (3 actions, 0 running)
577: ERROR: /home/runner/work/lophiid/lophiid/pkg/util/BUILD.bazel:20:8: GoCompilePkg pkg/util/util_test.internal.a failed: (Exit 1): builder failed: error executing GoCompilePkg command (from target //pkg/util:util_test) bazel-out/k8-opt-exec-ST-d57f47055a04/bin/external/go_sdk/builder_reset/builder compilepkg -sdk external/go_sdk -installsuffix linux_amd64 -src pkg/util/command.go -src pkg/util/contains.go -src ... (remaining 45 arguments skipped)
578: Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
579: ##[error]pkg/util/string_map_cache.go:41:83: T does not satisfy comparable
580: ##[error]pkg/util/string_map_cache.go:42:25: T does not satisfy comparable
581: compilepkg: error running subcommand external/go_sdk/pkg/tool/linux_amd64/compile: exit status 2
582: Use --verbose_failures to see the command lines of failed build steps.
583: INFO: Elapsed time: 53.960s, Critical Path: 37.83s
584: INFO: 90 processes: 62 internal, 28 linux-sandbox.
585: ERROR: Build did NOT complete successfully
586: ##[error]Process completed with exit code 1.
```
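
The compiler message "T does not satisfy comparable" typically appears when a generic type declared with a loose constraint such as `any` instantiates another generic that requires `comparable`. The actual contents of `string_map_cache.go` are not shown in this thread, so the following is only an illustrative reproduction with hypothetical `entry` and `Cache` types.

```go
package main

import "fmt"

// entry requires comparable because Contains compares values with ==.
type entry[T comparable] struct {
	value T
}

// Declaring this type as `Cache[T any]` would fail to compile with
// "T does not satisfy comparable", because entry[T] needs the stricter
// constraint. Propagating `comparable` to the outer type resolves it.
type Cache[T comparable] struct {
	entries map[string]entry[T]
}

func NewCache[T comparable]() *Cache[T] {
	return &Cache[T]{entries: make(map[string]entry[T])}
}

// Store saves a value under the given key.
func (c *Cache[T]) Store(key string, v T) {
	c.entries[key] = entry[T]{value: v}
}

// Contains reports whether any stored value equals v.
func (c *Cache[T]) Contains(v T) bool {
	for _, e := range c.entries {
		if e.value == v {
			return true
		}
	}
	return false
}

func main() {
	c := NewCache[string]()
	c.Store("k1", "a")
	fmt.Println(c.Contains("a"), c.Contains("b")) // true false
}
```

The alternative fix is to drop the requirement on the inner type (for example by not comparing values with `==`), which keeps the outer type usable with non-comparable element types.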
