mrheinen / lophiid

A distributed honeypot for monitoring large scale web attacks
GNU General Public License v2.0
6 stars 1 forks source link

Add decoding of unicode payloads in UI and backend #46

Closed mrheinen closed 1 month ago

mrheinen commented 1 month ago

User description

Also store the decoded data in the database and make all metadata (from all decoders) available in the Javascript content scripts.


PR Type

Enhancement, Documentation


Description


Changes walkthrough ๐Ÿ“

Relevant files
Enhancement
13 files
server.go
Integrate ExtractorCollection in API server                           

pkg/api/server.go
  • Added import for extractors package
  • Modified HandleUpsertSingleContent function to include
    ExtractorCollection
  • Updated RunScript call to include ExtractorCollection parameter
  • +3/-1     
    backend.go
    Implement ExtractorCollection in backend processing           

    pkg/backend/backend.go
  • Added ExtractorCollection to ReqQueueEntry struct
  • Modified HandleProbe function to use ExtractorCollection
  • Updated ProcessRequest function to include ExtractorCollection
    parameter
  • Refactored metadata extraction and processing logic
  • +38/-51 
    base64_extractor.go
    Refactor Base64Extractor for consistency                                 

    pkg/backend/extractors/base64_extractor.go
  • Moved isASCII function to util package
  • Updated import statements
  • Changed metaType constant to use constants package
  • +4/-13   
    extractor_collection.go
    Add ExtractorCollection for centralized extraction             

    pkg/backend/extractors/extractor_collection.go
  • Implemented ExtractorCollection struct and methods
  • Added functions for parsing requests and iterating metadata
  • +68/-0   
    unicode_extractor.go
    Implement UnicodeExtractor for decoding Unicode strings   

    pkg/backend/extractors/unicode_extractor.go
  • Implemented UnicodeExtractor struct and methods
  • Added functions for decoding and extracting Unicode strings
  • +120/-0 
    goja.go
    Integrate ExtractorCollection in JavaScript runner             

    pkg/javascript/goja.go
  • Added ExtractorCollection parameter to RunScript method
  • Updated Util struct to include Context field
  • Modified JavascriptRunner interface
  • +8/-4     
    wrappers.go
    Add RequestContext for metadata access in JavaScript         

    pkg/javascript/wrappers.go
  • Added RequestContext struct
  • Implemented methods for accessing request metadata
  • +22/-0   
    shared_constants.go
    Add extractor type constants                                                         

    pkg/util/constants/shared_constants.go - Added constants for extractor types
    +9/-0     
    general.go
    Add utility function for ASCII string check                           

    pkg/util/general.go - Added IsStringASCII function
    +11/-0   
    RawHttpCard.vue
    Add Unicode decoding feature to RawHttpCard                           

    ui/src/components/cards/RawHttpCard.vue
  • Added Unicode decoding functionality
  • Updated template to include Unicode decode button
  • +13/-5   
    RequestView.vue
    Implement Unicode metadata display in RequestView               

    ui/src/components/container/RequestView.vue
  • Added support for displaying decoded Unicode strings
  • Updated metadata processing logic
  • +15/-0   
    helpers.js
    Add Unicode decoding helper function                                         

    ui/src/helpers.js - Added decodeUnicodeString function
    +10/-0   
    database.sql
    Update database schema for Unicode metadata                           

    config/database.sql - Added 'DECODED_STRING_UNICODE' to METADATA_TYPE enum
    +1/-1     

    ๐Ÿ’ก PR-Agent usage: Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    github-actions[bot] commented 1 month ago

    Preparing PR description...

    github-actions[bot] commented 1 month ago

    Preparing review...

    codiumai-pr-agent-pro[bot] commented 1 month ago

    PR Reviewer Guide ๐Ÿ”

    โฑ๏ธ Estimated effort to review: 4 ๐Ÿ”ต๐Ÿ”ต๐Ÿ”ต๐Ÿ”ตโšช
    ๐Ÿงช PR contains tests
    ๐Ÿ”’ No security concerns identified
    โšก Key issues to review

    Code Refactoring
    The HandleProbe function has been significantly modified to include ExtractorCollection. This change might affect the overall flow and error handling of the function. A thorough review is needed to ensure all edge cases are properly handled. API Change
    The RunScript method signature has been updated to include an ExtractorCollection parameter. This change might affect all callers of this method and could potentially break existing code if not properly updated. UI Update
    New UI elements have been added to display decoded Unicode strings. This change should be tested thoroughly to ensure proper rendering and handling of Unicode data.
    codiumai-pr-agent-pro[bot] commented 1 month ago

    PR Code Suggestions โœจ

    CategorySuggestion                                                                                                                                    Score
    Possible issue
    Add error handling for Unicode decoding to prevent unhandled exceptions ___ **Consider using a try-catch block to handle potential errors when parsing Unicode
    escape sequences, ensuring the function doesn't throw unhandled exceptions for
    invalid input.** [ui/src/helpers.js [40-47]](https://github.com/mrheinen/lophiid/pull/46/files#diff-4ada3d48db134b88e37827a22e85ee31cda322192b5816cfca0f4f67745ad046R40-R47) ```diff export function decodeUnicodeString(encodedString) { const unicodeRegex = /\\u([0-9a-fA-F]{4})/g; const decodedString = encodedString.replace(unicodeRegex, (match, codePoint) => { - return String.fromCharCode(parseInt(codePoint, 16)); + try { + return String.fromCharCode(parseInt(codePoint, 16)); + } catch (error) { + console.warn(`Failed to decode Unicode sequence: ${match}`, error); + return match; + } }); return decodedString; } ``` - [ ] **Apply this suggestion**
    Suggestion importance[1-10]: 9 Why: This suggestion significantly improves the robustness of the decodeUnicodeString function by handling potential errors, preventing crashes and improving user experience.
    9
    Enhancement
    โœ… Add test cases for non-ASCII and multi-byte Unicode characters ___
    Suggestion Impact:The commit added a test case for non-ASCII Unicode characters, which was part of the suggestion code diff: ```diff + { + description: "non-ASCII Unicode character", + input: `Hello \u03A9 World`, + result: map[string]string{ + `\u03A9`: "ฮฉ", + }, + }, ```
    ___ **Consider adding test cases for non-ASCII Unicode characters and multi-byte sequences
    to ensure the extractor handles a wider range of Unicode scenarios correctly.** [pkg/backend/extractors/unicode_extractor_test.go [18-31]](https://github.com/mrheinen/lophiid/pull/46/files#diff-b7b3f7e629697c3886fdbcfdab9eeb4950b0ad93eae98d3a1ce1a5bef82427f1R18-R31) ```diff { description: "find two strings", input: `aaa \u0061 aa \u0062 aaa`, result: map[string]string{ `\u0061`: "a", `\u0062`: "b", }, }, { description: "find one strings", input: `\u0061\u0062\u0061\u0062\u0061\u0062`, result: map[string]string{ `\u0061\u0062\u0061\u0062\u0061\u0062`: "ababab", }, }, +{ + description: "non-ASCII Unicode character", + input: `Hello \u03A9 World`, + result: map[string]string{ + `\u03A9`: "ฮฉ", + }, +}, +{ + description: "multi-byte Unicode sequence", + input: `\u1F600 Grinning Face`, + result: map[string]string{ + `\u1F600`: "๐Ÿ˜€", + }, +}, ``` - [ ] **Apply this suggestion**
    Suggestion importance[1-10]: 8 Why: This suggestion improves test coverage by including more complex Unicode scenarios, which is crucial for ensuring the extractor's robustness and reliability.
    8
    Simplify struct initialization by directly passing the parameter to the nested struct ___ **Consider initializing the RequestContext struct with the eCol parameter directly in
    the Util struct initialization, rather than creating a separate Context field.** [pkg/javascript/goja.go [92-99]](https://github.com/mrheinen/lophiid/pull/46/files#diff-93ebbcb2bc5cc6ca7ef6d25c1ce739d50fc1194101ed2b4469c1941da4111828R92-R99) ```diff Util{ Encoding: Encoding{}, Database: DatabaseClientWrapper{ dbClient: j.dbClient, }, Runner: CommandRunnerWrapper{ allowedCommands: j.allowedCommands, commandTimeout: j.commandTimeout, }, - Context: RequestContext{ - eCol: eCol, - }, + Context: RequestContext(eCol), } ``` - [ ] **Apply this suggestion**
    Suggestion importance[1-10]: 5 Why: The suggestion offers a minor improvement in code readability, but doesn't significantly impact functionality or performance.
    5
    Error handling
    Add error handling to improve robustness of the parsing process ___ **Consider adding error handling for the ParseRequest method to propagate any errors
    that might occur during extraction.** [pkg/backend/extractors/extractor_collection.go [40-44]](https://github.com/mrheinen/lophiid/pull/46/files#diff-f5b9d63dc9e9fbe60a7ecd47c6e041fab450e8c704ac04f661a539d0064d927aR40-R44) ```diff -func (a *ExtractorCollection) ParseRequest(req *database.Request) { +func (a *ExtractorCollection) ParseRequest(req *database.Request) error { for _, ex := range a.extractors { - ex.ParseRequest(req) + if err := ex.ParseRequest(req); err != nil { + return fmt.Errorf("error parsing request: %w", err) + } } + return nil } ``` - [ ] **Apply this suggestion**
    Suggestion importance[1-10]: 8 Why: This suggestion significantly improves error handling, which is crucial for maintaining robust and reliable code, especially in parsing operations.
    8
    Performance
    Use a more memory-efficient data structure for storing unique strings ___ **Consider using a more efficient data structure for storing unique strings, such as
    map[string]struct{}, instead of map[string]int for tcpAddressesMap.** [pkg/backend/extractors/extractor_collection.go [11-17]](https://github.com/mrheinen/lophiid/pull/46/files#diff-f5b9d63dc9e9fbe60a7ecd47c6e041fab450e8c704ac04f661a539d0064d927aR11-R17) ```diff type ExtractorCollection struct { linksMap map[string]struct{} - tcpAddressesMap map[string]int + tcpAddressesMap map[string]struct{} b64Map map[string][]byte uniMap map[string]string extractors []Extractor } ``` - [ ] **Apply this suggestion**
    Suggestion importance[1-10]: 7 Why: This suggestion can lead to improved memory usage, which is beneficial for performance, especially when dealing with large datasets.
    7
    Optimize ASCII string check by iterating over bytes instead of using unicode package ___ **Consider using a more efficient approach to check if a string is ASCII by iterating
    over bytes instead of runes, which can improve performance for large strings.** [pkg/util/general.go [50-57]](https://github.com/mrheinen/lophiid/pull/46/files#diff-0664544a48f58f335cd90e8a9bd6e8e5a60d8291eef87c17903fc32de04025eeR50-R57) ```diff func IsStringASCII(s string) bool { for i := 0; i < len(s); i++ { - if s[i] > unicode.MaxASCII { + if s[i] > 127 { return false } } return true } ``` - [ ] **Apply this suggestion**
    Suggestion importance[1-10]: 7 Why: This optimization can improve performance for large strings, which is beneficial. However, the impact may be minimal for smaller strings, and the current implementation is already correct.
    7
    Optimize the display of decoded Unicode strings by using v-if instead of v-for ___ **Consider using a v-if directive instead of v-for when displaying a single decoded
    Unicode string to improve performance and readability.** [ui/src/components/container/RequestView.vue [84-91]](https://github.com/mrheinen/lophiid/pull/46/files#diff-4d29d6da813e108a78b8c3f2146d26b6e1c072e931e977bbc78e01208ba282b4R84-R91) ```diff
    -
    - +
    +
    ``` - [ ] **Apply this suggestion**
    Suggestion importance[1-10]: 5 Why: While this change might slightly improve performance, it assumes there's only one Unicode string to display, which may not always be the case. The improvement is minor and potentially limiting.
    5
    Best practice
    Use a more flexible type for storing extractors to improve extensibility ___ **Consider using a slice of interfaces instead of concrete types for extractors to
    improve flexibility and extensibility of the ExtractorCollection struct.** [pkg/backend/extractors/extractor_collection.go [11-17]](https://github.com/mrheinen/lophiid/pull/46/files#diff-f5b9d63dc9e9fbe60a7ecd47c6e041fab450e8c704ac04f661a539d0064d927aR11-R17) ```diff type ExtractorCollection struct { linksMap map[string]struct{} tcpAddressesMap map[string]int b64Map map[string][]byte uniMap map[string]string - extractors []Extractor + extractors []interface{ Extractor } } ``` - [ ] **Apply this suggestion**
    Suggestion importance[1-10]: 6 Why: The suggestion improves code flexibility and extensibility, which can be beneficial for future development, but may not have immediate impact.
    6

    ๐Ÿ’ก Need additional feedback ? start a PR chat