Yamato-Security / hayabusa

Hayabusa (隼) is a sigma-based threat hunting and fast forensics timeline generator for Windows event logs.
GNU Affero General Public License v3.0
2.26k stars 200 forks source link

Optimize according to ChatGPT advice? #960

Closed YamatoSecurity closed 1 year ago

YamatoSecurity commented 1 year ago

ChatGPTにHayabusaのソースコードで改善できるところがあるか聞いてみましたw: ザック: give me specific examples in the source code of where we can optimize for better code in github.com/Yamato-Security/hayabusa

Here are some specific examples of where you can optimize the code in the Hayabusa project:

  1. Use const where possible: In the file src/utils.rs, the constants NUM_CPUS and BYTES_PER_MB are used multiple times throughout the code. It would be more efficient to declare these as const variables at the top of the file and use them throughout the code.

  2. Use match instead of if chains: In the file src/config.rs, there are multiple nested if statements that could be simplified using Rust's match syntax. For example, the function parse_config_file could be rewritten using a match statement to handle different cases based on the result of parsing the file.

  3. Use Vec::with_capacity() for pre-allocating memory: In the file src/main.rs, the vector outputs is initialized with a default capacity of 1024. Since the exact size of the vector is not known ahead of time, it would be more efficient to use Vec::with_capacity() to pre-allocate memory for the vector based on an estimate of its eventual size.

  4. Use Box::new() to allocate heap memory: In the file src/worker.rs, the spawn_workers function creates a vector of Worker structs using the vec![] macro. This causes the Worker structs to be allocated on the stack, which can lead to stack overflow errors if too many workers are spawned. Instead, use Box::new() to allocate the structs on the heap.

  5. Use &str instead of String where possible: In the file src/utils.rs, the function escape_html takes a String argument and returns a new String. Since the argument is only used for reading and not modified, it would be more efficient to change the function signature to take a &str argument instead.

hitenkoku commented 1 year ago

それぞれの項目についてちょっと確認した結果を書いておきます

Use const where possible: In the file src/utils.rs, the constants NUM_CPUS and BYTES_PER_MB are used multiple times throughout the code. It would be more efficient to declare these as const variables at the top of the file and use them throughout the code.

src/utils.rsというものは存在していません。src/detections/utils.rsはあります。 とはいえ指定されているようなNUM_CPU, BYTES_PER_MBとかは存在しないので難しそうです。 何回も使われている固定のものをconstとするのは正しいですが、constとした段階で変更できないので取り扱いとしても難しいところです

Use match instead of if chains: In the file src/config.rs, there are multiple nested if statements that could be simplified using Rust's match syntax. For example, the function parse_config_file could be rewritten using a match statement to handle different cases based on the result of parsing the file.

src/configs.rsは存在しません(src/detections/configs.rsは存在する) 基本的にparse_config_fileという関数も存在していないです。とはいえifが連続で使うのであればmatchを使えという指摘なので、修正はできなくはなさそうです。match文で書くとすべてのケースを書く必要が出てくるので少し見づらくなるのではという懸念はあります

Use Vec::with_capacity() for pre-allocating memory: In the file src/main.rs, the vector outputs is initialized with a default capacity of 1024. Since the exact size of the vector is not known ahead of time, it would be more efficient to use Vec::with_capacity() to pre-allocate memory for the vector based on an estimate of its eventual size.

確かにmain.rsは存在しておりVecも使っているのでこの指摘は対応しても良いかもしれませんがそもそも初期サイズをどうするかというところも考える必要があるので固定でサイズを与えてしまってそれを超えるデータが来た場合拡張のためにパフォーマンスが低下する可能性があります。

Use Box::new() to allocate heap memory: In the file src/worker.rs, the spawn_workers function creates a vector of Worker structs using the vec![] macro. This causes the Worker structs to be allocated on the stack, which can lead to stack overflow errors if too many workers are spawned. Instead, use Box::new() to allocate the structs on the heap.

src/worker.rs自体存在していません

Use &str instead of String where possible: In the file src/utils.rs, the function escape_html takes a String argument and returns a new String. Since the argument is only used for reading and not modified, it would be more efficient to change the function signature to take a &str argument instead.

確かにStringを受けてStringで返している関数があるためそこは&strに修正しても良さそうです(といってもrefactoringの範疇かなとも思っています)

YamatoSecurity commented 1 year ago

@hitenkoku 調べて頂いて、ありがとうございます! また明日のMTGでどうするか話しましょう。