eunomia-bpf / GPTtrace

Generate eBPF programs and tracing with ChatGPT
https://eunomia.dev/GPTtrace/
MIT License
226 stars 21 forks source link

[feature] Allow tracking with bcc tools via openai function call. #8

Closed try-agaaain closed 1 year ago

try-agaaain commented 1 year ago

Description

This PR adds a command-line parameter -b(--bcc), when the user uses this parameter, GPTtrace will select the appropriate command tool from bcc-bpftools and set the appropriate parameters to complete the tracking task, like this:

$./GPTtrace.py -v -b "count kernel stack traces for submit_bio"
Run:  sudo stackcount-bpfcc submit_bio 
Tracing 1 functions for "submit_bio"... Hit Ctrl-C to end.
^C
  b'submit_bio'
  b'ext4_io_submit'
  b'ext4_bio_write_page'
  b'mpage_submit_page'
  b'mpage_process_page_bufs'
  b'mpage_prepare_extent_to_map'
  b'ext4_writepages'
  b'do_writepages'
  b'__writeback_single_inode'
  b'writeback_sb_inodes'
  b'__writeback_inodes_wb'
  b'wb_writeback'
  b'wb_workfn'
  b'process_one_work'
  b'worker_thread'
  b'kthread'
  b'ret_from_fork'
    2
try-agaaain commented 1 year ago

Problems that need to be solved

Here are some issues that I will fix later:

try-agaaain commented 1 year ago

For the first problem, it is done in two steps:

For the third problem, I created a dictionary of positional parameters to determine which parameters of a given command are positional parameters, see https://github.com/try-agaaain/GPTtrace/blob/8b7c3eecdfeec04f0eb3eedba2e3526465374023/bcc_tools.py#L90

yunwei37 commented 1 year ago

why not generate the function call dynamically at runtime? the advantages and disadvantages?

And,maybe you can try combined this with agents in langchain later? https://docs.langchain.com/docs/components/agents/

try-agaaain commented 1 year ago

If I need to generate a function call dynamically, I first need to know which command to generate the function call, how should I determine this?

Dynamic generation has some drawbacks, as the function calls generated by LLM may not always be accurate. If we can predefine these function calls more accurately, it can reduce the occurrence of errors.

It doesn't seem to be more convenient to use Function Call in LangChain, but it's okay to change to LangChain.

try-agaaain commented 1 year ago

Does it possible to merge this PR first? It already includes a lot of updates (might be a bit messy). @yunwei37 @Officeyutong

I have added a -c option where users can use GPTtrace -c opensnoop-bpfcc "help me trace the open syscall in pid 123" to utilize opensnoop-bpfcc. For example:

```console
$./GPTtrace.py -c memleak-bpfcc "Trace allocations and display each individual allocator function call"
 Run:  sudo memleak-bpfcc --trace 
Attaching to kernel allocators, Ctrl+C to quit.
(b'Relay(35)', 402, 6, b'd...1', 20299.252425, b'alloc exited, size = 4096, result = ffff8881009cc000')
(b'Relay(35)', 402, 6, b'd...1', 20299.252425, b'free entered, address = ffff8881009cc000, size = 4096')
(b'Relay(35)', 402, 6, b'd...1', 20299.252426, b'free entered, address = 588a6f, size = 4096')
(b'Relay(35)', 402, 6, b'd...1', 20299.252427, b'alloc entered, size = 4096')
(b'Relay(35)', 402, 6, b'd...1', 20299.252427, b'alloc exited, size = 4096, result = ffff8881009cc000')
(b'Relay(35)', 402, 6, b'd...1', 20299.252428, b'free entered, address = ffff8881009cc000, size = 4096')
(b'sudo', 6938, 10, b'd...1', 20299.252437, b'alloc entered, size = 2048')
(b'sudo', 6938, 10, b'd...1', 20299.252439, b'alloc exited, size = 2048, result = ffff88822e845800')
(b'node', 410, 18, b'd...1', 20299.252455, b'alloc entered, size = 256')
(b'node', 410, 18, b'd...1', 20299.252457, b'alloc exited, size = 256, result = ffff8882e9b66400')
(b'node', 410, 18, b'd...1', 20299.252458, b'alloc entered, size = 2048')

For bcc tools, it looks up the corresponding function call from funcs.json. For other tools not defined in funcs.json, the LLM dynamically generates the function call.

In the next PR, I will attempt to accomplish the tracing task using a method similar to autogpt. #10

Officeyutong commented 1 year ago

Some suggestions:

try-agaaain commented 1 year ago

Some suggestions:

Great suggestion! In the last submission, I added documentation for each function. The descriptions in the documentation may not be detailed enough, but it will be improved in the future.