solana-labs / rbpf

Rust virtual machine and JIT compiler for eBPF programs
Apache License 2.0

generated dot files are unusable #222

Open jorgeelmundoso opened 2 years ago

jorgeelmundoso commented 2 years ago

Generated dot files of even the simplest Solana programs are completely unusable:

example of a control flow graph:

[image: cfg]

Lichtso commented 2 years ago

Honestly, I am surprised that it generated any output at all. Usually, one would only feed in a few methods at a time, not disassemble the entire program graphically. Also, there are many dot file engines with different parametrizations and getting such big inputs to be even somewhat readable is always an exercise in tuning.

Anyway, you tried it, so I guess there is a use-case beyond curiosity for you. Maybe you can go a bit deeper into what you want to achieve? Then we might be able to find a workable solution or plan future features accordingly.

jorgeelmundoso commented 2 years ago

how I did it

The graph above was generated using the following command:

rbpf-cli --use cfg target/deploy/as_simple_as_it_gets.so

how would you do it?

Based on the rbpf-cli usage, how would you feed in only a few methods? Maybe you could generate an asm file and manually modify it to extract a subset of the Solana program, but that isn't clear from the documentation.

USAGE:
    rbpf-cli [OPTIONS] <PROGRAM>

ARGS:
    <PROGRAM>    Program file to use. This is either an ELF shared-object file to be executed,
                 or an assembly file to be assembled and executed.

OPTIONS:
    -h, --help                    Print help information
    -i, --input <FILE / BYTES>    Input for the program to run on, where FILE is a name of a JSON
                                  file with input data, or BYTES is the number of 0-valued bytes to
                                  allocate for program parameters [default: 0]
    -l, --limit <COUNT>           Limit the number of instructions to execute [default:
                                  9223372036854775807]
    -m, --memory <BYTES>          Heap memory for the program to run on [default: 0]
    -p, --profile                 Output profile to 'profile.dot' file using tracing instrumentation
    -t, --trace                   Output trace to 'trace.out' file using tracing instrumentation
    -u, --use <VALUE>             Method of execution to use, where 'cfg' generates Control Flow
                                  Graph of the program, 'disassembler' dumps disassembled code of
                                  the program, 'interpreter' runs the program in the virtual
                                  machine's interpreter, and 'jit' precompiles the program to native
                                  machine code before execting it in the virtual machine. [default:
                                  jit] [possible values: cfg, disassembler, interpreter, jit]
    -v, --verify                  Run the verifier before execution or disassembly
    -V, --version                 Print version information

use-case

Have a look at https://go.dev/blog/pprof.

While that example can't be translated 1:1 to solana programs, the goal is usually to visualise control flow and identify issues in your program that you wouldn't find by just thinking about it or adding fprintf() statements to the code.

I am not sure to what extent rbpf emulates the operation that occurs within the runtime in a Solana cluster, but since the Solana runtime has a unique set of rules, optimisation and debugging of Solana programs will benefit from a good profiler.

Lichtso commented 2 years ago

I am not sure to what extent rbpf emulates the operation that occurs within the runtime in a Solana cluster

This crate (RBPF) does not emulate the Solana program runtime; it is what the Solana program runtime depends on. However, the rbpf-cli here is a standalone tool for development purposes. There is another one here which is similar but a bit better integrated into the program runtime and, for example, allows account inputs to be specified.

optimization and debugging of solana programs will benefit from a good profiler.

Yes, we are definitely lacking good tooling on this end. Currently there is no way to restrict the DOT output to specific functions, or to show the profiling only at function level (rather than at instruction level). What you can do right now is manually cut parts from either the input binary file or the output dot file.

Do you just need any kind of profile (would a textual trace suffice?) or do you specifically want to see the heat / color coded CFG visually?
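For readers who want to try cutting the output dot file by hand, the following is a minimal sketch, not a feature of rbpf-cli. It assumes (unverified here) that the generated DOT groups each function in its own `subgraph cluster_*` block carrying a `label` attribute with the function name. The helper names `find_block_end` and `extract_cluster` are hypothetical. Edges declared outside the chosen cluster are dropped, and HTML-like labels that contain braces are not handled.

```python
import re


def find_block_end(text: str, open_idx: int) -> int:
    """Return the index just past the '}' matching the '{' at open_idx.

    Braces inside double-quoted strings are ignored.
    """
    depth = 0
    in_string = False
    i = open_idx
    while i < len(text):
        ch = text[i]
        if in_string:
            if ch == "\\":
                i += 1  # skip the escaped character
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return i + 1
        i += 1
    raise ValueError("unbalanced braces in DOT input")


def extract_cluster(dot_text: str, label: str) -> str:
    """Return a new digraph holding only the cluster with the given label.

    Assumption: each function is a 'subgraph cluster_*' block with a
    'label' attribute; this layout is assumed, not taken from rbpf docs.
    """
    header = re.compile(r"subgraph\s+cluster_\w+\s*\{")
    wanted = re.compile(r'label\s*=\s*"?' + re.escape(label) + r'"?\s*[;\n]')
    pos = 0
    while (m := header.search(dot_text, pos)) is not None:
        end = find_block_end(dot_text, m.end() - 1)
        block = dot_text[m.start():end]
        if wanted.search(block):
            # Wrap the single cluster in a fresh top-level graph.
            return "digraph {\n  " + block + "\n}\n"
        pos = end  # continue after this (non-matching) cluster
    raise KeyError(f"no cluster labelled {label!r}")
```

The returned text can be written to a new file and rendered with Graphviz, for example `dot -Tsvg one_function.dot -o one_function.svg`, which should be far more readable than the whole-program graph.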

jorgeelmundoso commented 2 years ago

I am using https://github.com/solana-labs/solana/tree/master/rbpf-cli, but if you look at https://github.com/solana-labs/solana/blob/3c5f505d3e3ded8b3110b44bc458d01701b93e4d/rbpf-cli/Cargo.toml#L19 you can see that it pulls in this repo as a dependency. Therefore I opened the issue here.

I think what I am really looking for would be: