nico-abram / blondie

Collect CPU callstack samples from a windows process
MIT License

Running blondie multiple times results in filename collision #68

Open MingweiSamuel opened 1 month ago

MingweiSamuel commented 1 month ago

Minimal example:

use std::process::Command;

fn main() {
    let handle = std::thread::spawn(|| {
        let mut cmd = Command::new("ping");
        cmd.arg("localhost");
        let _ctx = blondie::trace_command(cmd, false).unwrap();
    });

    let mut cmd = Command::new("ping");
    cmd.arg("localhost");
    let _ctx = blondie::trace_command(cmd, false).unwrap();

    handle.join().unwrap();
}

Output

$ cargo run
    Blocking waiting for file lock on build directory
   Compiling ring v0.17.8
   Compiling rustls v0.21.12
   Compiling sct v0.7.1
   Compiling rustls-webpki v0.101.7
   Compiling tokio-rustls v0.24.1
   Compiling hyper-rustls v0.24.2
   Compiling reqwest v0.11.27
   Compiling symsrv v0.2.0
   Compiling blondie v0.5.2
   Compiling blondie-test v0.1.0 (D:\Projects\blondie-test)
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 9.69s
     Running `target\debug\blondie-test.exe`
thread 'main' panicked at src/main.rs:12:51:
called `Result::unwrap()` on an `Err` value: Other(WIN32_ERROR(183), "Cannot create a file when that file already exists.\r\n", "StartTraceA")
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
error: process didn't exit successfully: `target\debug\blondie-test.exe` (exit code: 101)
nico-abram commented 1 month ago

StartTraceA is what's returning error code 183 (File already exists). From the docs, this means:

ERROR_ALREADY_EXISTS

A session with the same name or GUID is already running.
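To make the failure mode concrete, here is a minimal sketch of how a caller could tell "another session already holds this name" apart from other failures. The value 183 comes from the error above and 0 is the usual Win32 success code; `StartOutcome` and `classify_start_trace` are hypothetical names for illustration only, not part of blondie or the Windows API.

```rust
/// Win32 success code.
pub const ERROR_SUCCESS: u32 = 0;
/// Win32 code reported by StartTraceA in the panic message above.
pub const ERROR_ALREADY_EXISTS: u32 = 183;

/// Hypothetical classification of a StartTraceA return code.
#[derive(Debug, PartialEq, Eq)]
pub enum StartOutcome {
    /// The session was started by this call.
    Started,
    /// A session with the same name or GUID is already running.
    SessionAlreadyRunning,
    /// Any other Win32 error code, passed through unchanged.
    Failed(u32),
}

/// Map a raw return code to an outcome the caller can branch on.
pub fn classify_start_trace(code: u32) -> StartOutcome {
    match code {
        ERROR_SUCCESS => StartOutcome::Started,
        ERROR_ALREADY_EXISTS => StartOutcome::SessionAlreadyRunning,
        other => StartOutcome::Failed(other),
    }
}
```

With a distinct `SessionAlreadyRunning` case, a caller could decide to wait, reuse the existing session, or report a clearer error instead of a generic failure.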

We are using a session name, specifically KERNEL_LOGGER_NAME. There's a comment that explains why we use it:


    // Build the trace properties, we want EVENT_TRACE_FLAG_PROFILE for the "SampledProfile" event
    // https://docs.microsoft.com/en-us/windows/win32/etw/sampledprofile
    // In https://docs.microsoft.com/en-us/windows/win32/etw/event-tracing-mof-classes that event is listed as a "kernel event"
    // And https://docs.microsoft.com/en-us/windows/win32/etw/nt-kernel-logger-constants says
    // "The NT Kernel Logger session is the only session that can accept events from kernel event providers."
    // Therefore we must use GUID SystemTraceControlGuid/KERNEL_LOGGER_NAME as the session

Briefly thinking about it, maybe we could store the session globally somewhere, with refcounting to know when to close it. Then we'd need to modify the event_record_callback, since it currently ignores all events except the ones for the target process, probably with some global array of process ids to filter on. And then we'd somehow have to correctly route the results to each trace.

The global session check would be around here and the event filtering here
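A rough sketch of that idea, using only the standard library and no real ETW calls. Everything here is hypothetical (`SharedSession`, `Sample`, `register`, `unregister`, `dispatch`); it only illustrates the bookkeeping: the number of registered traces acts as the refcount, the map of process ids is the filter, and a channel per process routes samples to the right trace. In a real implementation this state would presumably sit behind a global lock, and the marked spots would start and stop the kernel logger session.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{channel, Receiver, Sender};

/// One profile sample; the fields are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct Sample {
    pub pid: u32,
    pub stack: Vec<u64>,
}

/// Hypothetical stand-in for the single shared kernel logger session.
#[derive(Default)]
pub struct SharedSession {
    /// Target process id -> channel to the trace waiting on that process.
    /// The number of entries doubles as the refcount.
    pub routes: HashMap<u32, Sender<Sample>>,
    /// True while the (simulated) session is running.
    pub running: bool,
    /// How many times the (simulated) session has been started.
    pub starts: u32,
}

impl SharedSession {
    /// Register a trace for `pid`; the first registration starts the session.
    /// Registering the same pid twice replaces the earlier route.
    pub fn register(&mut self, pid: u32) -> Receiver<Sample> {
        if self.routes.is_empty() {
            // A real implementation would start the ETW session here.
            self.running = true;
            self.starts += 1;
        }
        let (tx, rx) = channel();
        self.routes.insert(pid, tx);
        rx
    }

    /// Unregister a trace; the last one to leave stops the session.
    pub fn unregister(&mut self, pid: u32) {
        self.routes.remove(&pid);
        if self.routes.is_empty() {
            // A real implementation would stop the ETW session here.
            self.running = false;
        }
    }

    /// Called from the event callback: forward the sample to the trace that
    /// owns its pid. Returns false if no trace is interested in that process.
    pub fn dispatch(&self, sample: Sample) -> bool {
        match self.routes.get(&sample.pid) {
            Some(tx) => tx.send(sample).is_ok(),
            None => false,
        }
    }
}
```

Each trace would call `register` with its target process id, read samples from the returned receiver, and call `unregister` when its process exits, so the session only closes once no trace needs it.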

MingweiSamuel commented 4 weeks ago

Seems like the "Block until processing thread is done" step for a specific process would have to work differently if the session continues

Edit: I guess it is fine:

(Safeguard to make sure we don't deallocate the context before the other thread finishes using it)

MingweiSamuel commented 4 weeks ago

Where does the const EVENT_TRACE_TYPE_LOAD: u8 = 10; magic opcode number come from?

Edit: oh, how obscure: [EventType(10, 2, 3, 4), EventTypeName("Load", "Unload", "DCStart", "DCEnd")] https://learn.microsoft.com/en-us/windows/win32/etw/image-load#syntax
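For reference, the attribute pairs each number with the name in the same position, so 10 is "Load", 2 is "Unload", 3 is "DCStart" and 4 is "DCEnd". A small sketch of that mapping follows; only EVENT_TRACE_TYPE_LOAD appears in the code discussed here, while `ImageEventType` and `image_event_type` are hypothetical names for illustration.

```rust
/// Opcode for the image "Load" event, as quoted above.
pub const EVENT_TRACE_TYPE_LOAD: u8 = 10;

/// Event types listed in the EventType/EventTypeName attribute for image events.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ImageEventType {
    Load,    // EventType 10
    Unload,  // EventType 2
    DcStart, // EventType 3
    DcEnd,   // EventType 4
}

/// Translate a raw event opcode into a named image event type, if it is one.
pub fn image_event_type(opcode: u8) -> Option<ImageEventType> {
    match opcode {
        EVENT_TRACE_TYPE_LOAD => Some(ImageEventType::Load),
        2 => Some(ImageEventType::Unload),
        3 => Some(ImageEventType::DcStart),
        4 => Some(ImageEventType::DcEnd),
        _ => None,
    }
}
```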