frank2 / exe-rs

The PE Executable Library, but for Rust!
GNU General Public License v3.0
71 stars 13 forks source link

BufferSearch search performance #2

Closed Thell closed 1 year ago

Thell commented 2 years ago

I've been using your library for some PE restructuring work as well as some extraction and wanted to report this so you would know...

    let text_di_bytes = pefile.get_buffer().as_ref();
    let tr_xrefs = troc_instances
        .iter()
        .map(|x| {
            let tr_offset_va = offset_to_va(*x);
            let tr_bytes: [u8; 4] = u32::to_le_bytes(tr_offset_va);
            // let mut tr_xref = memmem::find_iter(text_di_bytes, &tr_bytes);
            let mut tr_xref = pefile.search(tr_bytes).unwrap();
            let _ = tr_xref.next().unwrap();
            let pos = tr_xref.next().unwrap();
            pos.try_into().unwrap()
        })
        .collect::<Vec<u32>>();

The runtime of this is 59 seconds with the BufferSearch and with the only change being to use the memmem::find_iter iter instead the runtime is ~200ms.

frank2 commented 2 years ago

Thanks for reporting this and I apologize for the delay in getting to your report! This is an absolutely great find and I'll start working on integrating it into the project!

The search algorithm is a very naive O(N) binary search, there's undoubtedly a better algorithm. This change will require integrating into pkbuffer as well, since the core of the issue is the search algorithm.

I'm glad you're using the library! I'll make sure to fix this soon!

frank2 commented 1 year ago

Okay! This is fixed as of the newest version of PKBuffer (0.4.2). PKBuffer now uses the memchr library's two-way search algorithm for binary searching. Thanks again for reporting this! You can find a fixed commit here: https://github.com/frank2/exe-rs/commit/01320314381f9fcf9deb8867031146e285fe2097 As you can see in the Cargo file, this will be officially fixed with version 0.5.5.

frank2 commented 1 year ago

This is live in 0.5.6, not 0.5.5. Thanks again for reporting this!