nervosnetwork / tentacle

A multiplexed p2p network framework that supports custom protocols
https://docs.rs/tentacle
MIT License

feat: use flag to control how to use `block_in_place` #226

Closed · driftluo closed this 4 years ago

driftluo commented 4 years ago

Previously, `block_in_place` was used by default to run user callbacks, mainly for convenience. In practice, most callbacks do not need this operation, so a flag is now added to let the user control the behavior.

`block_in_place` does a lot of work internally, such as checking and switching the worker's state and possibly spawning new threads. This is extra overhead, so letting the user decide whether the call is actually needed is a big win for overall performance.
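As a rough illustration (a hypothetical sketch, not tentacle's actual API; the `blocking` flag and `run_handle` helper are made-up names), the idea is to wrap user callbacks in `block_in_place` only when the flag is set:

// Hypothetical sketch: only pay the `block_in_place` cost when the user asked for it.
// Must run inside tokio's multi-threaded runtime, since `block_in_place`
// panics on a current-thread runtime.
fn run_handle<F, T>(blocking: bool, f: F) -> T
where
    F: FnOnce() -> T,
{
    if blocking {
        // Tells the scheduler this worker may block; involves state
        // bookkeeping and possibly handing work off to another thread.
        tokio::task::block_in_place(f)
    } else {
        // Cheap, non-blocking handlers just run inline on the async worker.
        f()
    }
}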

I used a simple demo to test the difference between the two:

use criterion::{criterion_group, criterion_main, Bencher, Criterion};
use std::collections::HashMap;

// Spawn 256 tasks that wrap a tiny amount of work in `block_in_place`.
fn blocking(rt: &mut tokio::runtime::Runtime, hash_time: usize) {
    rt.block_on(async move {
        let mut handles = Vec::new();
        for _ in 0..256 {
            let handle = tokio::spawn(async move {
                tokio::task::block_in_place(move || {
                    let a = 1 + 1;
                    let b: HashMap<usize, usize> = HashMap::new();
                    for _ in 0..hash_time {
                        b.contains_key(&a);
                    }
                })
            });
            handles.push(handle);
        }
        for handle in handles {
            handle.await.unwrap();
        }
    })
}

// Same workload, but the tasks run directly on the async workers
// without `block_in_place`.
fn no_block(rt: &mut tokio::runtime::Runtime, hash_time: usize) {
    rt.block_on(async move {
        let mut handles = Vec::new();
        for _ in 0..256 {
            let _ = 8u8 & 20u8;
            let handle = tokio::spawn(async move {
                let a = 1 + 1;
                let b: HashMap<usize, usize> = HashMap::new();
                for _ in 0..hash_time {
                    b.contains_key(&a);
                }
            });
            handles.push(handle);
        }
        for handle in handles {
            handle.await.unwrap();
        }
    })
}

// Dispatch to one of the two variants so both share the same bench harness.
fn test_block(
    bench: &mut Bencher,
    rt: &mut tokio::runtime::Runtime,
    block: bool,
    hash_time: usize,
) {
    bench.iter(|| {
        if block {
            blocking(rt, hash_time)
        } else {
            no_block(rt, hash_time)
        }
    })
}

fn criterion_benchmark(bench: &mut Criterion) {
    // `Runtime::new()` gives the multi-threaded scheduler,
    // which `block_in_place` requires.
    let mut rt = tokio::runtime::Runtime::new().unwrap();
    bench.bench_function("blocking and hash 1 time", |b| {
        test_block(b, &mut rt, true, 1)
    });
    bench.bench_function("no blocking and hash 1 time", |b| {
        test_block(b, &mut rt, false, 1)
    });

    bench.bench_function("blocking and hash 1000 time", |b| {
        test_block(b, &mut rt, true, 1000)
    });
    bench.bench_function("no blocking and hash 1000 time", |b| {
        test_block(b, &mut rt, false, 1000)
    });

    bench.bench_function("blocking and hash 10000 time", |b| {
        test_block(b, &mut rt, true, 10000)
    });
    bench.bench_function("no blocking and hash 10000 time", |b| {
        test_block(b, &mut rt, false, 10000)
    });
}

criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
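
For reference, a bench like this only needs a `[[bench]]` target with the default harness disabled; the file name and the dependency versions below are assumptions (this PR predates tokio 1.0, so tokio 0.2 and criterion 0.3 are a reasonable guess):

# Cargo.toml (sketch)
[dev-dependencies]
criterion = "0.3"
tokio = { version = "0.2", features = ["full"] }

[[bench]]
name = "block_in_place"   # benches/block_in_place.rs holds the code above
harness = false

It can then be run with `cargo bench` from the crate root.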

result:

blocking and hash 1 time        time: [1.2777 ms 1.2941 ms 1.3143 ms]
no blocking and hash 1 time     time: [165.12 us 166.46 us 167.86 us]
blocking and hash 1000 time     time: [1.5778 ms 1.6277 ms 1.6815 ms]
no blocking and hash 1000 time  time: [1.3526 ms 1.3867 ms 1.4255 ms]
blocking and hash 10000 time    time: [11.069 ms 11.164 ms 11.256 ms]
no blocking and hash 10000 time time: [11.824 ms 11.939 ms 12.066 ms]

In this simple scenario there is a fixed overhead of nearly an order of magnitude. Of course, that can't be blamed on the implementation, since `block_in_place` really does need a lot of checking and state switching internally, but it would be a good idea to let users choose whether or not to use it based on their own workload.