rizinorg / rizin

UNIX-like reverse engineering framework and command-line toolset.
https://rizin.re
GNU Lesser General Public License v3.0
2.7k stars 361 forks source link

RzBitVector Performance Improvements #3492

Open brightprogrammer opened 1 year ago

brightprogrammer commented 1 year ago

Issue RzBitVector at the moment performs lots of bitwise OR operations in a loop to convert from RzBitVector to numbers, say, ut8 or ut32 etc... This can have some performance impact when RzBitVector is being used in another loop like in a VM where all registers are using RzBitVector.

/**
 * Convert bitv to a ut8 value
 * \param x BitVector
 * \return  ut8 value
 */
RZ_API ut8 rz_bv_to_ut8(RZ_NONNULL const RzBitVector *x) {
    rz_return_val_if_fail(x, 0);
    if (x->len <= 64) {
        return (ut8)x->bits.small_u & UT8_MAX;
    }
    ut8 ret = 0; 
    for (ut32 i = 0; i < x->len && i < 8; ++i) {   // <---------- this part
        if (rz_bv_get(x, i)) { 
            ret |= 1 << i;
        }
    }
    return ret;
}

or

/**
 * Convert RzBitVector to ut64
 * \param x RzBitVector, pointer to the bitvector
 * \return ret ut64, num value of bitvector
 */
RZ_API ut64 rz_bv_to_ut64(RZ_NONNULL const RzBitVector *x) {
    rz_return_val_if_fail(x, 0);
    if (x->len <= 64) {
        return x->bits.small_u;
    }
    ut64 ret = 0;
    for (ut32 i = 0; i < x->len && i < 64; ++i) { // <---------- this part
        if (rz_bv_get(x, i)) {
            ret |= 1ULL << i;
        }
    }
    return ret;
}

Suggested Solution Instead of taking bitwise OR in a loop, we can directly OR two 8 bit values at once since in a single 8 bit value, the ordering is same everywhere and whatever ordering change there is, is because of arrangement of an array of these 8 bit values in memory.

This would mean changing implementations to something like :

/**
 * Convert RzBitVector to ut64
 * \param x RzBitVector, pointer to the bitvector
 * \return ret ut64, num value of bitvector
 */
RZ_API ut64 rz_bv_to_ut64(RZ_NONNULL const RzBitVector *x) {
    rz_return_val_if_fail(x, 0);
    if (x->len <= 64) {
        return x->bits.small_u;
    }
    ut64 ret = 0;
    for (ut32 i = 0; i < x->len/BV_ELEM_SIZE && i < 8; ++i) {
        res |= bv->bits.large_a[i] << i*BV_ELEM_SIZE;
    }
    return ret;
}

Suggested Alternative We can directly create an array of 64 bit values instead of 8 bit values and store numbers directly in these elements without checking for bitvector length. This would mean then changing the RzBitVector struct to have an array of ut64 values instead of ut8 values in the union. Then when storing values in RzBitVector, we will directly store 64 bit values and when converting the bitvector to one of the utXX types, we can directly return the first element in the array by typecasting it properly

XVilka commented 1 year ago

I suggest doing this only after https://github.com/rizinorg/rizin/issues/3455 is complete, it will make testing these optimizations easier as well.