LiamBindle / MQTT-C

A portable MQTT C client for embedded systems and PCs alike.
https://liambindle.ca/MQTT-C
MIT License
775 stars 275 forks source link

Signed integer overflow in `mqtt_error_str()` #176

Open brianrho opened 1 year ago

brianrho commented 1 year ago

The issue is reproduced as follows:

It happens because a signed overflow occurs during this subtraction:

const char* mqtt_error_str(enum MQTTErrors error) {
    int offset = error - MQTT_ERROR_UNKNOWN;

MQTT_OK == 1, MQTT_ERROR_UNKNOWN == INT_MIN.

1 - INT_MIN is undefined, for the current 32-bit-int representation. In fact, that goes for any number greater than or equal to 0, as explained here.

brianrho commented 1 year ago

The issue isn't noticed normally likely because the compiler reproduces the entire IF statement pretty much as is. Whenever 1 - INT_MIN happens, instead of yielding 2147483649 (which is not possible in the 32-bit signed int), it calmly wraps around and yields -2147483647. Your current logic relies on that negative value, to skip the first clause of the IF block when error == MQTT_OK.

But when optimization is enabled, the compiler does away with the whole subtraction and IF block, because INT_MIN is pretty special and so the compiler can make certain sweeping assumptions here. This is what GCC produces, along with my shaky interpretation:

0800397c <mqtt_error_str>:
    } else if (error > 0) {
        return "MQTT_OK";
    } else {
        return MQTT_ERRORS_STR[0];
    }
}

# load MQTT_ERRORS_STR ptr into r3; r0 already holds the first argument, error
 800397c:   4b01        ldr r3, [pc, #4]    ; (8003984 <mqtt_error_str+0x8>)

# Now, r0 = *(r3 + (r0 * 4))
 800397e:   f853 0020   ldr.w   r0, [r3, r0, lsl #2]

# return to caller; r0 holds the return value/pointer
 8003982:   4770        bx  lr

# pointer to MQTT_ERRORS_STR array
 8003984:   08019378    .word   0x08019378

In other words, in C, the function has been transformed into:

const char* mqtt_error_str(enum MQTTErrors error) {
    return MQTT_ERRORS_STR[error];
}

OR, more precisely with pointer arithmetic:

const char* mqtt_error_str(enum MQTTErrors error) {
    return *(const char **)((uint32)MQTT_ERRORS_STR + (uint32)(error * 4));
}

... which should be functionally equivalent to the original ... so long as nobody was depending on certain signed overflows.

brianrho commented 1 year ago

When I modify the function like so:

const char* mqtt_error_str(enum MQTTErrors error) {
    if (error < 0) {
        /* Only allow this subtraction when error is negative, to avoid overflow */
        return MQTT_ERRORS_STR[error - MQTT_ERROR_UNKNOWN];
    } else if (error == 0) {
        return "MQTT_ERROR: Buffer too small.";
    } else {
        return "MQTT_OK";
    }
}

... everything's fine now, because we force the compiler to acknowledge the possibility that the error argument can be >= 0 and still yield valid results from this function, that are outside the MQTT_ERRORS_STR array.

So it preserves the IF statement, more or less as is (though I didn't check). As a result, we also don't perform any subtraction unless we're certain there won't be an overflow.