accellera-official / systemc

SystemC Reference Implementation
https://systemc.org/overview/systemc/
Apache License 2.0
475 stars 152 forks source link

Fix undefined behaviour when assigning a negative value to an sc_int #41

Closed mrt-amd closed 9 months ago

mrt-amd commented 1 year ago

The previous code here attempted to perform a sign extension by shifting left and then right to fill the top bits with sign bit. If the input is negative then this applies a left shift to a negative value, which is undefined behaviour in C++17 and the error is caught by ubsan. Fix this by explicitly picking out the sign bit and then filling the top bits with that value.

Visible in ubsan. For example:

#include <systemc.h>

int sc_main(int argc, char *argv[])
{
  sc_int<8> foo = -42;
  return 0;
}

/usr/include/sysc/datatypes/int/sc_int_base.h:574:22: runtime error: left shift of negative value -42

This has no effect on performance under optimisation, as compilers are clever enough to recognise the sign extension and turn it into a simple extension or logical-shift-left, arithimetic-shift-right pair. For example:

#include <systemc.h>

sc_int<47> f32(int v)
{
  sc_int<47> x = v;
  return x;
}

sc_int<47> f64(int64 v)
{
  sc_int<47> x = v;
  return x;
}

Compiles with gcc 9 to:

00000000000011f0 <_Z3f32i>:
    11f0:       f3 0f 1e fa             endbr64
    11f4:       48 ba 2f 00 00 00 11    movabs $0x110000002f,%rdx
    11fb:       00 00 00
    11fe:       48 8d 0d 23 2b 00 00    lea    0x2b23(%rip),%rcx        # 3d28 <_ZN5sc_dt13sc_value_base17concat_clear_dataEb>
    1205:       48 63 f6                movslq %esi,%rsi
    1208:       48 89 f8                mov    %rdi,%rax
    120b:       48 89 57 10             mov    %rdx,0x10(%rdi)
    120f:       48 89 0f                mov    %rcx,(%rdi)
    1212:       48 89 77 08             mov    %rsi,0x8(%rdi)
    1216:       c3                      retq
    1217:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
    121e:       00 00

0000000000001220 <_Z3f64x>:
    1220:       f3 0f 1e fa             endbr64
    1224:       48 c1 e6 11             shl    $0x11,%rsi
    1228:       48 8d 0d f9 2a 00 00    lea    0x2af9(%rip),%rcx        # 3d28 <_ZN5sc_dt13sc_value_base17concat_clear_dataEb>
    122f:       48 89 f8                mov    %rdi,%rax
    1232:       48 ba 2f 00 00 00 11    movabs $0x110000002f,%rdx
    1239:       00 00 00
    123c:       48 c1 fe 11             sar    $0x11,%rsi
    1240:       48 89 57 10             mov    %rdx,0x10(%rdi)
    1244:       48 89 77 08             mov    %rsi,0x8(%rdi)
    1248:       48 89 0f                mov    %rcx,(%rdi)
    124b:       c3                      retq
    124c:       0f 1f 40 00             nopl   0x0(%rax)

Observe that the int to sc_int<47> conversion is still a single movslq, while the int64 to sc_int<47> conversion is shl+sar as in the original.

maehne commented 9 months ago

@AndrewGoodrich: Could you please have a look whether this is addressed by the new datatypes implementation in SystemC 3.0.0?

AndrewGoodrich commented 9 months ago

This issue is fixed in 3.0 by not using a left-shift right-shift scheme for sign propagation. For example, in sc_int_base:

void extend_sign()
    {
        if ( m_val & (1ull << (m_len-1)) ) {
            m_val |= (~UINT_ZERO << (m_len-1));
        }
        else {
            m_val &= ( ~UINT_ZERO >> m_ulen );
        }
    }
maehne commented 9 months ago

Thanks @AndrewGoodrich for the quick reply.

@lmailletcontoz: Then, we can close this PR as fixed in the SystemC 3.0.0 release.

lmailletcontoz commented 9 months ago

Fixed in the SystemC 3.0.0 release.