cisco-system-traffic-generator / trex-core


[STL] TRex crashes when using max uint64_t value in Field Engine #1066

Open DTran0 opened 1 year ago

DTran0 commented 1 year ago

Hi,

TRex crashes when I try to send a packet with Field Engine instructions that use the max uint64_t value (18446744073709551615, or $2^{64}-1$). However, TRex works fine when I use $2^{64}-2$ or lower. The documentation of STLVmFlowVar doesn't mention this limitation, and uint32_t doesn't have this problem.

TRex version: 3.04
Used NICs: Mellanox ConnectX-5
TRex command: ./t-rex-64 -i --stl (so trex console can connect)
TRex log:

Error: signal 8:

*** traceback follows ***

1       0x560051932863 ./_t-rex-64(+0x1ff863) [0x560051932863]
2       0x7f2ed3133d40 /lib64/libpthread.so.0(+0x12d40) [0x7f2ed3133d40]
3       0x560051b4c5db TrexRpcCmdAddStream::parse_vm_instr_flow_var(Json::Value const&, std::unique_ptr<TrexStream, std::default_delete<TrexStream> >&, Json::Value&) + 2139
4       0x560051b5061a TrexRpcCmdAddStream::parse_vm(Json::Value const&, std::unique_ptr<TrexStream, std::default_delete<TrexStream> >&, Json::Value&) + 4202
5       0x560051b5116b TrexRpcCmdAddStream::_run(Json::Value const&, Json::Value&) + 1211
6       0x560051a857a1 TrexRpcCommand::run(Json::Value const&, Json::Value&) + 81
7       0x560051a81304 JsonRpcMethod::_execute(Json::Value&) + 52
8       0x560051a7ea63 TrexJsonRpcV2ParsedObject::execute(Json::Value&) + 131
9       0x560051a7cf9d TrexRpcServerReqRes::process_request_raw(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) + 541
10      0x560051a7d77e TrexRpcServerReqRes::process_zipped_request(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) + 462
11      0x560051a7db8d TrexRpcServerReqRes::handle_request(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) + 525
12      0x560051a7e18b TrexRpcServerReqRes::_rpc_thread_cb_int() + 1339
13      0x560051a7e8bb TrexRpcServerReqRes::_rpc_thread_cb() + 11
14      0x7f2ed2a3d27f so/x86_64/libstdc++.so.6(+0xba27f) [0x7f2ed2a3d27f]
15      0x7f2ed31291da /lib64/libpthread.so.0(+0x81da) [0x7f2ed31291da]
16      0x7f2ed205ce73 clone + 67

*** addr2line information follows ***

??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0
??:0

./t-rex-64: line 106: 647660 Floating point exception(core dumped) ./_$(basename $0) $INPUT_ARGS $EXTRA_INPUT_ARGS
Killing Scapy server... Scapy server is killed

To reproduce the crash, I use the following code (inspired by the scripts in the stl/ directory) and load it via the TRex console (start -f path_to_script):

from trex_stl_lib.api import *
import argparse
import os

class STLS1(object):
    def create_stream(self):
        pkt = Ether() / IPv6()

        vm = STLScVmRaw(
            [
                STLVmFlowVar("x", min_value=0, max_value=2**64 - 1, size=8),  # doesn't crash when I use 2**64 - 2
                STLVmWrFlowVar(fv_name="x", pkt_offset="IPv6.src", offset_fixup=8),
            ]
        )

        return STLStream(
            packet=STLPktBuilder(pkt=pkt, vm=vm),
            mode=STLTXSingleBurst(pps=1, total_pkts=1),
        )

    def get_streams(self, tunables, **kwargs):
        parser = argparse.ArgumentParser(
            description="Argparser for {}".format(os.path.basename(__file__)),
            formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        )

        args = parser.parse_args(tunables)
        return [self.create_stream()]

def register():
    return STLS1()

DTran0 commented 4 months ago

I think I found the issue:

https://github.com/cisco-system-traffic-generator/trex-core/blob/61fb7aead7063d98bf16cd9392e85021193d5584/src/stx/stl/trex_stl_stream_vm.h#L1431-L1433

https://github.com/cisco-system-traffic-generator/trex-core/blob/61fb7aead7063d98bf16cd9392e85021193d5584/src/stx/stl/trex_stl_stream_vm.h#L1471

Because of the way get_range() computes its result, the uint64_t value overflows to zero when max_value=2**64 - 1 and min_value=0 (uint64_t can't hold 2**64). Some computations in the file divide by the value of get_range(), which causes a division by zero (reported as a floating point exception, signal 8).

I checked this by changing min_value to 1. Then it works and TRex doesn't crash.
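
The wraparound can be reproduced outside TRex with plain Python integers. This is only a sketch of the arithmetic: it assumes get_range() computes max - min + 1 in uint64_t, which is inferred from the behavior described above rather than quoted from the source, and the name wrapped_range is hypothetical.

```python
# Emulate unsigned 64-bit arithmetic by masking to the low 64 bits.
UINT64_MASK = (1 << 64) - 1


def wrapped_range(min_value, max_value):
    """Return (max - min + 1) as a uint64_t would store it.

    Assumption: this mirrors how get_range() derives the range.
    """
    return (max_value - min_value + 1) & UINT64_MASK


# Full 64-bit span: the true range is 2**64, which wraps to 0.
print(wrapped_range(0, 2**64 - 1))
# One less at the top, or one more at the bottom, fits in 64 bits.
print(wrapped_range(0, 2**64 - 2))
print(wrapped_range(1, 2**64 - 1))
```

Only the first call yields zero, which matches the observation that lowering max_value by one or raising min_value to 1 avoids the crash.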

You could cap the max value to prevent the overflow, but then the result of get_range() would be off by one for the min/max values above. That is probably a better solution than crashing, provided it is documented in the API. However, I am not familiar enough with the code to know whether this would have any unwanted consequences.
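
To illustrate the trade-off of capping, here is a minimal Python sketch. It is not TRex code: capped_range is a hypothetical helper, and it again assumes the range is max - min + 1 stored in a uint64_t.

```python
UINT64_MAX = (1 << 64) - 1


def capped_range(min_value, max_value):
    """Return max - min + 1, saturated at UINT64_MAX instead of wrapping.

    Hypothetical helper; the real fix would live in the C++ code.
    """
    if not 0 <= min_value <= max_value <= UINT64_MAX:
        raise ValueError("expected 0 <= min_value <= max_value <= 2**64 - 1")
    true_range = max_value - min_value + 1  # exact, Python ints don't overflow
    return min(true_range, UINT64_MAX)


for lo, hi in [(0, UINT64_MAX), (0, UINT64_MAX - 1), (1, UINT64_MAX)]:
    print(lo, hi, capped_range(lo, hi))
```

With the cap, the full span (0 to 2**64 - 1) reports the same range as (0 to 2**64 - 2), so the result is never zero but is one short of the true 2**64 in that single case.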