bitdefender / bddisasm

bddisasm is a fast, lightweight, x86/x64 instruction decoder. The project also features a fast, basic, x86/x64 instruction emulator, designed specifically to detect shellcode-like behavior.
Apache License 2.0
874 stars 112 forks source link

Completion of error handling #7

Closed elfring closed 3 years ago

elfring commented 4 years ago

Would you like to add more error handling for return values from functions like the following?

ianichitei commented 4 years ago

Hi.

This is currently low on our priority list, but we will gladly review a PR if you feel like this is something you want to improve.

elfring commented 4 years ago

How do you think about to improve static source code analysis also for this software?

ianichitei commented 4 years ago

That's a slightly different issue.

Adding some cppcheck or clang-tidy checks might be useful.

ianichitei commented 3 years ago

With #21 merged I think we can close this for now. As I mentioned in that PR, more checks can be added, but it is better to discuss those in specific issues.

elfring commented 3 years ago

I suggest to avoid ignorance of return values a bit more. Would you like to detect every error situation as early as possible?

ianichitei commented 3 years ago

Current checks are done only on bddisasm and bdshemu which should not suffer from this. disasmtool (and maybe disasmtool_lix ignore some return values), but I did not focus on those because those are just tools that integrate the core libraries. The only disabled checks for the libraries are unusedFunction and unusedStructMember. Running cppcheck with all checks enabled on disasmtool gives us the following errors:

$ cppcheck.exe --language=c --enable=all --suppress=missingIncludeSystem -I inc/ -I inc/bdshemu -I bddisasm/include disasmtool/
Checking disasmtool\disasmtool.c ...
disasmtool\disasmtool.c:878:13: portability: %d in format string (no. 1) requires 'int' but the argument type is 'DWORD {aka unsigned long}'. [invalidPrintfArgType_sint]
            printf("        Operand: %d, Acc:  %s,  Type: %10s, Size: %2d, RawSize: %2d, Encoding: %s", i,
            ^
disasmtool\disasmtool.c:1291:9: portability: %x in format string (no. 1) requires 'unsigned int' but the argument type is 'DWORD {aka unsigned long}'. [invalidPrintfArgType_uint]
        printf("ReadFile failed: GLA: %x\n", GetLastError());
        ^
disasmtool\disasmtool.c:1453:9: portability: %zu in format string (no. 1) requires 'size_t' but the argument type is 'SIZE_T {aka unsigned long long}'. [invalidPrintfArgType_uint]
        printf("Memory error: couldn't allocated %zu bytes!\n", fsize);
        ^
disasmtool\disasmtool.c:1463:9: portability: %zu in format string (no. 1) requires 'size_t' but the argument type is 'SIZE_T {aka unsigned long long}'. [invalidPrintfArgType_uint]
        printf("Memory error: couldn't allocated %zu bytes!\n", fsize);
        ^
disasmtool\disasmtool.c:1471:9: portability: %zu in format string (no. 1) requires 'size_t' but the argument type is 'SIZE_T {aka unsigned long long}'. [invalidPrintfArgType_uint]
        printf("Memory error: couldn't allocated %zu bytes!\n", fsize);
        ^
disasmtool\disasmtool.c:1583:13: portability: %x in format string (no. 2) requires 'unsigned int' but the argument type is 'DWORD {aka unsigned long}'. [invalidPrintfArgType_uint]
            printf("Could not open the file %s : 0x%08x\n", fNameDecoded, GetLastError());
            ^
disasmtool\disasmtool.c:1751:17: portability: %x in format string (no. 1) requires 'unsigned int *' but the argument type is 'DWORD * {aka unsigned long *}'. [invalidScanfArgType_int]
                sscanf_s(argv[i + 1], "%x", &offset);
                ^
disasmtool\disasmtool.c:1759:17: portability: %zx in format string (no. 1) requires 'size_t *' but the argument type is 'SIZE_T * {aka unsigned long long *}'. [invalidScanfArgType_int]
                sscanf_s(argv[i + 1], "%zx", &rip);
                ^
disasmtool\disasmtool.c:1889:13: portability: %x in format string (no. 2) requires 'unsigned int' but the argument type is 'DWORD {aka unsigned long}'. [invalidPrintfArgType_uint]
            printf("Couldn't open file '%s': 0x%08x\n", fname, GetLastError());
            ^
disasmtool\disasmtool.c:1898:13: portability: %x in format string (no. 2) requires 'unsigned int' but the argument type is 'DWORD {aka unsigned long}'. [invalidPrintfArgType_uint]
            printf("Couldn't create file mapping for '%s': 0x%08x\n", argv[1], GetLastError());
            ^
disasmtool\disasmtool.c:1907:13: portability: %x in format string (no. 2) requires 'unsigned int' but the argument type is 'DWORD {aka unsigned long}'. [invalidPrintfArgType_uint]
            printf("Couldn't map the view for '%s': 0x%08x\n", argv[1], GetLastError());
            ^
disasmtool\disasmtool.c:1506:70: style: Same value in both branches of ternary operator. [duplicateValueTernary]
    ctx.TibBase = Options->Mode == ND_CODE_32 ? ctx.Segments.Fs.Base : ctx.Segments.Gs.Base;
                                                                     ^
disasmtool\disasmtool.c:611:18: style: The scope of the variable 'idx' can be reduced. [variableScope]
    DWORD k = 0, idx = 0, i = 0;
                 ^
disasmtool\disasmtool.c:611:27: style: The scope of the variable 'i' can be reduced. [variableScope]
    DWORD k = 0, idx = 0, i = 0;
                          ^
disasmtool\disasmtool.c:1176:36: style: The scope of the variable 'istart' can be reduced. [variableScope]
    unsigned long long icount = 0, istart, iend, start, end, itotal = 0;
                                   ^
disasmtool\disasmtool.c:1176:44: style: The scope of the variable 'iend' can be reduced. [variableScope]
    unsigned long long icount = 0, istart, iend, start, end, itotal = 0;
                                           ^
disasmtool\disasmtool.c:1404:12: style: The scope of the variable 'hFile' can be reduced. [variableScope]
    HANDLE hFile;
           ^
disasmtool\disasmtool.c:1417:52: style: Argument 'sizeof(char)*decFileNameLength' to function malloc is always 23 [constArgument]
        fNameDecoded = (char *)malloc(sizeof(char) * decFileNameLength);
                                                   ^
disasmtool\disasmtool.c:1416:35: note: Assignment 'decFileNameLength=sizeof("hex_string_decoded.bin")', assigned value is 23
        decFileNameLength = sizeof("hex_string_decoded.bin");
                                  ^
disasmtool\disasmtool.c:1417:52: note: Argument 'sizeof(char)*decFileNameLength' to function malloc is always 23
        fNameDecoded = (char *)malloc(sizeof(char) * decFileNameLength);
                                                   ^
Checking disasmtool\disasmtool.c: BIG_ENDIAN...
Checking disasmtool\disasmtool.c: _MSC_VER...
disasmtool\disasmtool.c:77:0: style: The function 'nd_memset' is never used. [unusedFunction]

^
disasmtool\disasmtool.c:66:0: style: The function 'nd_vsnprintf_s' is never used. [unusedFunction]

^

Doing this for disasmtool_lix:

$ cppcheck --language=c++ --enable=all --suppress=missingIncludeSystem -I inc/ -I inc/bdshemu -I bddisasm/include -I disasmtool_lix/ disasmtool_lix
Checking disasmtool_lix/disasmtool.cpp ...
disasmtool_lix/external/argparse.h:123:5: style: Class 'Result' has a constructor with 1 argument that is not explicit. [noExplicitConstructor]
    Result(std::string err) noexcept : _error(true), _what(err) {}
    ^
disasmtool_lix/disasmtool.cpp:531:9: portability: %lu in format string (no. 4) requires 'unsigned long' but the argument type is 'size_t {aka unsigned long}'. [invalidPrintfArgType_uint]
        printf("Disassembled %zu instructions took %ld.%09ld seconds, %lu ns / instr.\n",
        ^
disasmtool_lix/external/argparse.h:325:14: style: The scope of the variable 'arg_len' can be reduced. [variableScope]
      size_t arg_len;
             ^
disasmtool_lix/external/argparse.h:123:24: performance: Function parameter 'err' should be passed by const reference. [passedByValue]
    Result(std::string err) noexcept : _error(true), _what(err) {}
                       ^
Checking disasmtool_lix/disasmtool.cpp: BIG_ENDIAN...
Checking disasmtool_lix/disasmtool.cpp: HAS_RAPIDJSON...
Checking disasmtool_lix/disasmtool.cpp: _MSC_VER...
Checking disasmtool_lix/disasmtool.cpp: _WIN32...
Checking disasmtool_lix/disasmtool.cpp: __unix__...
1/3 files checked 17% done
Checking disasmtool_lix/dumpers.cpp ...
Checking disasmtool_lix/dumpers.cpp: BIG_ENDIAN...
Checking disasmtool_lix/dumpers.cpp: HAS_RAPIDJSON...
Checking disasmtool_lix/dumpers.cpp: _MSC_VER...
2/3 files checked 76% done
Checking disasmtool_lix/rapidjson.cpp ...
Checking disasmtool_lix/rapidjson.cpp: BIG_ENDIAN...
Checking disasmtool_lix/rapidjson.cpp: HAS_RAPIDJSON...
Checking disasmtool_lix/rapidjson.cpp: _MSC_VER...
3/3 files checked 100% done
disasmtool_lix/rapidjson.cpp:65:0: style: The function '_write_int' is never used. [unusedFunction]

^
disasmtool_lix/rapidjson.cpp:91:0: style: The function '_write_int64' is never used. [unusedFunction]

^
disasmtool_lix/disasmtool.cpp:450:0: style: The function 'disassemble_one' is never used. [unusedFunction]

^
disasmtool_lix/rapidjson.cpp:1171:0: style: The function 'json_to_string' is never used. [unusedFunction]

^
disasmtool_lix/disasmtool.cpp:76:0: style: The function 'nd_memset' is never used. [unusedFunction]

^
disasmtool_lix/disasmtool.cpp:70:0: style: The function 'nd_vsnprintf_s' is never used. [unusedFunction]

^
nofile:0:0: information: Cppcheck cannot find all the include files (use --check-config for details) [missingInclude]

Checking for ignored return values should be done by the compiler: -Wunused-result should be on by default for gcc, but I might be wrong (however, adding it to the bddisasm and bdshemu makefiles does not trigger any warnings).

elfring commented 3 years ago

I suggest to take another look at the possible detection of remaining questionable source code places.

ianichitei commented 3 years ago

-Wunused-result should be on by default for gcc, but I might be wrong

This is incorrect, as the docs put it:

-Wno-unused-result Do not warn if a caller of a function marked with attribute warn_unused_result (see Function Attributes) does not use its return value. The default is -Wunused-result.

But, as I said, explicitly adding -Wunused-result to bddisasm and bdshemu does not trigger any warnings.

More checks can be added, I agree (there's always something extra that can be checked).

For the core libraries there is also support for fuzzing (which includes support for sanitizers like ASAN and MSAN).

elfring commented 3 years ago

Is it interesting anyhow that source code places could be reconsidered also according to two function names which I mentioned for this issue initially?

ianichitei commented 3 years ago

I suppose you're asking about this.

Would you like to add more error handling for return values from functions like the following?

The fread problem is in the fuzzer harness. That's not critical, and there's no reliable way of treating it in my opinion. I can retry to read parts of the file a few times and then just give up, but either way I'm going to ignore a fuzz input. The printf problem is not really a problem. Again, there's no reliable way of reporting a printf error to the user, and I think that most projects don't try to treat printf errors in any way.

elfring commented 3 years ago

I found just a few source code places for further development considerations in this software.

ianichitei commented 3 years ago
  • Would you like to recheck the capabilities of other known analysis tools accordingly?

I'm considering clang tidy, but at the moment I don't have any patches for that. If you want to make one I'll gladly review it. Our internal build system also uses sonarqube, but I haven't looked into using that with GitHub yet.

  • I propose that you reconsider the views for consistent error reporting.

Fixing the fread and printf issues does not add any value to the project. The first one is in a tool that is not used in production or by a person, but by an automated process. The second one is just a printf error - how do you propose treating these?

elfring commented 3 years ago

Fixing the fread and printf issues does not add any value to the project.

I present other software development opinions for this aspect.

The second one is just a printf error - how do you propose treating these?

There are return values (which should not be ignored) provided by such functions. Function call failures can be reported in some ways as usual.