ihedvall / mdflib

Implementation of the ASAM MDF data file.
https://ihedvall.github.io/mdflib/
MIT License
57 stars 25 forks source link

Heap buffer overflow on `GetByteArrayValue` function #88

Closed g0ku704 closed 1 day ago

g0ku704 commented 1 week ago

Hi,

The GetByteArrayValue function https://github.com/ihedvall/mdflib/blob/adee6fc499d2c1357fa55d798cdbad92c06317be/mdflib/src/cn4block.cpp#L692 does not check maximum allocation and caused heap buffer overflow.

The sample code tested: https://ihedvall.github.io/mdflib/mdfreader.html

Address sanitizer output:

==201131==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x61300000116e at pc 0x56394a8acf94 bp 0x7ffd15d462d0 sp 0x7ffd15d45a98
READ of size 96 at 0x61300000116e thread T0
    #0 0x56394a8acf93 in memcpy (/mdflib/mdflibrary_example/src/build/mdfreader+0x74f93) (BuildId: a659090b6c4bd16d)
    #1 0x7fbd4e096e97 in mdf::detail::Cn4Block::GetByteArrayValue(std::vector<unsigned char, std::allocator<unsigned char> > const&, std::vector<unsigned char, std::allocator<unsigned char> >&) const /mdflib/mdflib/src/cn5block.cpp
    #2 0x7fbd4e108a8b in bool mdf::IChannel::GetChannelValue<std::vector<unsigned char, std::allocator<unsigned char> > >(std::vector<unsigned char, std::allocator<unsigned char> > const&, std::vector<unsigned char, std::allocator<unsigned char> >&, unsigned long) const /mdflib/mdflib/src/ichannel.cpp:507:15
    #3 0x7fbd4dfbd084 in mdf::detail::ChannelObserver<std::vector<unsigned char, std::allocator<unsigned char> > >::OnSample(unsigned long, unsigned long, std::vector<unsigned char, std::allocator<unsigned char> > const&) /mdflib/mdflib/src/channelobserver.h:147:30
    #4 0x7fbd4e16669d in mdf::IDataGroup::NotifySampleObservers(unsigned long, unsigned long, std::vector<unsigned char, std::allocator<unsigned char> > const&) const /mdflib/mdflib/src/idatagroup.cpp:39:17
    #5 0x7fbd4e16669d in mdf::detail::ReadCache::ParseRecord() /mdflib/mdflib/src/readcache.cpp:94:23
    #6 0x7fbd4e06f637 in mdf::detail::Dg4Block::ReadData(_IO_FILE*) /mdflib/mdflib/src/dg4block.cpp:274:21
    #7 0x7fbd4dfe70cc in mdf::MdfReader::ReadData(mdf::IDataGroup&) /mdflib/mdflib/src/mdfreader.cpp:396:11
    #8 0x56394a95779f in main /mdflib/mdflibrary_example/src/fuzz_test.cpp:42:16
    #9 0x7fbd4d9a3d8f in __libc_start_call_main csu/../sysdeps/nptl/libc_start_call_main.h:58:17
    #10 0x7fbd4d9a3e3f in __libc_start_main csu/../csu/libc-start.c:392:3
    #11 0x56394a892d44 in _start (/mdflib/mdflibrary_example/src/build/mdfreader+0x5ad44) (BuildId: a659090b6c4bd16d)

0x61300000116e is located 0 bytes to the right of 366-byte region [0x613000001000,0x61300000116e)
allocated by thread T0 here:
    #0 0x56394a9509dd in operator new(unsigned long) (/mdflib/mdflibrary_example/src/build/mdfreader+0x1189dd) (BuildId: a659090b6c4bd16d)
    #1 0x7fbd4e1663b1 in std::__new_allocator<unsigned char>::allocate(unsigned long, void const*) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/new_allocator.h:137:27
    #2 0x7fbd4e1663b1 in std::allocator_traits<std::allocator<unsigned char> >::allocate(std::allocator<unsigned char>&, unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/alloc_traits.h:464:20
    #3 0x7fbd4e1663b1 in std::_Vector_base<unsigned char, std::allocator<unsigned char> >::_M_allocate(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_vector.h:378:20
    #4 0x7fbd4e1663b1 in std::_Vector_base<unsigned char, std::allocator<unsigned char> >::_M_create_storage(unsigned long) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_vector.h:395:33
    #5 0x7fbd4e1663b1 in std::_Vector_base<unsigned char, std::allocator<unsigned char> >::_Vector_base(unsigned long, std::allocator<unsigned char> const&) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_vector.h:332:9
    #6 0x7fbd4e1663b1 in std::vector<unsigned char, std::allocator<unsigned char> >::vector(unsigned long, unsigned char const&, std::allocator<unsigned char> const&) /usr/lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/stl_vector.h:566:9
    #7 0x7fbd4e1663b1 in mdf::detail::ReadCache::ParseRecord() /mdflib/mdflib/src/readcache.cpp:90:30

SUMMARY: AddressSanitizer: heap-buffer-overflow (/mdflib/mdflibrary_example/src/build/mdfreader+0x74f93) (BuildId: a659090b6c4bd16d) in memcpy
Shadow bytes around the buggy address:
  0x0c267fff81d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c267fff81e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c267fff81f0: 00 00 fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c267fff8210: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c267fff8220: 00 00 00 00 00 00 00 00 00 00 00 00 00[06]fa fa
  0x0c267fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c267fff8270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==201131==ABORTING

Proof of concept input: crashfile.zip

ihedvall commented 1 week ago

Thanks for finding a bug. There are some issues with the file (crashfile.mf4) but at the same time, the file includes all evil MDF types and conversions. I have checked it with the Vectors MDF Validator and my MDF File Viewer. None of them crash so I need to test your exact code. I assume it is the basic example code. I'm running Windows also so it may take some day(s) to figure out the problem.

Preliminary, the file have some issue in the second DG block with the composite CN block (struct_channel_0..7) and the nested structure CN block. It's the last 2 channels.

The Vector MDF Validator detects 13 errors and 64 warnings in the file.

g0ku704 commented 1 week ago

Sure, yes I had to slightly modify the example code. Here it is:

#include <iostream>
#include <mdf/mdfreader.h>
#include <mdf/ichannelgroup.h>
#include <mdf/idatagroup.h>

using namespace mdf;

int main(int argc, char* argv[]) {
    if (argc < 2) {
        std::cerr << "Usage: " << argv[0] << " <path_to_mdf_file>" << std::endl;
        return 1;
    }

    const char* file_path = argv[1];
    MdfReader reader(file_path);  // Open the specified file

    // Read all blocks but not the raw data and attachments.
    // This reads in the block information into memory.
    reader.ReadEverythingButData();

    const auto* mdf_file = reader.GetFile(); // Get the file interface.
    DataGroupList dg_list;                   // Get all measurements.
    mdf_file->DataGroups(dg_list);

    // In this example, we read in all sample data and fetch all values.
    for (auto* dg4 : dg_list) {
        // Subscribers holds the sample data for a channel.
        // You should normally only subscribe on some channels.
        // We need a list to hold them.
        ChannelObserverList subscriber_list;
        const auto cg_list = dg4->ChannelGroups();
        for (const auto* cg4 : cg_list ) {
            const auto cn_list = cg4->Channels();
            for (const auto* cn4 : cn_list) {
                // Create a subscriber and add it to the temporary list
                auto sub = CreateChannelObserver(*dg4, *cg4, *cn4);
                subscriber_list.push_back(std::move(sub));
            }
        }

        // Now it is time to read in all samples
        reader.ReadData(*dg4); // Read raw data from file
        double channel_value = 0.0; // Channel value (no scaling)
        double eng_value = 0.0; // Engineering value
        for (auto& obs : subscriber_list) {
            for (size_t sample = 0; sample < obs->NofSamples(); ++sample) {
                const auto channel_valid = obs->GetChannelValue(sample, channel_value);
                const auto eng_valid = obs->GetEngValue(sample, eng_value);
                // You should do something with data here
            }
        }

        // Not needed in this example as we delete the subscribers,
        // but it is good practise to remove samples data from memory
        // when it is no longer needed.
        //dg4->ResetSample();
    }
    reader.Close(); // Close the file

    return 0;
}
ihedvall commented 1 week ago

I might have found the error. The level22 channel have invalid offsets causing overrun when reading. I have added some range test to harden the code so it's now returning non-valid values which is a fair result.

I cannot reproduce your error in my environment but if you filter out the level22 channel subscription from your test code, the error should disappear.

The algebraic (formula) conversion is currently not supported. If you need support for that conversion method, please let me know.

I just need to check if the value range to value conversion is correct. It should work so I need to check this. It's currently returning invalid values.

g0ku704 commented 1 week ago

Thanks, in case if you need to reproduce the AddressSanitizer crash above, you can try compile the sample code above with -fsanitize=address, that should give the same error in the issue description. If, not then it means they are fixed.

ihedvall commented 1 week ago

I have made some changes regarding the buffer overrun. Tested on Windows.

ihedvall commented 1 day ago

I have added a fuzzytest.sln and verified that the heap buffer overflow problem is fixed. The file have an invalid index causing the problem. The fix is to set all values invalid for that channel but OK for the remaining channels.