BYVoid / OpenCC

Conversion between Traditional and Simplified Chinese
https://opencc.byvoid.com/
Apache License 2.0

Controllable memory allocations in MarisaDict::NewFromFile(), which can cause DOS #817

Open morningbread opened 1 year ago

morningbread commented 1 year ago

Hi, I found a controllable memory allocation in MarisaDict::NewFromFile() that can cause a denial of service (DoS).

The PoC is in the attachment. To reproduce, compile opencc_dict with AddressSanitizer and run:

./opencc_dict -i poc -o tmp -f ocd2 -t text

ASan then reports the following error:

==3569078==ERROR: AddressSanitizer: requested allocation size 0x2000000000008 (0x2000000001008 after adjustments for alignment, red zones etc.) exceeds maximum supported size of 0x10000000000 (thread T0)                                     
#0 0x4cd11d in operator new[](unsigned long, std::nothrow_t const&) /home/brian/src/llvm_releases/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:104:3                                         
#1 0x7eff1e183476 in marisa::grimoire::vector::Vector<unsigned int>::realloc(unsigned long) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/vector/vector.h:230:9                                                              
#2 0x7eff1e183476 in marisa::grimoire::vector::Vector<unsigned int>::reserve(unsigned long) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/vector/vector.h:96:5                                                               
#3 0x7eff1e183476 in marisa::grimoire::vector::Vector<unsigned int>::resize(unsigned long) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/vector/vector.h:59:5    
#4 0x7eff1e183476 in marisa::grimoire::vector::Vector<unsigned int>::read_(marisa::grimoire::io::Reader&) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/vector/vector.h:215:5
#5 0x7eff1e18126a in marisa::grimoire::vector::Vector<unsigned int>::read(marisa::grimoire::io::Reader&) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/vector/vector.h:34:10
#6 0x7eff1e18126a in marisa::grimoire::vector::BitVector::read_(marisa::grimoire::io::Reader&) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/vector/bit-vector.h:158:15
#7 0x7eff1e173c8a in marisa::grimoire::vector::BitVector::read(marisa::grimoire::io::Reader&) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/vector/bit-vector.h:37:10
#8 0x7eff1e15a8b3 in marisa::grimoire::trie::LoudsTrie::read_(marisa::grimoire::io::Reader&) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/trie/louds-trie.cc:562:10
#9 0x7eff1e159fb5 in marisa::grimoire::trie::LoudsTrie::read(marisa::grimoire::io::Reader&) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/grimoire/trie/louds-trie.cc:44:8
#10 0x7eff1e14b4d0 in marisa::TrieIO::fread(_IO_FILE*, marisa::Trie*) /home/coco/work/OpenCC/deps/marisa-0.2.6/lib/marisa/trie.cc:188:11
#11 0x7eff1e0ec9b4 in opencc::MarisaDict::NewFromFile(_IO_FILE*) /home/coco/work/OpenCC/src/MarisaDict.cpp:105:3
#12 0x7eff1e0c3854 in bool opencc::SerializableDict::TryLoadFromFile<opencc::MarisaDict>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::shared_ptr<opencc::MarisaDict>*) /home/coco/work/OpenCC/src/SerializableDict.hpp:62:40
#13 0x7eff1e0d87af in std::shared_ptr<opencc::MarisaDict> opencc::SerializableDict::NewFromFile<opencc::MarisaDict>(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/coco/work/OpenCC/src/SerializableDict.hpp:71:10
#14 0x7eff1e0d87af in LoadDictionary(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/coco/work/OpenCC/src/DictConverter.cpp:38:12
#15 0x7eff1e0d90f4 in opencc::ConvertDictionary(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) /home/coco/work/OpenCC/src/DictConverter.cpp:65:22
#16 0x4db5cf in main /home/coco/work/OpenCC/src/tools/DictConverter.cpp:48:5
#17 0x7eff1db1d082 in __libc_start_main /build/glibc-5hggjy/glibc-2.31/csu/../csu/libc-start.c:308:16

==3569078==HINT: if you don't care about these errors you may set allocator_may_return_null=1
SUMMARY: AddressSanitizer: allocation-size-too-big /home/brian/src/llvm_releases/llvm-project/llvm/utils/release/final/llvm-project/compiler-rt/lib/asan/asan_new_delete.cpp:104:3 in operator new[](unsigned long, std::nothrow_t const&)
==3569078==ABORTING
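
In the trace above, Vector<unsigned int>::read_() calls resize(), which ends in an operator new[] request of 0x2000000000008 bytes, so the requested size appears to come straight from the input file. One possible mitigation is to check a size field against the amount of input actually available before allocating. The following is only a minimal sketch of that idea: ReadU32Array and its length-prefixed layout (an 8-byte byte count followed by the payload) are hypothetical and chosen for illustration; they are not OpenCC or marisa-trie code, and the real file format may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

// Hypothetical helper, not part of OpenCC or marisa-trie.
// Assumed layout: an 8-byte byte count (host byte order) followed by
// that many bytes of 32-bit values.
std::optional<std::vector<std::uint32_t>> ReadU32Array(
    const unsigned char* data, std::size_t size) {
  std::uint64_t byte_count = 0;
  if (size < sizeof(byte_count)) {
    return std::nullopt;  // Too short to even hold the size field.
  }
  std::memcpy(&byte_count, data, sizeof(byte_count));

  const std::size_t remaining = size - sizeof(byte_count);
  // Reject a size the input cannot back *before* allocating anything.
  if (byte_count > remaining) {
    return std::nullopt;
  }
  // The payload must be a whole number of 32-bit elements.
  if (byte_count % sizeof(std::uint32_t) != 0) {
    return std::nullopt;
  }

  std::vector<std::uint32_t> out(
      static_cast<std::size_t>(byte_count / sizeof(std::uint32_t)));
  if (!out.empty()) {
    std::memcpy(out.data(), data + sizeof(byte_count),
                static_cast<std::size_t>(byte_count));
  }
  return out;
}
```

With a check like this, a header claiming 0x2000000000008 bytes in an 8-byte input is rejected instead of reaching the allocator. When reading from a FILE* rather than a buffer, the same bound could be derived from the file size, assuming the stream is seekable.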

poc.zip