ruby-docx / docx

a ruby library/gem for interacting with .docx files

Fuzzer + various crashes #127

Open bcoles opened 2 years ago

bcoles commented 2 years ago

Here's an extremely rudimentary, naive fuzzer for docx:

```ruby
#!/usr/bin/env ruby
###################################################
# ----------------------------------------------- #
# Fuzz docx Ruby gem with mutated DOCX files      #
# ----------------------------------------------- #
#                                                 #
# Each test case is written to 'fuzz.docx' in the #
# current working directory.                      #
#                                                 #
# Crashes and the associated backtrace are saved  #
# in the 'crashes' directory in the current       #
# working directory.                              #
#                                                 #
###################################################
# ~ bcoles

require 'date'
require 'docx'
require 'colorize'
require 'fileutils'
require 'timeout'
require 'securerandom'

VERBOSE = false
OUTPUT_DIR = "#{Dir.pwd}/crashes".freeze

#
# Show usage
#
def usage
  puts 'Usage: ./fuzz.rb <FILE1> [FILE2] [FILE3] [...]'
  puts 'Example: ./fuzz.rb spec/fixtures/**.docx'
  exit 1
end

#
# Print status message
#
# @param [String] msg message to print
#
def print_status(msg = '')
  # msg.to_s: parse passes non-String objects (arrays, hashes)
  puts '[*] '.blue + msg.to_s if VERBOSE
end

#
# Print progress messages
#
# @param [String] msg message to print
#
def print_good(msg = '')
  puts '[+] '.green + msg.to_s if VERBOSE
end

#
# Print error message
#
# @param [String] msg message to print
#
def print_error(msg = '')
  puts '[-] '.red + msg
end

#
# Setup environment
#
def setup
  FileUtils.mkdir_p OUTPUT_DIR unless File.directory? OUTPUT_DIR
rescue => e
  print_error "Could not create output directory '#{OUTPUT_DIR}': #{e}"
  exit 1
end

#
# Generate a mutated DOCX file with a single mutated byte
#
# @param [Path] f path to DOCX file
#
def mutate_byte(f)
  data = IO.binread f
  position = SecureRandom.random_number data.size
  new_byte = SecureRandom.random_number 256
  new_data = data.dup.tap { |s| s.setbyte(position, new_byte) }

  # 'wb': the payload is binary, so avoid newline translation
  File.open(@fuzz_outfile, 'wb') do |file|
    file.write new_data
  end
end

#
# Generate a mutated DOCX file with multiple mutated bytes
#
# @param [Path] f path to DOCX file
#
def mutate_bytes(f)
  data = IO.binread f
  fuzz_factor = 200
  # Between 1 and ceil(size / fuzz_factor) single-byte writes
  num_writes = rand((data.size / fuzz_factor.to_f).ceil) + 1

  new_data = data.dup
  num_writes.times do
    position = SecureRandom.random_number data.size
    new_byte = SecureRandom.random_number 256
    new_data.tap { |stream| stream.setbyte position, new_byte }
  end

  File.open(@fuzz_outfile, 'wb') do |file|
    file.write new_data
  end
end

#
# Generate a mutated DOCX file with every digit replaced by '-1'
#
# @param [Path] f path to DOCX file
#
def clobber_integers(f)
  data = IO.binread f
  new_data = data.dup.gsub(/\d/, '-1')

  File.open(@fuzz_outfile, 'wb') do |file|
    file.write new_data
  end
end

#
# Generate a mutated DOCX file with all strings 3 characters or longer
# replaced with 2000 'A' characters
#
# @param [Path] f path to DOCX file
#
def clobber_strings(f)
  data = IO.binread f
  new_data = data.dup.gsub(/[a-zA-Z]{3,}/, 'A' * 2000)

  File.open(@fuzz_outfile, 'wb') do |file|
    file.write new_data
  end
end

#
# Read a DOCX file
#
# @param [String] f path to DOCX file
#
def read(f)
  print_status "Processing '#{f}'"
  begin
    reader = Docx::Document.open(f)
  rescue => e
    # Errors that only mean the archive is unreadable are not reported as crashes
    if e.message == 'zlib error while inflating'
      print_status "Could not parse DOCX '#{f}': #{e.message}"
      return
    end
    if e.message == 'No such file or directory'
      print_status "Could not parse DOCX '#{f}': #{e.message}"
      return
    end
    raise
  end
  print_good 'Processing complete'

  print_status "Parsing '#{f}'"
  parse(reader)
  print_good 'Parsing complete'
end

#
# Parse DOCX
#
def parse(reader)
  print_status 'Parsing DOCX...'
  print_status reader.document_properties
  print_status reader.paragraphs
  print_status reader.bookmarks
  print_status reader.to_xml
  print_status reader.tables
  print_status reader.font_size
  print_status reader.hyperlinks
  print_status reader.hyperlink_relationships
  print_status reader.to_s
  print_status reader.to_html
  print_status reader.stream

  print_status 'Parsing DOCX contents...'
  contents = ''
  reader.bookmarks.each_pair do |bookmark_name, bookmark_object|
    contents << bookmark_object.to_s
  end

  reader.tables.each do |table|
    table.rows.each do |row|
      row.cells.each do |cell|
        contents << cell.text
      end
    end
  end

  # puts contents if VERBOSE
end

#
# Show summary of crashes
#
def summary
  puts
  puts "Complete! Crashes saved to '#{OUTPUT_DIR}'"
  puts
  puts `/usr/bin/head -n1 #{OUTPUT_DIR}/*.trace` if File.exist? '/usr/bin/head'
end

#
# Report error message to STDOUT
# and save fuzz test case and backtrace to OUTPUT_DIR
#
def report_crash(e)
  puts " - #{e.message}"
  puts e.backtrace.first
  fname = "#{DateTime.now.strftime('%Y%m%d%H%M%S%N')}_crash_#{rand(1000)}"
  FileUtils.mv @fuzz_outfile, "#{OUTPUT_DIR}/#{fname}.docx"
  File.open("#{OUTPUT_DIR}/#{fname}.docx.trace", 'w') do |file|
    file.write "#{e.message}\n#{e.backtrace.join "\n"}"
  end
end

#
# Test docx with the mutated file
#
def test
  Timeout.timeout(@timeout) do
    read @fuzz_outfile
  end
rescue SystemStackError => e
  report_crash e
rescue Timeout::Error => e
  report_crash e
rescue SyntaxError => e
  report_crash e
rescue => e
  # Only report errors whose backtrace passes through docx
  raise e unless e.backtrace.join("\n") =~ %r{docx}
  report_crash e
end

#
# Generate random byte mutations and run test
#
# @param [String] f path to DOCX file
#
def fuzz_bytes(f)
  iterations = 1000
  1.upto(iterations) do |i|
    print "\r#{(i * 100) / iterations} % (#{i} / #{iterations})"
    mutate_bytes f
    test
  end
end

#
# Generate integer mutations and run tests
#
# @param [String] f path to DOCX file
#
def fuzz_integers(f)
  clobber_integers f
  test
end

#
# Generate string mutations and run tests
#
# @param [String] f path to DOCX file
#
def fuzz_strings(f)
  clobber_strings f
  test
end

puts '-' * 60
puts '% Fuzzer for docx Ruby gem'
puts '-' * 60
puts

usage if ARGV[0].nil?

setup

@timeout = 15
@fuzz_outfile = 'fuzz.docx'

trap 'SIGINT' do
  puts
  puts 'Caught interrupt. Exiting...'
  summary
  exit 130
end

ARGV.each do |f|
  unless File.exist? f
    print_error "Could not find file '#{f}'"
    next
  end

  fuzz_integers f
  fuzz_strings f
  fuzz_bytes f

  puts '-' * 60
end

summary
```
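The fuzzer draws its mutations from `SecureRandom` and `rand`, so a run cannot be replayed exactly; only the saved `crashes/*.docx` files reproduce a failure. As a rough sketch (not part of the script above), the multi-byte mutation could be driven by a seeded `Random` instead. `mutate_bytes_seeded` and its `seed:` parameter are hypothetical names introduced here for illustration, and the write-count formula is copied from `mutate_bytes`.

```ruby
# Hypothetical variant of mutate_bytes: works on an in-memory string and
# uses a seeded PRNG, so the same seed always produces the same mutation.
def mutate_bytes_seeded(data, seed:, fuzz_factor: 200)
  rng = Random.new(seed)
  new_data = data.dup.force_encoding(Encoding::BINARY)
  return new_data if new_data.empty?

  # Same formula as mutate_bytes: between 1 and ceil(size / fuzz_factor) writes.
  num_writes = rng.rand((new_data.bytesize / fuzz_factor.to_f).ceil) + 1
  num_writes.times do
    new_data.setbyte(rng.rand(new_data.bytesize), rng.rand(256))
  end
  new_data
end

# The same seed yields byte-identical output; the input is left untouched.
sample = ('A' * 1000).b
puts mutate_bytes_seeded(sample, seed: 42) == mutate_bytes_seeded(sample, seed: 42)
```

Logging the seed next to each crash would then be enough to regenerate the test case from the original fixture, at the cost of weaker randomness than `SecureRandom`, which should not matter for this purpose.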

Here are the stack traces for the latest version on master, using the test data from ./spec/fixtures as input.

crashes.zip

Unique crash messages:

```
$ head -n 1 crashes/*.trace | fgrep -v "==>" | sort -u

1:33: FATAL: Unsupported encoding u
1:38: FATAL: Unsupported encoding DloF-8
1:38: FATAL: Unsupported encoding Ndlr-8
1:38: FATAL: Unsupported encoding orTF-8
1:38: FATAL: Unsupported encoding Uanc08
1:38: FATAL: Unsupported encoding UUnw48
1:41: FATAL: Unsupported encoding codinoF-8
ERROR: Undefined namespace prefix: //w:docDefaults//w:rPrDefault//w:rPr//w:sz
ERROR: Undefined namespace prefix: //w:document//w:body/w:p
ERROR: Undefined namespace prefix: //xmlns:Relationship[contains(@Type,'hyperlink')]
path name contains null byte
undefined method `close' for nil:NilClass
undefined method `value' for nil:NilClass
undefined method `xpath' for nil:NilClass
Unsupported compression method 100
Unsupported compression method 113
Unsupported compression method 116
Unsupported compression method 122
Unsupported compression method 124
Unsupported compression method 127
Unsupported compression method 12808
Unsupported compression method 138
Unsupported compression method 141
Unsupported compression method 143
Unsupported compression method 14344
Unsupported compression method 14856
Unsupported compression method 151
Unsupported compression method 154
Unsupported compression method 17160
Unsupported compression method 17416
Unsupported compression method 176
Unsupported compression method 178
Unsupported compression method 19976
Unsupported compression method 2
Unsupported compression method 207
Unsupported compression method 209
Unsupported compression method 222
Unsupported compression method 228
Unsupported compression method 236
Unsupported compression method 23816
Unsupported compression method 24072
Unsupported compression method 252
Unsupported compression method 2568
Unsupported compression method 26120
Unsupported compression method 27
Unsupported compression method 31
Unsupported compression method 33
Unsupported compression method 33288
Unsupported compression method 34568
Unsupported compression method 39688
Unsupported compression method 41992
Unsupported compression method 44040
Unsupported compression method 45320
Unsupported compression method 50696
Unsupported compression method 5384
Unsupported compression method 55560
Unsupported compression method 61448
Unsupported compression method 64776
Unsupported compression method 70
Unsupported compression method 776
Unsupported compression method 78
Unsupported compression method 79
Unsupported compression method 80
Unsupported compression method 85
Unsupported compression method 89
Unsupported compression method 90
Unsupported compression method 9992
zlib error while inflating
```
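Many of the messages above differ only in a number, an encoding name or an XPath expression, so `sort -u` keeps them as separate entries. A minimal sketch for grouping them by message shape might look like the following; `crash_signature` and `bucket_crashes` are hypothetical helpers, and the patterns are taken only from the messages listed here, so they are assumptions rather than a complete classification.

```ruby
# Hypothetical helper: reduce a crash message to its shape by replacing the
# variable parts seen in the list above with placeholders.
def crash_signature(message)
  message
    .sub(/\A\d+:\d+: /, '')                                 # "line:column: " prefix
    .sub(/(Unsupported encoding) .*/, '\1 <NAME>')          # mangled encoding name
    .sub(/(Unsupported compression method) \d+/, '\1 <N>')  # method number
    .sub(/(Undefined namespace prefix): .*/, '\1: <XPATH>') # XPath expression
end

# Count how many messages share each signature.
def bucket_crashes(messages)
  messages.group_by { |m| crash_signature(m) }.transform_values(&:size)
end

# Read the first line of every trace file written by the fuzzer and print
# the buckets, most frequent first. Prints nothing if no traces exist.
first_lines = Dir.glob('crashes/*.trace').map { |path| File.foreach(path).first.to_s.chomp }
bucket_crashes(first_lines).sort_by { |_, count| -count }.each do |signature, count|
  puts format('%5d  %s', count, signature)
end
```

Run from the directory containing `crashes`, this collapses the long list into a handful of buckets, which makes the nil-related `undefined method` errors easier to spot.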

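Several of these originate in underlying libraries rather than in docx itself: the compression-method and zlib errors appear to come from the ZIP layer, and the encoding and namespace-prefix errors from the XML parser.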

Most interesting are: