This was originally skipped in #361 because I was getting an unreachable code error. I found it was possible to avoid that by providing a type annotation for the mode variable of the state machine instead of hard coding it to :none.
Sadly, I can't do the type annotation with T.let() because sorbet isn't a runtime dependency of pdf-reader, but I can hack it in via a sig in the RBI file if I make the initial state part of the method params.
For this particular class at least, it did. I modified the fuzzer in #248 like this and ran it for ~15 mins. No exceptions were raised from the CMap class.
diff --git a/tools/fuzz.rb b/tools/fuzz.rb
index ab81840..e87fbc6 100755
--- a/tools/fuzz.rb
+++ b/tools/fuzz.rb
@@ -182,7 +182,7 @@ def parse(reader)
contents = ''
reader.pages.each do |page|
contents << page.fonts.to_s
- contents << page.text.force_encoding('utf-8')
+ contents << page.text #.force_encoding('utf-8')
contents << page.raw_content.force_encoding('utf-8')
end
# puts contents if VERBOSE
@@ -203,12 +203,14 @@ def summary
# and save fuzz test case and backtrace to OUTPUT_DIR
#
def report_crash(e)
- puts " - #{e.message}"
- puts e.backtrace.first
- fname = "#{DateTime.now.strftime('%Y%m%d%H%M%S%N')}_crash_#{rand(1000)}"
- FileUtils.mv @fuzz_outfile, "#{OUTPUT_DIR}/#{fname}.pdf"
- File.open("#{OUTPUT_DIR}/#{fname}.pdf.trace", 'w') do |file|
- file.write "#{e.message}\n#{e.backtrace.join "\n"}"
+ if e.backtrace.any? { |l| l.include?("cmap") }
+ puts " - #{e.message}"
+ puts e.backtrace.first
+ fname = "#{DateTime.now.strftime('%Y%m%d%H%M%S%N')}_crash_#{rand(1000)}"
+ FileUtils.mv @fuzz_outfile, "#{OUTPUT_DIR}/#{fname}.pdf"
+ File.open("#{OUTPUT_DIR}/#{fname}.pdf.trace", 'w') do |file|
+ file.write "#{e.message}\n#{e.backtrace.join "\n"}"
+ end
end
end
This was originally skipped in #361 because I was getting an unreachable code error. I found it was possible to avoid that by providing a type annotation for the
mode
variable of the state machine instead of hard coding it to:none
.Sadly, I can't do the type annotation with
T.let()
because sorbet isn't a runtime dependency of pdf-reader, but I can hack it in via a sig in the RBI file if I make the initial state part of the method params.I think this might be a sorbet bug: https://github.com/sorbet/sorbet/issues/4174#issuecomment-1014923965.
This was prompted by @bcoles highlighting some crashes via a fuzzer in https://github.com/yob/pdf-reader/pull/421#issuecomment-1012591423 - I was curious to see if fully typing the CMap class would avoid the fuzzer crashes in that class.
For this particular class at least, it did. I modified the fuzzer in #248 like this and ran it for ~15 mins. No exceptions were raised from the CMap class.