pythonicrubyist / creek

Ruby library for parsing large Excel files.
http://rubygems.org/gems/creek
MIT License
388 stars 109 forks source link

Intermittent segfault on Nokogiri 1.13.6 -> 1.13.7 when processing Tempfile #105

Closed bf4 closed 1 year ago

bf4 commented 2 years ago

I've isolated this to creek (2.5.3, rails 7.0.3, cruby 3.0.3) but haven't figured out exactly why..

we store some xlsx files inline as bytea columns and load them into the Creek gem when processing

      fh = new_file_handler(file)
      book = load_file(fh)

where new_file_handler is

    fh = Tempfile.new(["creek-file", ".xlsx"], encoding: file.encoding)
    fh.write(file)
    fh.close
    fh

and load_file is

    path = fh.path
    Creek::Book.new(path, check_file_extension: false)

I thought the problem might be the fh.close before reading the file via path could change the behavior with the new nokogiri code, and the resolution would be to have an ensure block with book&.close when all done.

I tried that and I tried swapping out fh.write(file) with File.binwrite(fh.path, file) like Creek does for remote files, but that didn't resolve the seg faults. Only rolling back Nokogiri seems to have resolved them.

In any case, leaving this issue here in case any one else notices it.

d1ceward commented 2 years ago

I have identified the same problem on the Nokogiri::XML::Reader block from sheet.rb since the 1.13.7 update of Nokogiri.

I think it's from Nokogiri because ironically their last change note for 1.13.7 talks about solving a possible segfault on XML::Node with low-level ruby and garbage collector.

My current solution is the same, rollback nokogiri with :

gem 'nokogiri', '= 1.13.6'

Interesting parts of my segfault backtrace

./home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/nokogiri-1.13.7-x86_64-linux/lib/nokogiri/xml/reader.rb:90: [BUG] Segmentation fault at 0x0000000100000004
ruby 3.1.2p20 (2022-04-12 revision 4491bb740a) [x86_64-linux]

-- Control frame information -----------------------------------------------
c:0073 p:---- s:0371 e:000370 CFUNC  :attribute_nodes
c:0072 p:0003 s:0367 e:000366 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/nokogiri-1.13.7-x86_64-linux/lib/nokogiri/xml/reader.rb:90
c:0071 p:0207 s:0361 e:000360 BLOCK  /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/bundler/gems/creek-d63c47fca129/lib/creek/sheet.rb:121
c:0070 p:0010 s:0356 e:000355 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/nokogiri-1.13.7-x86_64-linux/lib/nokogiri/xml/reader.rb:102
c:0069 p:0024 s:0351 e:000350 BLOCK  /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/bundler/gems/creek-d63c47fca129/lib/creek/sheet.rb:100
c:0068 p:0218 s:0347 e:000346 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/rubyzip-2.3.2/lib/zip/entry.rb:540
c:0067 p:0011 s:0339 e:000338 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/rubyzip-2.3.2/lib/zip/file.rb:259
c:0066 p:0014 s:0333 e:000332 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/rubyzip-2.3.2/lib/zip/filesystem.rb:580
c:0065 p:0085 s:0327 e:000326 METHOD /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/rubyzip-2.3.2/lib/zip/filesystem.rb:245
c:0064 p:0032 s:0319 e:000318 BLOCK  /home/d1ceward/.asdf/installs/ruby/3.1.2/lib/ruby/gems/3.1.0/bundler/gems/creek-d63c47fca129/lib/creek/sheet.rb:99 [FINISH]
c:0063 p:---- s:0310 E:001130 CFUNC  :each
c:0062 p:---- s:0307 E:001ad0 CFUNC  :each_with_index
flavorjones commented 2 years ago

:eyes: Nokogiri maintainer here, I'll try to reproduce this. Any help you can provide would be great.

flavorjones commented 2 years ago

I could really use a reliable repro if either of you can provide one. Thanks.

flavorjones commented 2 years ago

Please open an issue upstream at https://github.com/sparklemotion/nokogiri with the information requested about your system and any kind of a repro you might have.

I know nothing about Creek so any help you can provide will help us diagnose what's going on faster.

bf4 commented 2 years ago

@flavorjones Thanks for looking ✨ I'll share what I can. I haven't been able to reproduce it outside of the test suite yet. I have a test which triggers is .. some times.. but at least it's a specific test

flavorjones commented 2 years ago

:+1: even a "sometimes" repro is good enough with most memory-related issues, because I can always run it in a loop.

flavorjones commented 2 years ago

Please follow along upstream at https://github.com/sparklemotion/nokogiri/issues/2598

flavorjones commented 2 years ago

Side note: I have a PR up that adds creek to the set of downstream gems that Nokogiri integration-tests against in CI.

See https://github.com/sparklemotion/nokogiri/runs/7454887315?check_suite_focus=true for an example.

flavorjones commented 1 year ago

FWIW this is addressed in Nokogiri v1.13.8 and later, and can probably be closed.

bf4 commented 1 year ago

Closing as resolved by new release