roo-rb / roo

Roo provides an interface to spreadsheets of several sorts.
MIT License

#<NoMethodError: undefined method `xpath' for nil:NilClass> #344

Open MartinTibo opened 8 years ago

MartinTibo commented 8 years ago

Hello everyone,

We are running into an issue when importing an ".xlsx" file with roo 2.5.1 (we can't share the file because it is confidential), and we don't know what the problem is.

The error: #<NoMethodError: undefined method `xpath' for nil:NilClass>

The error backtrace:

".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.5.1/lib/roo/excelx/sheet_doc.rb:194:in `extract_cells'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.5.1/lib/roo/excelx/sheet_doc.rb:18:in `cells'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.5.1/lib/roo/excelx/sheet.rb:18:in `cells'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.5.1/lib/roo/excelx/sheet.rb:22:in `present_cells'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.5.1/lib/roo/excelx/sheet.rb:58:in `last_row'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.5.1/lib/roo/excelx.rb:124:in `last_row'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.5.1/lib/roo/base.rb:361:in `each'",

This bug occurs when we call "each" on a sheet containing data:

input_file.sheet(sheet_name).each do |row|
    p row
end

We have traced the problem to this code in your gem, at line 178 of the "sheet_doc.rb" file:

def expand_merged_ranges(cells)
  # Extract merged ranges from xml
  merges = {}
  doc.xpath('/worksheet/mergeCells/mergeCell').each do |mergecell_xml|   # this is the line we suspect
    tl, br = mergecell_xml['ref'].split(/:/).map { |ref| ::Roo::Utils.ref_to_key(ref) }
    for row in tl[0]..br[0] do
      for col in tl[1]..br[1] do
        next if row == tl[0] && col == tl[1]
        merges[[row, col]] = tl
      end
    end
  end

The "doc" object seems to be nil. Have you ever encountered this problem? What could cause it? Where is the "doc" object defined?

We thank you in advance for your help.

stevendaniels commented 8 years ago

The doc object is defined in the parent class Roo::Excelx::Extractor in lib/roo/excelx/extractor.rb. I'd guess that your temp path doesn't exist, which is causing the doc method to return nil.
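To illustrate the failure mode described above, here is a minimal sketch in plain Ruby. It is not roo's actual code: `SketchExtractor` is a hypothetical stand-in that assumes the document is loaded lazily from a file in a temp directory and that the loader returns nil when the file is missing. `File.read` stands in for XML parsing so the example has no gem dependencies.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for an extractor that lazily loads a document
# from a file in a temp directory. This is NOT roo's implementation.
class SketchExtractor
  def initialize(path)
    @path = path
  end

  # Returns the file contents, or nil when the file no longer exists.
  def doc
    @doc ||= (File.read(@path) if File.exist?(@path))
  end
end

tmpdir = Dir.mktmpdir("sketch_")
path = File.join(tmpdir, "sheet1.xml")
File.write(path, "<worksheet/>")

extractor = SketchExtractor.new(path)
FileUtils.remove_entry(tmpdir) # simulate the temp directory being cleaned up

begin
  extractor.doc.length # doc is nil here, so any method call on it fails
rescue NoMethodError => e
  puts "#{e.class}: #{e.message}"
end
```

Under these assumptions, the error surfaces far from its cause: the caller sees a NoMethodError on nil rather than any hint that a temp file has disappeared.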

Could you try using roo 2.4.x to see if that fixes the issue? If it does, please let me know ASAP.

MartinTibo commented 8 years ago

I tried with roo 2.4.0 and the issue is the same: #<NoMethodError: undefined method `xpath' for nil:NilClass>

With this backtrace :

".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.4.0/lib/roo/excelx/sheet_doc.rb:187:in `extract_cells'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.4.0/lib/roo/excelx/sheet_doc.rb:18:in `cells'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.4.0/lib/roo/excelx/sheet.rb:18:in `cells'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.4.0/lib/roo/excelx/sheet.rb:22:in `present_cells'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.4.0/lib/roo/excelx/sheet.rb:58:in `last_row'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.4.0/lib/roo/excelx.rb:120:in `last_row'",
 ".../.rvm/gems/ruby-2.0.0-p247/gems/roo-2.4.0/lib/roo/base.rb:359:in `each'"

It's weird because it doesn't happen every time. We have an xlsx file whose format does not change; only certain data/values change. Sometimes it works and we can iterate over the rows of a sheet with "each", and sometimes it returns this error. We suspect the problem may come from references that some cells contain. Do you have any ideas?

brunolarouche commented 8 years ago

For me, rolling back the gem to version 2.3.2 fixed the issue. I can also confirm that I still have the bug on version 2.4.0.

MartinTibo commented 8 years ago

I made this change and I still got the same error. Any other suggestions?

stevendaniels commented 7 years ago

I've looked into this issue and I can't duplicate the situation. Here's a sample gist that does not demonstrate the issue (https://gist.github.com/stevendaniels/02fc3ef2057b95b2fc02b10a29ce2053).

I also spent some time looking at the commits between version 2.3.2 and 2.4.0 and I cannot find any changes that would change this behavior. I still suspect this is related to the temp files that roo makes getting deleted by the system. Once that happens, the Roo::Excelx::Extractor#doc will return nil. The effect is a bit confusing, so I'm considering adding an error for this situation.
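As a rough sketch of what such an error could look like (the class and error names below are hypothetical and are not part of roo), a loader can raise a descriptive exception instead of returning nil when the backing file is gone. `File.read` again stands in for XML parsing.

```ruby
# Hypothetical error class; roo may name or structure this differently.
class SketchMissingFileError < StandardError; end

# Hypothetical extractor that fails loudly when its backing file is missing.
class SketchStrictExtractor
  def initialize(path)
    @path = path
  end

  # Returns the file contents, or raises if the file does not exist.
  def doc
    @doc ||= begin
      unless @path && File.exist?(@path)
        raise SketchMissingFileError,
              "extracted file is missing (temp files removed?): #{@path.inspect}"
      end
      File.read(@path)
    end
  end
end

# The path below is illustrative and assumed not to exist.
missing_path = File.join(Dir.pwd, "sketch_missing_dir", "sheet1.xml")

begin
  SketchStrictExtractor.new(missing_path).doc
rescue SketchMissingFileError => e
  warn e.message
end
```

The point of this design is that the message names the missing path, so the failure points at the temp files rather than at an unrelated method call on nil.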

If someone provides a sample file that can demonstrate the issue, I can better understand this issue.