jsonkenl / xlsxir

Xlsx parser for the Elixir language.
MIT License

Test failure in Erlang 27.1: test get_cell returns correct content even with rich text (XlsxirTest) #130

nathany-copia closed this issue 1 month ago

nathany-copia commented 1 month ago
  1) test get_cell returns correct content even with rich text (XlsxirTest)
     test/xlsxir_test.exs:70
     ** (MatchError) no match of right hand side value: {:error, {:EXIT, {:function_clause, [{:zip, :update_zip64, [{:local_file_header, 20, 0, 8, 34596, 19146, 4211273224, 243, 392, :undefined, :undefined, :undefined, 0, 0, 15, 20, :undefined}, <<0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0>>], [file: ~c"zip.erl", line: 1942]}, {:lists, :foldl, 3, [file: ~c"lists.erl", line: 2146]}, {:zip, :get_z_file, 11, [file: ~c"zip.erl", line: 2252]}, {:zip, :get_z_files, 5, [file: ~c"zip.erl", line: 2223]}, {:zip, :do_unzip, 2, [file: ~c"zip.erl", line: 425]}, {:zip, :unzip, 2, [file: ~c"zip.erl", line: 410]}, {Xlsxir.Unzip, :extract_xml, 3, [file: ~c"lib/xlsxir/unzip.ex", line: 145]}, {Xlsxir.XlsxFile, :extract_all_xml_files, 2, [file: ~c"lib/xlsxir/xlsx_file.ex", line: 234]}, {Xlsxir.XlsxFile, :initialize, 2, [file: ~c"lib/xlsxir/xlsx_file.ex", line: 63]}, {Xlsxir, :extract, 4, [file: ~c"lib/xlsxir.ex", line: 87]}, {XlsxirTest, :"test get_cell returns correct content even with rich text", 1, [file: ~c"test/xlsxir_test.exs", line: 71]}, {ExUnit.Runner, :exec_test, 2, [file: ~c"lib/ex_unit/runner.ex", line: 485]}, {:timer, :tc, 2, [file: ~c"timer.erl", line: 590]}, {ExUnit.Runner, :"-spawn_test_monitor/4-fun-1-", 6, [file: ~c"lib/ex_unit/runner.ex", line: 407]}]}}}
     code: {:ok, pid} = extract(rb_path(), 0)
     stacktrace:
       test/xlsxir_test.exs:71: (test)
nathany-copia commented 1 month ago

[file: ~c"lib/xlsxir/unzip.ex", line: 154] seems to be where it fails?

nathany-copia commented 1 month ago
{
  :error,
  {:EXIT, {:function_clause, [
    {:zip, :update_zip64, [
      {:local_file_header, 20, 0, 8, 34596, 19146, 4211273224, 243, 392, :undefined, :undefined, :undefined, 0, 0, 15, 20, :undefined},
      <<0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0>>],
      [file: ~c"zip.erl", line: 1942]}, {:lists, :foldl, 3,
      [file: ~c"lists.erl", line: 2146]}, {:zip, :get_z_file, 11,
      [file: ~c"zip.erl", line: 2252]}, {:zip, :get_z_files, 5,
      [file: ~c"zip.erl", line: 2223]}, {:zip, :do_unzip, 2, [file: ~c"zip.erl", line: 425]}, {:zip, :unzip, 2,
      [file: ~c"zip.erl", line: 410]}, {Xlsxir.Unzip, :extract_xml, 3,
      [file: ~c"lib/xlsxir/unzip.ex", line: 154]},
      {Xlsxir.XlsxFile, :extract_all_xml_files, 2, [file: ~c"lib/xlsxir/xlsx_file.ex", line: 234]},
      {Xlsxir.XlsxFile, :initialize, 2, [file: ~c"lib/xlsxir/xlsx_file.ex", line: 63]},
      {Xlsxir, :extract, 4, [file: ~c"lib/xlsxir.ex", line: 86]},
      {XlsxirTest, :"test get_cell returns correct content even with rich text", 1,
      [file: ~c"test/xlsxir_test.exs", line: 71]}, {ExUnit.Runner, :exec_test, 2,
      [file: ~c"lib/ex_unit/runner.ex", line: 485]}, {:timer, :tc, 2,
      [file: ~c"timer.erl", line: 590]},
      {ExUnit.Runner, :"-spawn_test_monitor/4-fun-1-", 6, [file: ~c"lib/ex_unit/runner.ex", line: 407]}]}}
}

So this is some sort of error while unzipping, but I haven't figured out why yet. Debugging suggests that all the inputs are charlists, so that's not the cause.

extract_from_zip(path, file_list, :memory)

nathany-copia commented 1 month ago

https://www.erlang.org/patches/otp-27.1

nathany-copia commented 1 month ago

I suspect this is actually a bug in Erlang. Calling :zip.extract directly fails the same way:

:zip.extract(~c"./test/test_data/red_black.xlsx", [{:file_list, [~c"xl/worksheets/sheet1.xml", ~c"xl/styles.xml", ~c"xl/sharedStrings.xml", ~c"xl/workbook.xml"]}, :memory])
{:error,
 {:EXIT,
  {:function_clause,
   [
     {:zip, :update_zip64,
      [
        {:local_file_header, 20, 0, 8, 34596, 19146, 4211273224, 243, 392,
         :undefined, :undefined, :undefined, 0, 0, 15, 20, :undefined},
        <<0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0>>
      ], [file: ~c"zip.erl", line: 1942]},
     {:lists, :foldl, 3, [file: ~c"lists.erl", line: 2146]},
     {:zip, :get_z_file, 11, [file: ~c"zip.erl", line: 2252]},
     {:zip, :get_z_files, 5, [file: ~c"zip.erl", line: 2223]},
     {:zip, :do_unzip, 2, [file: ~c"zip.erl", line: 425]},
     {:zip, :unzip, 2, [file: ~c"zip.erl", line: 410]},
     {:elixir, :eval_external_handler, 3, [file: ~c"src/elixir.erl", line: 386]},
     {:erl_eval, :do_apply, 7, [file: ~c"erl_eval.erl", line: 904]}
   ]}}}
iex(1)> :zip.extract(~c"./test/test_data/red_black.xlsx")
{:error,
 {:EXIT,
  {:function_clause,
   [
     {:zip, :update_zip64,
      [
        {:local_file_header, 20, 0, 8, 34596, 19146, 3819151005, 228, 587,
         :undefined, :undefined, :undefined, 0, 0, 11, 20, :undefined},
        <<0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0>>
      ], [file: ~c"zip.erl", line: 1942]},
     {:lists, :foldl, 3, [file: ~c"lists.erl", line: 2146]},
     {:zip, :get_z_file, 11, [file: ~c"zip.erl", line: 2252]},
     {:zip, :get_z_files, 5, [file: ~c"zip.erl", line: 2223]},
     {:zip, :do_unzip, 2, [file: ~c"zip.erl", line: 425]},
     {:zip, :unzip, 2, [file: ~c"zip.erl", line: 410]},
     {:elixir, :eval_external_handler, 3, [file: ~c"src/elixir.erl", line: 386]},
     {:erl_eval, :do_apply, 7, [file: ~c"erl_eval.erl", line: 904]}
   ]}}}
1> zip:extract("./test/test_data/red_black.xlsx").
{error,{'EXIT',{function_clause,[{zip,update_zip64,
                                      [{local_file_header,20,0,8,34596,19146,3819151005,228,587,
                                                          undefined,undefined,undefined,0,0,11,20,undefined},
                                       <<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0>>],
                                      [{file,"zip.erl"},{line,1942}]},
                                 {lists,foldl,3,[{file,"lists.erl"},{line,2146}]},
                                 {zip,get_z_file,11,[{file,"zip.erl"},{line,2252}]},
                                 {zip,get_z_files,5,[{file,"zip.erl"},{line,2223}]},
                                 {zip,do_unzip,2,[{file,"zip.erl"},{line,425}]},
                                 {zip,unzip,2,[{file,"zip.erl"},{line,410}]},
                                 {erl_eval,do_apply,7,[{file,"erl_eval.erl"},{line,904}]},
                                 {shell,exprs,7,[{file,"shell.erl"},{line,893}]}]}}}

whereas the test file succeeds:

zip:extract("./test/test_data/test.xlsx").
{ok,["[Content_Types].xml","_rels/.rels",
     "xl/_rels/workbook.xml.rels","xl/workbook.xml",
     "xl/worksheets/sheet4.xml","xl/sharedStrings.xml",
     "xl/worksheets/sheet3.xml","xl/worksheets/sheet2.xml",
     "xl/worksheets/_rels/sheet1.xml.rels",
     "xl/worksheets/_rels/sheet6.xml.rels",
     "xl/worksheets/_rels/sheet9.xml.rels","xl/styles.xml",
     "xl/worksheets/sheet1.xml","xl/worksheets/sheet11.xml",
     "xl/worksheets/sheet8.xml","xl/worksheets/sheet7.xml",
     "xl/worksheets/sheet6.xml","xl/worksheets/sheet5.xml",
     "xl/worksheets/sheet9.xml","xl/theme/theme1.xml",
     "xl/worksheets/sheet10.xml","docProps/app.xml",
     "xl/printerSettings/printerSettings1.bin",
     "xl/printerSettings/printerSettings2.bin",
     "xl/printerSettings/printerSettings3.bin",
     "xl/calcChain.xml",
     [...]]}

The command-line unzip has no problem with red_black.xlsx:

unzip -d red_black/ test/test_data/red_black.xlsx 
Archive:  test/test_data/red_black.xlsx
  inflating: red_black/_rels/.rels   
  inflating: red_black/docProps/core.xml  
  inflating: red_black/docProps/app.xml  
  inflating: red_black/xl/workbook.xml  
  inflating: red_black/xl/_rels/workbook.xml.rels  
  inflating: red_black/xl/theme/theme1.xml  
  inflating: red_black/xl/worksheets/sheet1.xml  
  inflating: red_black/xl/sharedStrings.xml  
  inflating: red_black/xl/styles.xml  
  inflating: red_black/[Content_Types].xml  
nathany-copia commented 1 month ago

https://github.com/erlang/otp/issues/8872

nathany-copia commented 1 month ago

If I open test.xlsx in Apple Numbers 14.2 and export it to a new Excel file, that new file also fails to extract, whereas the original test.xlsx does unzip.

So it may be something with how Apple Numbers is saving the Excel file that Erlang 27.1 doesn't handle. However, I don't know how red_black.xlsx was created or if it's failing for the same reason.

❯ erl
Erlang/OTP 27 [erts-15.1] [source] [64-bit] [smp:10:10] [ds:10:10:10] [async-threads:1] [jit] [dtrace]

Eshell V15.1 (press Ctrl+G to abort, type help(). for help)
1> zip:extract("test-numbers.xlsx").
{error,{'EXIT',{function_clause,[{zip,update_zip64,
                                      [{local_file_header,20,0,8,23607,22846,2891291556,219,571,
                                                          undefined,undefined,undefined,0,0,11,20,undefined},
                                       <<0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0>>],
                                      [{file,"zip.erl"},{line,1942}]},
                                 {lists,foldl,3,[{file,"lists.erl"},{line,2146}]},
                                 {zip,get_z_file,11,[{file,"zip.erl"},{line,2252}]},
                                 {zip,get_z_files,5,[{file,"zip.erl"},{line,2223}]},
                                 {zip,do_unzip,2,[{file,"zip.erl"},{line,425}]},
                                 {zip,unzip,2,[{file,"zip.erl"},{line,410}]},
                                 {erl_eval,do_apply,7,[{file,"erl_eval.erl"},{line,904}]},
                                 {shell,exprs,7,[{file,"shell.erl"},{line,893}]}]}}}
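
One way to compare a failing archive with a working one is to look at the first local file header directly. The sketch below is a hypothetical helper (`ZipHeaderPeek` is not part of xlsxir or OTP) that parses the fixed little-endian layout of a ZIP local file header and returns the file name and the raw extra field. In the traces above, the second argument to `update_zip64` is 16 zero bytes; whether that binary comes from the entry's extra field is an assumption, and this helper is only meant to make it easy to check.

```elixir
defmodule ZipHeaderPeek do
  # Hypothetical diagnostic helper, not part of xlsxir or OTP.
  # Parses only the FIRST local file header of a ZIP archive.
  # All multi-byte fields in the header are little-endian.

  # Read the whole file and parse the header at offset 0.
  def peek(path) do
    with {:ok, bin} <- File.read(path), do: parse(bin)
  end

  # 0x04034B50 is the local file header signature ("PK\x03\x04" on disk).
  def parse(<<
        0x04034B50::little-32,
        version_needed::little-16,
        flags::little-16,
        method::little-16,
        _mod_time::little-16,
        _mod_date::little-16,
        crc32::little-32,
        comp_size::little-32,
        uncomp_size::little-32,
        name_len::little-16,
        extra_len::little-16,
        name::binary-size(name_len),
        extra::binary-size(extra_len),
        _rest::binary
      >>) do
    {:ok,
     %{
       version_needed: version_needed,
       flags: flags,
       method: method,
       crc32: crc32,
       comp_size: comp_size,
       uncomp_size: uncomp_size,
       name: name,
       # Raw extra field bytes, left undecoded on purpose.
       extra: extra
     }}
  end

  # Anything that does not start with a complete local file header.
  def parse(_other), do: {:error, :no_local_file_header}
end
```

Running `ZipHeaderPeek.peek("test/test_data/red_black.xlsx")` and `ZipHeaderPeek.peek("test/test_data/test.xlsx")` side by side would show whether the `extra` bytes differ between the archive that fails and the one that extracts.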
nathany-copia commented 1 month ago

It looks like this is fixed in Erlang OTP-27.1.1.
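
For anyone who wants to confirm which OTP patch release a given environment is running, the following is a minimal sketch. `OtpZipCheck` is a hypothetical module, and `affected?/1` assumes that 27.1 is the only release with this problem; the thread only establishes that 27.1 fails and that 27.1.1 appears to fix it. The full version string is read from the `OTP_VERSION` file in the release directory, falling back to the major release number if that file is missing.

```elixir
defmodule OtpZipCheck do
  # Hypothetical helper, not part of xlsxir or OTP.

  # Returns the full OTP version string (for example "27.1.1") when the
  # OTP_VERSION file is present, otherwise just the major release ("27").
  def otp_version do
    release = List.to_string(:erlang.system_info(:otp_release))
    root = List.to_string(:code.root_dir())
    path = Path.join([root, "releases", release, "OTP_VERSION"])

    case File.read(path) do
      {:ok, contents} -> String.trim(contents)
      {:error, _reason} -> release
    end
  end

  # Assumption: only the exact version "27.1" is affected.
  def affected?(version) when is_binary(version), do: version == "27.1"
end
```

A call such as `OtpZipCheck.affected?(OtpZipCheck.otp_version())` could then be used to decide whether to skip the rich-text test or print a hint to upgrade.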