Closed PaulCarrick closed 12 years ago
@PaulCarrick Thank you very much for both using the gem, and taking the time to report! If the data is sensitive, can I ask you to email both your code an the sheet to me directly? Preferably in an encrypted archive, or at least a password protected archive.
I'll have a look and see if I can find out what's going on here.
Randy:
Thanks for responding so quickly.
I believe I have confirmed that the issue is indeed in the zip portion of the generated file and I have a work around although I'm not terribly fond of it. I have added code that unzips the file to a directory and then re-zips the file using zip. This has solved the issue for me for the time being but I really hate relying on having zip available and in a given location.
In relation to this issue. I suspect that some gem was updated and the issue is in my environment. In one environment it didn't generate the error but when I moved the code into the production environment it began to produce the problem. Also I was told that the problem didn't occur for a bit and the timing seems to suggest it may have been after a gem update.
This is a full dump of the gems installed: actionmailer (3.2.3, 3.2.2, 3.2.1, 3.1.3) actionpack (3.2.3, 3.2.2, 3.2.1, 3.1.3) activemodel (3.2.3, 3.2.2, 3.2.1, 3.1.3) activerecord (3.2.3, 3.2.2, 3.2.1, 3.1.3) activeresource (3.2.3, 3.2.2, 3.2.1, 3.1.3) activesupport (3.2.3, 3.2.2, 3.2.1, 3.1.3) acts_as_xlsx (1.0.5) arel (3.0.2, 3.0.0, 2.2.1) axlsx (1.1.1, 1.1.0) builder (3.0.0) bundler (1.1.3, 1.0.21) deep_merge (1.0.0) erubis (2.7.0) git (1.2.5) haml (3.1.4) hike (1.2.1) i18n (0.6.0) jeweler (1.8.3) journey (1.0.3, 1.0.1) json (1.6.6, 1.6.5) kgio (2.7.4, 2.7.2) mail (2.4.4, 2.4.1, 2.3.0) mime-types (1.18, 1.17.2) multi_json (1.3.1, 1.1.0, 1.0.4) nokogiri (1.5.2, 1.5.0) pg (0.13.2, 0.13.1, 0.12.2) polyglot (0.3.3) rack (1.4.1, 1.3.6) rack-cache (1.2, 1.1) rack-mount (0.8.3) rack-ssl (1.3.2) rack-test (0.6.1) rails (3.2.3, 3.2.2, 3.2.1, 3.1.3) railties (3.2.3, 3.2.2, 3.2.1, 3.1.3) raindrops (0.8.0) rake (0.9.2.2) rdoc (3.12) rmagick (2.13.1) ruby-ole (1.2.11.3) rubygems-update (1.8.22) rubyzip (0.9.7) spreadsheet (0.6.8) sprockets (2.4.0, 2.1.2, 2.0.3) thor (0.14.6) tilt (1.3.3) treetop (1.4.10) tzinfo (0.3.33, 0.3.32, 0.3.31) unicorn (4.2.1, 4.2.0) zip (2.0.2)
Attached is a zip file containing several items. I have managed to "sanitize" the data so it shouldn't be an issue.
bad-spreadsheet.xlsx: This is the spreadsheet as I was originally getting that causes the error in Excel.
repaired-spreadsheet.xlsx: This is the spreadsheet after it has been "repaired" in Excel.
workaround-spreadsheet.xlsx: This is the spreadsheet I am now producing by generating the file then unzipping and re-zipping it up.
render_report.rb: This is the main code. It's very large and still under development the section that should be of most interest to you is around lines 163-179. This is where the file is generated from the worksheet, Uncomment line 166 to see how the workaround works (it assumes a *nix environ ment with zip installed in /usr/bin). Executing this file should generate a spreadsheet. It normally runs in a rails environment but I added a test harness at the end so you can test it.
render_report_configuration.rb: This is a support file. It's only included so that you can run the main program if you want.
test.json: This is some test data used when the program is run.
pm_52.yml: This is a configuration file used by the program to generate the spreadsheet.
Thanks again for all your help with this. It is really appreciated.
Paul Carrick
Programmer Orcasnet.com E-Mail: paulc@orcasnet.com Web: http://www.orcasnet.comhttp://oni1/exchweb/bin/redir.asp?URL=http://www.orcasnet.com/
On Wed, Apr 18, 2012 at 5:01 PM, Randy Morgan < reply@reply.github.com
wrote:
@PaulCarrick Thank you very much for both using the gem, and taking the time to report! If the data is sensitive, can I ask you to email both your code an the sheet to me directly? Preferably in an encrypted archive, or at least a password protected archive.
I'll have a look and see if I can find out what's going on here.
Reply to this email directly or view it on GitHub: https://github.com/randym/axlsx/issues/81#issuecomment-5211936
After some digging it looks like this was an issue with the configuration on the testing server. Setting up a clean environment resolves the issue.
I have a hunch that rubyzip and zip gems, side by size is the issue as they both define module Zip.
Greetings @randym we are now experiencing this same issue. I am looking into it and will get back to you if I find the answer.
Greets @pmorton Thank you for reporting the issue.
Can I ask you to please post a listing of the gems and a bit of information about your environment? Unfortunately there are a few other gems out there besides rubyzip that define the Zip module and that has been the more common cause.
Here is the interesting part... If you use zipinfo to get a file listing on the one needing repair, you will see that there are only 9 files. Howerver if you unzip and and re-zip it and do a zipinfo you will see 15 files. Could it be that the index is not being created or closed properly.
Bad File
Downloads pmorton$ zipinfo needs-repair.xlsx
Archive: needs-repair.xlsx 4499 bytes 9 files
-rw-r--r-- 5.2 unx 577 t- defN 25-Jun-12 17:34 _rels/.rels
-rw-r--r-- 5.2 unx 486 t- defN 25-Jun-12 17:34 docProps/core.xml
-rw-r--r-- 5.2 unx 226 t- defN 25-Jun-12 17:34 docProps/app.xml
-rw-r--r-- 5.2 unx 407 t- defN 25-Jun-12 17:34 xl/_rels/workbook.xml.rels
-rw-r--r-- 5.2 unx 998 t- defN 25-Jun-12 17:34 [Content_Types].xml
-rw-r--r-- 5.2 unx 349 t- defN 25-Jun-12 17:34 xl/workbook.xml
-rw-r--r-- 5.2 unx 139 t- defN 25-Jun-12 17:34 xl/worksheets/_rels/sheet1.xml.rels
-rw-r--r-- 5.2 unx 3818 t- defN 25-Jun-12 17:34 xl/worksheets/sheet1.xml
9 files, 18203 bytes uncompressed, 3403 bytes compressed: 81.3%
Good File (Extracted XLSX and re-zipped with zip command line tool)
Downloads pmorton$ zipinfo repaired.xlsx
Archive: repaired.xlsx 5819 bytes 15 files
-rw-r--r-- 3.0 unx 998 tx defN 25-Jun-12 17:34 [Content_Types].xml
drwxr-xr-x 3.0 unx 0 bx stor 26-Jun-12 08:37 _rels/
-rw-r--r-- 3.0 unx 577 tx defN 25-Jun-12 17:34 _rels/.rels
drwxr-xr-x 3.0 unx 0 bx stor 26-Jun-12 08:37 docProps/
-rw-r--r-- 3.0 unx 226 tx defN 25-Jun-12 17:34 docProps/app.xml
-rw-r--r-- 3.0 unx 486 tx defN 25-Jun-12 17:34 docProps/core.xml
drwxr-xr-x 3.0 unx 0 bx stor 26-Jun-12 08:37 xl/
drwxr-xr-x 3.0 unx 0 bx stor 26-Jun-12 08:37 xl/_rels/
-rw-r--r-- 3.0 unx 407 tx defN 25-Jun-12 17:34 xl/_rels/workbook.xml.rels
-rw-r--r-- 3.0 unx 11203 tx defN 25-Jun-12 17:34 xl/styles.xml
-rw-r--r-- 3.0 unx 349 tx defN 25-Jun-12 17:34 xl/workbook.xml
drwxr-xr-x 3.0 unx 0 bx stor 26-Jun-12 08:37 xl/worksheets/
drwxr-xr-x 3.0 unx 0 bx stor 26-Jun-12 08:37 xl/worksheets/_rels/
-rw-r--r-- 3.0 unx 139 tx defN 25-Jun-12 17:34 xl/worksheets/_rels/sheet1.xml.rels
-rw-r--r-- 3.0 unx 3818 tx defN 25-Jun-12 17:34 xl/worksheets/sheet1.xml
15 files, 18203 bytes uncompressed, 3403 bytes compressed: 81.3%
Local Gems
*** LOCAL GEMS ***
actionmailer (3.2.6)
actionpack (3.2.6)
activemodel (3.2.6)
activerecord (3.2.6)
activerecord-sqlserver-adapter (3.2.4)
activeresource (3.2.6)
activesupport (3.2.6)
akami (1.1.0)
archive-tar-minitar (0.5.2)
arel (3.0.2)
aws-s3 (0.6.3)
axlsx (1.1.7)
bcrypt-ruby (3.0.1)
brakeman (1.6.2)
builder (3.0.0)
bundler (1.1.4)
cancan (1.6.7)
childprocess (0.3.2)
chunky_png (1.2.5)
coderay (1.0.6)
coffee-rails (3.2.2)
coffee-script (2.2.0)
coffee-script-source (1.3.3)
columnize (0.3.6)
compass (0.12.1)
compass-rails (1.0.2)
daemons (1.1.8)
debugger (1.1.4)
debugger-linecache (1.1.1)
debugger-ruby_core_source (1.1.3)
devise (2.1.0)
erubis (2.7.0)
eventmachine (0.12.10)
execjs (1.4.0)
fastercsv (1.5.5)
ffaker (1.14.0)
ffi (1.0.11)
foreman (0.47.0)
fssm (0.2.9)
guid (0.1.1)
gyoku (0.4.5)
haml (3.1.6)
highline (1.6.12)
hike (1.2.1)
httparty (0.8.3)
httpi (1.0.0)
i18n (0.6.0)
journey (1.0.3)
jquery-rails (2.0.2)
js-routes (0.8.1)
json (1.5.4)
json_pure (1.7.3)
kgio (2.7.4)
litmus (0.3.0)
litmus_mailer (0.1.0)
log4r (1.1.10)
lograge (0.0.4)
mail (2.4.4)
mail_view (1.0.1)
mailcatcher (0.5.6)
method_source (0.7.1)
mime-types (1.18)
multi_json (1.3.6)
multi_xml (0.5.1)
net-scp (1.0.4)
net-ssh (2.2.2)
newrelic_rpm (3.3.5)
nokogiri (1.5.4)
nori (1.1.0)
orm_adapter (0.0.7)
polyglot (0.3.3)
populator (1.0.0)
postmark (0.9.10)
pry (0.9.9.6)
rack (1.4.1)
rack-cache (1.2)
rack-protection (1.2.0)
rack-ssl (1.3.2)
rack-test (0.6.1)
rails (3.2.6)
rails_config (0.3.0)
railties (3.2.6)
raindrops (0.9.0)
rake (0.9.2.2)
rdoc (3.12)
redis (2.2.2)
redis-namespace (1.0.3)
resque (1.20.0)
resque-scheduler (2.0.0)
resque_mailer (2.1.0)
ruby2ruby (1.3.1)
ruby_parser (2.3.1)
rubyzip (0.9.8)
rufus-scheduler (2.0.16)
sass (3.1.19)
sass-rails (3.2.5)
savon (1.0.0)
sexp_processor (3.2.0)
simple_postmark (0.4.3)
sinatra (1.3.2)
skinny (0.2.0)
slop (2.4.4)
sprite-factory (1.5.1)
sprockets (2.1.3)
sqlite3 (1.3.6)
terminal-table (1.4.5)
thin (1.3.1)
thor (0.15.2)
tilt (1.3.3)
tiny_tds (0.5.1)
treetop (1.4.10)
tzinfo (0.3.33)
uglifier (1.2.4)
unicorn (4.3.1)
vagrant (1.0.3)
vegas (0.1.11)
warden (1.1.1)
wasabi (2.4.0)
xml-simple (1.1.1)
yaml_db (0.2.3)
After a bundle install --deplyoment, I greped the vendor gems for any definfition of module Zip. All definitions seem to be in rubyzip
grep "module Zip" * -r
ruby/1.9.1/gems/rubyzip-0.9.8/samples/zipfind.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/samples/zipfind.rb: module ZipFind
ruby/1.9.1/gems/rubyzip-0.9.8/samples/zipfind.rb: module ZipFindConsoleRunner
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/inflater.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/null_compressor.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/pass_thru_decompressor.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/compressor.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/null_input_stream.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_entry_set.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_central_directory.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_output_stream.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/null_decompressor.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/pass_thru_compressor.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zipfilesystem.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zipfilesystem.rb: module ZipFileSystem
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_entry.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/deflater.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_file.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/decompressor.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_extra_field.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/constants.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_streamable_stream.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_streamable_directory.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/zip_input_stream.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/settings.rb:module Zip
ruby/1.9.1/gems/rubyzip-0.9.8/lib/zip/dos_time.rb:module Zip
Thinking you may be on to something here. https://github.com/aussiegeek/rubyzip/issues/15
Ill have a look and see if there is someway to create the directory entries as well.
In the mean time, can you have a go at the last bit of that gist that is basically just repackaging the archive with directories added.
zf = Zip::ZipFile.new("5entry.zip")
buffer = Zip::ZipOutputStream.write_buffer do |out|
zf.entries.each do |e|
if e.ftype == :directory
out.put_next_entry(e.name)
else
out.put_next_entry(e.name)
out.write e.get_input_stream.read
end
end
end
File.open("6entry.zip", "wb") {|f| f.write(buffer.string) }
Locally, I've updated the serialization to sort and add directories in the zip file. (zipinfo output below)
Unfortunately, while it works fine before adding this is, it now spits out repair warnings. I'd be interested to hear if this breaks for me, but works for you. Can I ask you to fork master and make the following changes to package.rb
# Writes the package parts to a zip archive.
# @param [Zip::ZipOutputStream] zip
# @return [Zip::ZipOutputStream]
def write_parts(zip)
p = parts.sort { |a, b| a[:entry] <=> b[:entry] }
p.each do |part|
unless part[:doc].nil?
dir = File.dirname(part[:entry])
zip.put_next_entry("#{dir}/") unless dir == '.'
zip.put_next_entry(part[:entry])
entry = ['1.9.2', '1.9.3'].include?(RUBY_VERSION) ? part[:doc].force_encoding('BINARY') : part[:doc]
zip.puts(entry)
end
unless part[:path].nil?
zip.put_next_entry(part[:entry]);
# binread for 1.9.3
zip.write IO.respond_to?(:binread) ? IO.binread(part[:path]) : IO.read(part[:path])
end
end
zip
end
This is the output you should see from zipinfo after those changes. I would push this to master, but a few people out there are linking directly to git hub and I dont want to break their working apps.
Morgans-MacBook-Air:axlsx randym$ zipinfo test.xlsx
Archive: test.xlsx 3897 bytes 15 files
-rw-r--r-- 5.2 unx 998 t- defN 29-Jun-12 08:50 [Content_Types].xml
drwxr-xr-x 5.2 unx 0 t- defN 29-Jun-12 08:50 _rels/
-rw-r--r-- 5.2 unx 577 t- defN 29-Jun-12 08:50 _rels/.rels
drwxr-xr-x 5.2 unx 0 t- defN 29-Jun-12 08:50 docProps/
-rw-r--r-- 5.2 unx 226 t- defN 29-Jun-12 08:50 docProps/app.xml
-rw-r--r-- 5.2 unx 486 t- defN 29-Jun-12 08:50 docProps/core.xml
drwxr-xr-x 5.2 unx 0 t- defN 29-Jun-12 08:50 xl/_rels/
-rw-r--r-- 5.2 unx 407 t- defN 29-Jun-12 08:50 xl/_rels/workbook.xml.rels
drwxr-xr-x 5.2 unx 0 t- defN 29-Jun-12 08:50 xl/
-rw-r--r-- 5.2 unx 1295 t- defN 29-Jun-12 08:50 xl/styles.xml
-rw-r--r-- 5.2 unx 326 t- defN 29-Jun-12 08:50 xl/workbook.xml
drwxr-xr-x 5.2 unx 0 t- defN 29-Jun-12 08:50 xl/worksheets/_rels/
-rw-r--r-- 5.2 unx 139 t- defN 29-Jun-12 08:50 xl/worksheets/_rels/sheet1.xml.rels
drwxr-xr-x 5.2 unx 0 t- defN 29-Jun-12 08:50 xl/worksheets/
-rw-r--r-- 5.2 unx 430 t- defN 29-Jun-12 08:50 xl/worksheets/sheet1.xml
15 files, 4884 bytes uncompressed, 2185 bytes compressed: 55.3%
just noticed that xl/ is coming after xl/_rels/ in that list...
Randy - I think that we found the source of our issue. I still cannot explain the difference in the zipinfo. However we did find the cause of a couple of issues that we where having.
send_file temp.path, :filename => "#{filename}.xlsx", :type => "application/vnd.ms-excel"
``` This cause firefox to append and xls to the file which in turn cause excel to open the document as a binary excel file.
2. We used the example provided [here](http://axlsx.blogspot.com/2011_12_01_archive.html). We then found that the documents were always wanting to be repairs. After much diagnostics, I went last ditch and opened the file in a hex editor and found that "HTTP/500 Internal Server Error" was appended to the end of the zip file. As it would turn out, when used with a unicorn web server, the temp.unlink is causing the file to be cleaned up before the file can be fully sent (possibly the cause of the difference in the zipinfo results?). Unlinking prior to send_file completion makes no sense since the call is a blocking call.
What we did to resolve the issue was to not call unlink and let Tempfile take care of the cleanup when the tempfile falls out of scope. (Not sure if this is 1.9.1 only)
```ruby
begin
temp = Tempfile.new('tmpfile.xlsx')
package.serialize temp.path
temp.close
send_file temp.path, :filename => "#{filename}.xlsx", :type => 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
ensure
temp.close
end
So the long and short is I am not sure if we are chasing unicorns or if my above explanation really explains this issue.
I think you nailed it!
I was very surprised that excel would care about the directory listings in the zip file. (the spec is pretty clear in terms of what needs to be zipped.)
If you are interested in using temp files, and you have not done this already, you might want to have a look at changing
config.action_dispatch.x_sendfile_header = X-SendFile
to
changing config.action_dispatch.x_sendfile_header = X-Accel-Redirect
in production.rb
If my understanding of send_file in 3+ is correct, there is no guarantee that Tempfile won't remove that file before the streaming is complete even without the unlink call.
There is one more alternative that is now available in the axlsx library that lets you stream a StringIO object, which will also resolve your issue with temp files:
# Serialize your workbook to a StringIO instance
# @param [Boolean] confirm_valid Validate the package prior to serialization.
# @return [StringIO|Boolean] False if confirm_valid and validation errors exist. rewound string IO if not.
def to_stream(confirm_valid=false)
return false unless !confirm_valid || self.validate.empty?
zip = write_parts(Zip::ZipOutputStream.new("streamed", true))
stream = zip.close_buffer
stream.rewind
stream
end
The obvious downside with both TempFile and streaming here is that the request blocks so if you have a large number of users hitting your system, this can seriously impact your application's performance.
If you can manage the security/authentication and hosting it is always a better idea to save the file locally and use a background job or daemon to move the file off to a different server dedicated to hosting the files. As that server is delivering a hard asset, you wont end up with an empty file, and for large sheets, your application wont be blocking.
Hi to everyone who came from Google, like me, in the search for "Ошибка в части содержимого" or "the content cannot be read”. My workaround is cropping column values. Our testers filled so much text in a field that, logically, could not be so long (car number).
Hope this helps someone
Greetings:
Thank you for the excellent gem.
I am creating a spreadsheet using Axlsx. I am validing the contents and then serializing:
Everything appears to validate well. When I go to load the preadsheet under Excel 2010/2007 Windows 7/XP (I tried all variations) I get the error: "Excel found unreadable content in 'XXXXX.xlsx'. Do you want to recover the contents of this workbook? If you trust the source of this workbook, click Yes.". I click Yes and I get a popup with the following message: "Excel completed file level validation and repair. Some parts of this workbook may have been repaired or discarded.". I then view the log and these are the contents: `<?xml version="1.0" encoding="UTF-8" standalone="true"?> -
-
`
That was the entire contents of the log.
After I closed the popup. I looked at the spreadsheet and everything appeared correct. So the only issue is the warning. By the way OpenOffice has no trouble opening the same file.
In searching for a solution I came across issue #1 which seemed almost to match. You mentioned rubyzip. That got me to thinking so I tried something crazy. I copied the output .xlsx file and renamed it .zip. I then copied out the contents. I created a zip archive (in windows explorer) and copied the contents into it. I made no other changes. I renamed the archive to .xlsx and low and behold the file opened in Excel with no errors. So I'm thinking there may be an issue with the zip archive somehow.
Here is the information on the development platform: Platform: Ubuntu 11.10 (oneiric) Ruby: 1.9.2p290 Rails: 3.2.3 Axlsx: 1.1.1 (Also tested with 1.1.0 same issue). Rubyzip: 0.9.7
I can provide the files if you need them. I did't add them by default as they contain sensitive information.
Any help would be appreciated.