Closed alexey-yanchenko closed 7 years ago
May be..
xlsx = Xlsxtream::Workbook.new(response.stream)
While xlsxtream is streaming in the sense that it writes out data as it is generated and never keeps all content in memory, the underlying IO still needs to be seekable, because the XLSX file format uses ZIP, which requires seeking to write the header information.
It is technically possible to write non-standard ZIP files that work around this issues by omitting certain information in the headers, but it is likely that Excel and other tools won't read the files or say that they are corrupt and need to be repaired.
The exact problem with strictly forward seeking is that each file in the ZIP is preceded by a header that contains the size and checksum for the file.
Because we generate data on the fly, we do not know the size or checksum ahead of time, so this info needs to be written by seeking back in the IO after the file is finished.
For now the best way to work around the issue is to write to a Tempfile and then send that file to the client as soon as all the data has been written (xlsx.close
takes care of writing the ZIP central directory).
Note that there's some discussion over at WeTransfer/zip_tricks#21 for implementing an alternate xlsxtream ZIP IO which could do true streaming by writing incomplete ZIP entry headers.
@alexey-yanchenko Please try out xlsxstream 2.0.0, it switches to zip_tricks fro ZIP compression and supports streaming to non-seekable IO-like objects.
You don't need to inherit from StringIO
, the following MyIO
should suffice:
class MyIO
def initialize(st)
@st = st
end
def <<(data)
@st.write(data)
end
def close
@st.close
end
end
Looks great :fire:
@felixbuenemann I tested xlsxstream 2.0.0 and it works as expected. Downloaded file can be normaly opened in MS Excel and LibreOffice. Thank you very much for done work :+1:
Yes, zip_tricks uses a zip feature to write the crc and size after instead of in front of the file which I did not know about, so it should work fine.
Just to be sure: is it really streaming in the mentioned example?
Wouldn't a
response.headers['Transfer-Encoding'] = 'chunked'
be required as well because it would be generated on the fly but delivered at once anyway?
Hello! Thank you, this works great :)
How do I change the filename downloading it this way?
@bceppi try to set the header at your controller like this:
response.headers['Content-Disposition'] = %(attachment; filename="some-file-name.csv")
For those that don't want to define a whole new class to alias 1 method, you can do this:
stream = response.stream
stream.define_singleton_method(:<<) { |value| write(value) }
Read error: #<Zlib::BufError: buffer error>
I am getting this error when I copy and paste OP's top comment. Any ideas what I am doing wrong?
@KevinColemanInc
*** Exception Zlib::BufError in Rack response body object (buffer error)
[...]
Disconnecting client with error: error parsing app response chunked encoding: unexpected end-of-stream
This error happens when Rack::Deflater
is present in the middleware. I can only think (without going deeper into the library) that streaming xlsx file conflicts with this middleware (probably because xlsx itself is a zip file). Maybe unexpected end-of-stream
has something to do with this.
I suggest that you should remove the Rack::Deflater
middleware and delegate the deflater to the webserver in production like Nginx or Apache.
@KevinColemanInc @Looooong Have you tried setting response.headers['Content-Encoding'] = 'identity'
to keep Rack::Deflater
from trying to compress the response?
See: https://github.com/rack/rack/blob/master/lib/rack/deflater.rb#L111
@felixbuenemann I am not actively working on a project that uses this gem, so unfortunately I don't have the time to investigate this on my own :-/. This was from so long ago, I think the fix from Looooong worked for me.
I just checked myself, it works if you set:
response.headers['Cache-Control'] = 'no-transform'
The encoding check in Rack::Deflater
is negated, so it checks if Content-Encoding
is not identity, so we can't use that.
Btw. I debugged the problem and I think it's a bug in Rack::Deflater
when a part of the response body enumerable returns an empty string:
gzip = ::Zlib::GzipWriter.new(StringIO.new)
=> #<Zlib::GzipWriter:0x00000000092bf670>
gzip.write ""
=> 0
gzip.flush
Zlib::BufError: buffer error
So they should make sure they write at least 1 byte before trying to flush.
The reason the body part might be empty is that Zlib controls the flushing of the compressed data, as it sees fit to get a good balance between compression and memory usage (the default Zlib window size is 32 KB).
There's an existing issue https://github.com/rack/rack/issues/959 for the problem and I've just proposed a fix in https://github.com/rack/rack/pull/1388.
I have a Rails app. Users have to be able to download large xlsx files. I'm trying to use xlsxtream gem with ActionController::Live.
Here is my controller's action:
Here is MyIO implementation:
I get an error:
File can be downloaded but it's corrupted. Is it possible to use this gem this way?