Closed ParaplegicRacehorse closed 1 year ago
kindlegen actually does this already, afaik. So this would be more for the native EPUB output. Does seem like a good idea, though.
Unlike web content, EPUB3 is already a compressed (ZIP) format, so the gains from minification are not going to be that great. There is also a higher cost to using non-Ruby software: we cannot install it automatically and would have to ask each user to install it themselves. I'd say that current book sizes are "good enough".
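To illustrate why minification buys little once the archive is deflated, here is a rough sketch using Ruby's bundled `zlib`. The sample markup and the crude whitespace stripping are made up for the demonstration; this is not what asciidoctor-epub3 does, and real books will give different numbers.

```ruby
require 'zlib'

# Hypothetical sample: 200 indented XHTML paragraphs.
pretty = (1..200).map { |i| "    <p>Paragraph number #{i}</p>\n" }.join

# Crude "minification" for illustration only: drop indentation and newlines.
minified = pretty.gsub(/^\s+|\n/, '')

# Size of the text after DEFLATE, the compression used inside ZIP/EPUB.
def deflated_size(text)
  Zlib::Deflate.deflate(text, Zlib::BEST_COMPRESSION).bytesize
end

raw_saving = pretty.bytesize - minified.bytesize
compressed_saving = deflated_size(pretty) - deflated_size(minified)

puts "saved before compression: #{raw_saving} bytes"
puts "saved after compression:  #{compressed_saving} bytes"
```

Repeated indentation and newlines are exactly what DEFLATE encodes cheaply, so most of the raw saving disappears after compression.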
Starting with c39d2159a8c329e2e2179a12c6cd7f4b39898f74, we output CSS files in compressed form (no comments or redundant newlines).
I've tried minifying the XHTML using:
def postprocess_xhtml content
  if @format == :kf8
    # TODO: convert regular expressions to constants
    content = content
      .gsub(/<img([^>]+) style="width: (\d\d)%;"/, '<img\1 style="width: \2%; height: \2%;"')
      .gsub(/<script type="text\/javascript">.*?<\/script>\n?/m, '')
  end
  doc = Nokogiri::XML.parse(content)
  # Drop all comments.
  doc.search('//comment()').remove
  # Drop every whitespace-only text node, wherever it occurs.
  doc.search('//text()[normalize-space()=""]').remove
  result = doc.to_xml(save_with: Nokogiri::XML::Node::SaveOptions::AS_XML)
  result.to_ios
end
But that only reduced the total size of the git-as-svn documentation book from 707 KB to 706 KB, and it broke <pre> formatting. I don't think it is worth the effort and CPU cycles.
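One likely cause of the <pre> breakage (an assumption, not verified against the project) is that whitespace-only text nodes inside <pre> are significant, for example the newlines between highlighted spans, and the XPath above removes them too. A minimal sketch of a whitespace strip that leaves <pre> alone is shown here with REXML, which is bundled with Ruby, so that it runs without extra gems. The helper name `strip_insignificant_nodes` and the `PRESERVE_WHITESPACE` list are hypothetical and not part of asciidoctor-epub3.

```ruby
require 'rexml/document'

# Elements whose content must be kept byte-for-byte (hypothetical list).
PRESERVE_WHITESPACE = %w[pre].freeze

# Recursively remove comments and whitespace-only text nodes,
# but do not descend into whitespace-sensitive elements.
def strip_insignificant_nodes(element)
  return if PRESERVE_WHITESPACE.include?(element.name)

  # to_a returns a copy, so removing children while iterating is safe.
  element.to_a.each do |child|
    case child
    when REXML::Element then strip_insignificant_nodes(child)
    when REXML::Comment then child.remove
    when REXML::Text    then child.remove if child.value.strip.empty?
    end
  end
end

xhtml = "<body>\n  <p>hi</p>\n  <!-- note -->\n  <pre><span>a</span>\n<span>b</span></pre>\n</body>"
doc = REXML::Document.new(xhtml)
strip_insignificant_nodes(doc.root)
puts doc.to_s
```

Even with this guard, removing whitespace-only nodes between inline elements (such as a space between two adjacent <em> tags) can still change how text renders, so the approach stays lossy.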
I'd like to claim this issue is resolved.
Can the completed XHTML and CSS be passed through minify/tidy procedures before creating the archive? Likewise, can fonts be subset instead of embedded whole? Images could be optimized, too. EPUB and KF8 are already small compared to PDF, but it's somewhat impolite to include more than what's needed in the archive.
Should be pretty easy to achieve?