asciidoctor / asciidoctor-epub3

Asciidoctor EPUB3 is a set of Asciidoctor extensions for converting AsciiDoc to EPUB3
https://asciidoctor.org
MIT License

minimising output file size #80

Closed ParaplegicRacehorse closed 1 year ago

ParaplegicRacehorse commented 7 years ago

Can the completed XHTML and CSS be passed through minify/tidy procedures before creating the archive? Likewise, can fonts be subset instead of embedding the whole font files? Images could be optimized, too. EPUB and KF8 are already small compared to PDF, but it's somewhat impolite to include more than what's needed in the archive.

Should be pretty easy to achieve?

for each (stylesdir/*.css)
  cssnano
for each (*.xhtml)
  html-minifier
for each (images/*.png)
  optipng
for each (images/*.jpg)
  imagemagick yada yada

mojavelinux commented 7 years ago

kindlegen actually does this already, afaik. So this would be more for the native EPUB output. Does seem like a good idea, though.

slonopotamus commented 4 years ago

Unlike web stuff, EPUB3 is a compressed (ZIP-based) format, so the gains from minification are not going to be that great. We also pay a higher penalty for using non-Ruby software, because we cannot install it automatically and would have to ask each user to install it on their own. I'd say that current book sizes are "good enough".
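To illustrate why minifying before archiving tends to pay off less than expected, here is a minimal sketch using Ruby's standard `zlib`. The sample markup, the `deflated_size` helper and the naive regex "minifier" are all made up for illustration, and the highly repetitive input exaggerates the effect, so treat this as a rough demonstration rather than a benchmark.

```ruby
require 'zlib'

# Hypothetical sample: an indented XHTML fragment repeated to mimic a chapter.
SNIPPET = <<~XHTML
  <div class="paragraph">
      <p>
          Some text in a paragraph.
      </p>
  </div>
XHTML
PRETTY = SNIPPET * 200

# Naive whitespace removal around tags (illustration only, not safe for <pre>).
MINIFIED = PRETTY.gsub(/>\s+/, '>').gsub(/\s+</, '<')

# Size in bytes after DEFLATE, the compression method commonly used in ZIP archives.
def deflated_size(text)
  Zlib::Deflate.deflate(text, Zlib::BEST_COMPRESSION).bytesize
end

RAW_SAVED = PRETTY.bytesize - MINIFIED.bytesize
ZIP_SAVED = deflated_size(PRETTY) - deflated_size(MINIFIED)

puts "saved before compression: #{RAW_SAVED} bytes"
puts "saved after compression:  #{ZIP_SAVED} bytes"
```

Indentation and newlines are very repetitive, so the compressor already encodes them cheaply, and most of the bytes saved by minification disappear once the archive is built.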

slonopotamus commented 1 year ago

Starting with c39d2159a8c329e2e2179a12c6cd7f4b39898f74, we output CSS files in compressed form (no comments or redundant newlines).
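For readers wondering what "compressed form" amounts to, here is a rough sketch of stripping comments and redundant newlines from CSS in plain Ruby. This is not what the commit above does (its implementation is not shown in this thread); `strip_css` is a hypothetical helper, and it assumes no `/* ... */` sequences appear inside CSS string literals.

```ruby
# Hypothetical helper: remove comments and blank lines, then join the
# remaining lines with single spaces so multi-line values stay valid.
def strip_css(css)
  css
    .gsub(%r{/\*.*?\*/}m, '') # drop /* ... */ comments, including multi-line ones
    .lines
    .map(&:strip)             # remove indentation and trailing newlines
    .reject(&:empty?)         # drop lines that are now empty
    .join(' ')
end

puts strip_css("/* c */\nbody {\n  margin: 0;\n}\n\n")
```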

slonopotamus commented 1 year ago

I've tried minimizing HTML using


      def postprocess_xhtml content
        if @format == :kf8
          # TODO: convert regular expressions to constants
          content = content
            .gsub(/<img([^>]+) style="width: (\d\d)%;"/, '<img\1 style="width: \2%; height: \2%;"')
            .gsub(/<script type="text\/javascript">.*?<\/script>\n?/m, '')
        end

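        doc = Nokogiri::XML.parse(content)
        # drop all XML comments
        doc.search('//comment()').remove
        # drop every whitespace-only text node, wherever it occurs in the tree
        doc.search('//text()[normalize-space()=""]').remove
        # serialize as XML without requesting extra formatting
        result = doc.to_xml(save_with: Nokogiri::XML::Node::SaveOptions::AS_XML)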
        result.to_ios
      end

But... that only reduced the total size of the git-as-svn doc book from 707KB to 706KB, and it broke <pre> formatting. I don't think it is worth the effort and CPU cycles.
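One plausible cause of the <pre> breakage is that whitespace-only text is removed everywhere, including inside preformatted blocks. A minimal sketch of leaving <pre> content alone is shown below. It uses plain string handling instead of Nokogiri so it runs with only the standard library; `PRE_BLOCK` and `collapse_outside_pre` are hypothetical names, and like the XPath approach it still drops whitespace-only text between inline elements.

```ruby
# Matches a whole <pre>...</pre> block; the capture group makes String#split
# keep these blocks in its result instead of discarding them.
PRE_BLOCK = %r{(<pre\b.*?</pre>)}mi

# Hypothetical helper: remove whitespace between tags, except inside <pre>.
def collapse_outside_pre(xhtml)
  xhtml.split(PRE_BLOCK).map do |piece|
    next piece if piece.match?(/\A<pre\b/i) # keep preformatted blocks verbatim

    piece
      .gsub(/>\s+</, '><')  # whitespace between two tags
      .sub(/\A\s+</, '<')   # whitespace right after a preceding <pre> block
      .sub(/>\s+\z/, '>')   # whitespace right before a following <pre> block
  end.join
end

puts collapse_outside_pre("<div>\n  <p>a</p>\n  <pre>x\n\n  y</pre>\n</div>")
```

Even with this safeguard the size reduction would presumably stay in the same range as measured above, so this addresses the formatting breakage only, not the cost/benefit argument.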

slonopotamus commented 1 year ago

I'd like to claim this issue is resolved.