wpsharks / html-compressor

HTML Compressor. Combines and compresses CSS/JS/HTML code.
https://websharks.github.io/html-compressor/
GNU General Public License v3.0

HTML-Compressor

Combines & compresses CSS/JS/HTML code.

<?php
require_once 'html-compressor.phar';
$html_compressor = new WebSharks\HtmlCompressor\Core();
ob_start(array($html_compressor, 'compress'));

Installation Instructions (Two Options)

  1. As a Composer Dependency

    {
      "require": {
          "websharks/html-compressor": "dev-master"
      }
    }
  2. Or, Download the PHAR Binary

    See: https://github.com/websharks/html-compressor/releases

Where do I get the PHAR file?

A PHAR binary is made available for each official release. See: releases.

Why did we create the HTML Compressor?

The HTML Compressor class was developed because all of us here at WebSharks™ are growing tired of seeing WordPress installations in the wild that run many different plugins, where each plugin may add a new set of CSS/JS files. This creates a slow-loading site, even if a page caching plugin is active.

For example, if you look at the HTML source code for most sites powered by a publishing platform such as WordPress (or audit one in a web developer console), you will find a complete mess like this...

<link rel="stylesheet" href="theme.css" type="text/css" />
<link rel="stylesheet" href="child-theme.css" type="text/css" />
<link rel="stylesheet" href="theme-variation.css" type="text/css" />
<link rel="stylesheet" href="plugin1.css" type="text/css" />
<link rel="stylesheet" href="plugin2.css" type="text/css" />
<link rel="stylesheet" href="plugin3.css" type="text/css" />
<link rel="stylesheet" href="plugin4.css" type="text/css" />
<link rel="stylesheet" href="plugin5.css" type="text/css" />
... and on, and on, and on ...

<script type="text/javascript" src="jquery.js"></script>
<script type="text/javascript" src="jquery-migrate.min.js"></script>
<script type="text/javascript" src="tabs.min.js"></script>
<script type="text/javascript" src="custom.js"></script>
<script type="text/javascript" src="plugin1.js"></script>
<script type="text/javascript" src="plugin2.js"></script>
<script type="text/javascript" src="plugin3.js"></script>
<script type="text/javascript" src="plugin4.js"></script>
<script type="text/javascript" src="plugin5.js"></script>
... and on, and on, and on ...

↑ The Problem Here?

Instead of fetching a single CSS and/or JS file (i.e., one or two HTTP connections), the browser needs to make several requests and download each of these resources separately. This problem is not limited to WordPress; we see it across many publishing platforms where plugins are brought into the mix.
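
To get a rough sense of how many separate requests a page triggers, you could count its external stylesheet and script tags. The following is only a minimal sketch: `count_external_resources()` is a hypothetical helper that is not part of the HTML Compressor, and matching tags with regular expressions is a simplification that ignores edge cases such as commented-out markup.

```php
<?php
/**
 * Hypothetical helper (not part of the HTML Compressor API).
 * Counts external stylesheets and scripts referenced by an HTML string.
 */
function count_external_resources(string $html): array
{
    // <link> tags that declare rel="stylesheet".
    $css = preg_match_all('/<link\b[^>]*\brel=["\']stylesheet["\'][^>]*>/i', $html);

    // <script> tags that carry a src attribute, i.e., not inline snippets.
    $js = preg_match_all('/<script\b[^>]*\bsrc=["\'][^"\']+["\'][^>]*>/i', $html);

    return array('css' => $css, 'js' => $js, 'total' => $css + $js);
}

$html = '<link rel="stylesheet" href="theme.css" type="text/css" />'
      . '<link rel="stylesheet" href="plugin1.css" type="text/css" />'
      . '<script type="text/javascript" src="jquery.js"></script>'
      . '<script type="text/javascript">var inline = true;</script>';

print_r(count_external_resources($html));
```

Each counted tag is one more resource the browser must request separately; the inline script in the sample is not counted because it has no `src` attribute.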

Ideally, your publishing platform (or theme) would minimize the number of external resources that it depends on by consolidating those resources (i.e., CSS/JS files) into just one or two files, and then compressing them too. However, not all themes do this. In fact, this is not always possible (even when a theme/plugin developer is aware of the issue).

For instance, if a theme/plugin developer is working within a set of PHP framework standards (e.g., doing things "the WordPress way"), the end result may not always be optimized in an ideal fashion. We know first-hand that this really bugs developers. Experienced developers don't create a mess by choice; it's the way the framework pulls everything together that can sometimes produce one. Things can really get crazy when a site owner adds plugins to the mix later and the publishing platform (or theme) is supplemented by plugin-specific CSS/JS files; e.g., a new CSS and/or JS file for each plugin.

Solution, the WebSharks™ HTML Compressor!

The WebSharks™ HTML Compressor works as an additional layer of functionality that can come in after your publishing platform pieces everything together. It analyzes each page of your site in real time (i.e., as it's being loaded), inspecting each line of HTML code.

CSS/JS files are combined (where possible) and compressed (where possible); then it can optimize the HTML code and any inline JavaScript/CSS too. The goal is to speed things up for your visitors and to reduce the number of HTTP connections that your server processes.

Step-by-Step (Detailed Explanation)

All of these compression options are enabled by default, but you can modify this behavior as you see fit. Toward the bottom of this file you will find a list of all possible configuration options.

1. The HTML Compressor starts by inspecting the <head> and <body> of the HTML document. An attempt is made to recursively combine all CSS resources (including inline styles, and all remote resources too) into a single CSS file. If compress_css_code is enabled (on by default), the code in this single file is also compressed (i.e., extra whitespace is removed, hex color codes are optimized, etc, etc).

A few NOTES regarding step 1.

2. Next, we inspect the <head> of the HTML document. An attempt is made to combine all JS resources in the <head> into a single JS file. If compress_js_code is enabled (on by default), the code in this single file is also compressed (i.e., extra whitespace is removed, variable names are optimized, etc, etc).

3. Next, we inspect a special area of the source code that can be flagged for compression by wrapping a section with <!--footer-scripts--><!--footer-scripts-->. This flagging is only necessary if you have scripts that you intentionally place in the footer. If the HTML Compressor finds a <!--footer-scripts--><!--footer-scripts--> section, an attempt is made to combine all JS resources in that section into a single JS file. If compress_js_code is enabled (on by default), the code in this single file is also compressed (i.e., extra whitespace is removed, variable names are optimized, etc.).

A few NOTES regarding steps 2 and 3.

4. Next, we look at the <body> for any inline <script> tags. While it is not possible to consolidate inline JS, if compress_inline_js_code is enabled (on by default) an attempt is made to compress the JavaScript code in these inline snippets to reduce the overhead they might add.

5. Last, we compress the HTML code itself (i.e., extra whitespace is removed). Care is taken to preserve special tags where raw formatting is important, but you should end up with a much smaller HTML file, and the external resources it depends on will certainly have been reduced to a bare minimum.
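
To make step 3 more concrete, here is a minimal sketch of how a section wrapped in a pair of `<!--footer-scripts-->` markers could be located and its external scripts listed. The function `footer_script_urls()` is hypothetical; it only illustrates the idea of a flagged section and is not how the HTML Compressor itself is implemented.

```php
<?php
/**
 * Hypothetical helper (not part of the HTML Compressor API).
 * Returns the src URLs of external scripts found between a pair of
 * <!--footer-scripts--> markers, or an empty array if no pair exists.
 */
function footer_script_urls(string $html): array
{
    $marker = '<!--footer-scripts-->';

    // The same marker opens and closes the section, so look for two of them.
    $start = strpos($html, $marker);
    if ($start === false) {
        return array();
    }
    $start += strlen($marker);
    $end = strpos($html, $marker, $start);
    if ($end === false) {
        return array();
    }
    $section = substr($html, $start, $end - $start);

    // Collect src attributes; inline scripts have none and are skipped.
    preg_match_all('/<script\b[^>]*\bsrc=["\']([^"\']+)["\']/i', $section, $m);

    return $m[1];
}

$html = '<body><p>Content</p>'
      . '<!--footer-scripts-->'
      . '<script src="tabs.min.js"></script>'
      . '<script src="custom.js"></script>'
      . '<!--footer-scripts-->'
      . '<script src="outside.js"></script>'
      . '</body></html>';

print_r(footer_script_urls($html));
```

Only the scripts inside the flagged section are returned; the script placed after the second marker is left alone.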


Some Usage Examples

1. HTML Compressor As An Output Buffer

This code snippet should be processed BEFORE any other output occurs.

<?php
require_once 'html-compressor.phar';
$html_compressor = new WebSharks\HtmlCompressor\Core();
ob_start(array($html_compressor, 'compress'));

TIP: The php.ini directive auto_prepend_file is a nice/clean way to integrate the HTML Compressor. You could create a file as seen in this example and specify that as the auto_prepend_file to enable compression of every HTML file that you serve, noting that the HTML Compressor will simply pass any non-HTML code through its buffer without compressing it. The HTML Compressor only attempts to compress data which contains a closing </html> tag.
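
The pass-through rule can be illustrated with a small stand-in callback. This is a sketch under stated assumptions: `passthrough_or_compress()` is hypothetical and is not the library's `compress()` method, and the "compression" it performs (dropping whitespace between adjacent tags) is only a placeholder for the real work.

```php
<?php
/**
 * Hypothetical stand-in for an output-buffer callback.
 * It is NOT the library's compress() method; it only mimics the
 * pass-through rule: data without a closing </html> tag is untouched.
 */
function passthrough_or_compress(string $buffer): string
{
    // No closing </html> tag: treat as non-HTML and return it unchanged.
    if (stripos($buffer, '</html>') === false) {
        return $buffer;
    }
    // Placeholder "compression": drop whitespace between adjacent tags.
    return preg_replace('/>\s+</', '><', $buffer);
}

ob_start('passthrough_or_compress');
echo "<html>\n  <body>\n    <p>Hello</p>\n  </body>\n</html>";
ob_end_flush();
```

Because the check happens inside the callback, the same prepended file can safely sit in front of scripts that emit JSON, images, or other non-HTML output.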

IMPORTANT NOTE: One thing to keep in mind is that the WebSharks™ HTML Compressor works best when it's integrated with a page caching plugin like ZenCache for WordPress, or another page caching plugin that you might prefer. Why use a page caching plugin? The HTML Compressor can be used on any site powered by PHP, but ideally you would cache the optimized HTML that it outputs, thereby removing the need for the HTML Compressor to analyze every single request. Of course, you can analyze every single request if you want to (and the HTML Compressor has a cache of its own to help keep things sane), but it's always better to store (cache) the compressed HTML output by this class. This will reduce server load and make your site even faster.
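
The idea of storing the compressed output can be sketched with a very small file-based cache. Everything here is an assumption for illustration: `cached_page()` is a hypothetical function standing in for a real page caching plugin, the cache key is simply an MD5 hash of the URL, and the closure stands in for the expensive compression step.

```php
<?php
/**
 * Hypothetical page cache (a stand-in for a real page caching plugin).
 * Stores the result of $produce() on disk, keyed by URL, and reuses it
 * until $max_age seconds have passed.
 */
function cached_page(string $url, callable $produce, string $dir, int $max_age = 3600): string
{
    $file = rtrim($dir, '/') . '/' . md5($url) . '.html';

    // Serve the stored copy if it exists and is still fresh.
    if (is_file($file) && (time() - filemtime($file)) < $max_age) {
        return file_get_contents($file);
    }
    // Otherwise do the expensive work once and store the result.
    $html = $produce();
    file_put_contents($file, $html, LOCK_EX);

    return $html;
}

// Demo: a throwaway cache directory under the system temp directory.
$dir = sys_get_temp_dir() . '/htmlc-demo-' . uniqid();
mkdir($dir, 0700, true);

$calls = 0;
$produce = function () use (&$calls) {
    $calls++; // Counts how often the expensive step actually runs.
    return '<html><body>compressed</body></html>';
};

$first  = cached_page('/raw/file/test.html', $produce, $dir);
$second = cached_page('/raw/file/test.html', $produce, $dir);

echo $calls;
```

The second call is answered from disk, so the expensive step runs only once for the same URL while the stored copy is fresh.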

FAQ: Is html-compressor.phar the only file that I need? Yes. The other files that you see in the GitHub repo are already compressed into the PHAR file. The only file you need is the html-compressor.phar. A PHAR binary is made available for each official release. See: releases.

2. HTML Compressor as an Output Buffer (w/ Options)

This just demonstrates how to specify an array of options.

<?php
require_once 'html-compressor.phar';
$html_compressor_options = array(

    'css_exclusions' => array(),
    'js_exclusions' => array('.php?'),
    'uri_exclusions' => array(),

    'cache_expiration_time' => '14 days',
    'cache_dir_public' => '/var/www/public_html/htmlc/cache/public',
    'cache_dir_url_public' => 'http://example.com/htmlc/cache/public',
    'cache_dir_private' => '/var/www/public_html/htmlc/cache/private',

    'compress_combine_head_body_css' => TRUE,
    'compress_combine_head_js' => TRUE,
    'compress_combine_footer_js' => TRUE,
    'compress_inline_js_code' => TRUE,
    'compress_css_code' => TRUE,
    'compress_js_code' => TRUE,
    'compress_html_code' => TRUE,

    'benchmark' => FALSE,
    'product_title' => 'HTML Compressor',
    'vendor_css_prefixes' => array('moz','webkit','khtml','ms','o')
);
$html_compressor = new WebSharks\HtmlCompressor\Core($html_compressor_options);
ob_start(array($html_compressor, 'compress'));
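
As a rough illustration of what an exclusion list such as `'js_exclusions' => array('.php?')` is for, the sketch below treats each entry as a plain, case-insensitive fragment to look for in a URL or snippet. That matching rule is an assumption made for this example, and `is_excluded()` and the sample paths are hypothetical; the library's real matching rules may differ.

```php
<?php
/**
 * Hypothetical illustration only: assumes each exclusion is a plain,
 * case-insensitive fragment to look for. The library's real matching
 * rules may differ.
 */
function is_excluded(string $url_or_code, array $exclusions): bool
{
    foreach ($exclusions as $fragment) {
        // Skip empty fragments so they never match everything.
        if ($fragment !== '' && stripos($url_or_code, $fragment) !== false) {
            return true;
        }
    }
    return false;
}

$js_exclusions = array('.php?');

var_dump(is_excluded('/wp-content/plugins/demo/script.php?ver=1.2', $js_exclusions));
var_dump(is_excluded('/wp-content/plugins/demo/script.js?ver=1.2', $js_exclusions));
```

Under this assumption, a dynamically generated script served from a `.php?` URL would be left out of the combined file, while a static `.js` file would not.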

3. HTML Compressor on raw HTML Code

This demonstrates how to run the compressor against arbitrary HTML code.

<?php
require_once 'html-compressor.phar';
$html_compressor_options = array(

    'css_exclusions' => array(),
    'js_exclusions' => array('.php?'),
    'uri_exclusions' => array(),

    'cache_expiration_time' => '14 days',
    'cache_dir_public' => '/var/www/public_html/htmlc/cache/public',
    'cache_dir_url_public' => 'http://example.com/htmlc/cache/public',
    'cache_dir_private' => '/var/www/public_html/htmlc/cache/private',

    'current_url_scheme' => 'http',
    'current_url_host' => 'www.example.com',
    'current_url_uri' => '/raw/file/test.html?one=1&two=2',

    'compress_combine_head_body_css' => TRUE,
    'compress_combine_head_js' => TRUE,
    'compress_combine_footer_js' => TRUE,
    'compress_inline_js_code' => TRUE,
    'compress_css_code' => TRUE,
    'compress_js_code' => TRUE,
    'compress_html_code' => TRUE,

    'benchmark' => FALSE,
    'product_title' => 'HTML Compressor',
    'vendor_css_prefixes' => array('moz','webkit','khtml','ms','o')
);
$html = '<html> ... </html>';
$html_compressor = new WebSharks\HtmlCompressor\Core($html_compressor_options);
$html = $html_compressor->compress($html);
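
When compressing raw HTML outside of a normal request, the three `current_url_*` values have to come from somewhere. One possible approach, sketched here with PHP's built-in `parse_url()`, is to derive them from a single absolute URL. The helper `current_url_options()` is hypothetical and not part of the HTML Compressor; it only builds an array using the option keys shown in the example.

```php
<?php
/**
 * Hypothetical helper (not part of the HTML Compressor API).
 * Splits an absolute URL into the three current_url_* option values.
 */
function current_url_options(string $url): array
{
    $parts = parse_url($url);
    if ($parts === false || empty($parts['scheme']) || empty($parts['host'])) {
        throw new InvalidArgumentException('Expected an absolute URL: ' . $url);
    }
    // Path plus query string, matching the shape of the example value above.
    $uri = isset($parts['path']) ? $parts['path'] : '/';
    if (isset($parts['query'])) {
        $uri .= '?' . $parts['query'];
    }
    return array(
        'current_url_scheme' => $parts['scheme'],
        'current_url_host'   => $parts['host'],
        'current_url_uri'    => $uri,
    );
}

print_r(current_url_options('http://www.example.com/raw/file/test.html?one=1&two=2'));
```

The resulting array could be merged into the options array with `array_merge()` before it is passed to the constructor.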

Class Constructor Options

e.g., new WebSharks\HtmlCompressor\Core($options);

where $options is an associative array with one or more keys listed below.

Current List of All Possible Options

The following options allow you to exclude certain CSS/JS files and/or inline snippets.

NOTE: these options only apply if compression is enabled for CSS/JS files.



The following options can be used to setup custom cache directories/URLs.

NOTE: under most circumstances, the built-in default values will do just fine.


The following options can be used to specify the current URL.

NOTE: it is normally NOT necessary to supply any of these values.


The following options control compression behavior.

NOTE: compression routines are applied in the same order as these options are listed below.


Other misc. options. These don't really fall into any specific category yet.