Sniffles parses HTML pages and searches for common patterns that suggest a page uses a popular CMS, advertising platform, or CSS or JS library.
The master branch is continuously tested against Rubies 1.9.3, 2.0.0, and 2.1.0, thanks to the fantastic Travis-CI.
This library uses the term sniffer to refer to a pattern that determines whether a page is using a particular platform or library. A sniffer may also include methods to extract other metadata once a platform or library has been identified.
Sniffles should be considered a work in progress. Many of the matching patterns are little more than regular expressions matching commonly found "Powered by" text.
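To illustrate the kind of matching described above, here is a minimal sketch of a "Powered by" check that also extracts a version number as metadata when one is present. This is not Sniffles' actual implementation: the `SnifferSketch` module, the `powered_by` method, and the `:version` key are hypothetical names chosen for this example, and only the `:found` key mirrors the output shown in the usage examples.

```ruby
# Hypothetical sketch; not part of the Sniffles API.
module SnifferSketch
  # Looks for text such as "Powered by WordPress" or "Powered by WordPress 3.5.1".
  # An HTML tag (for example a link) is allowed between "by" and the name.
  def self.powered_by(html, platform)
    name    = Regexp.escape(platform)
    pattern = /powered\s+by\s+(?:<[^>]+>\s*)?#{name}(?:\s+(\d+(?:\.\d+)*))?/i

    match = pattern.match(html)
    return { :found => false } unless match

    result = { :found => true }
    # The version is optional metadata; include it only when it was captured.
    result[:version] = match[1] if match[1]
    result
  end
end

plain  = "<p>Powered by WordPress 3.5.1</p>"
linked = 'Proudly powered by <a href="http://wordpress.org/">WordPress</a>'

SnifferSketch.powered_by(plain, "WordPress")
SnifferSketch.powered_by(linked, "WordPress")
SnifferSketch.powered_by("<p>Hello</p>", "WordPress")
```

A check this simple misses sites that remove the footer text and can be fooled by pages that merely mention a platform, which is why such patterns should be treated as hints rather than proof.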
If you find a bug or want to add a feature to a sniffer, open an issue! The URL of an example page that Sniffles misidentifies is especially helpful. Pull requests are, of course, greatly appreciated.
Rubygems:
gem install sniffles
Bundler:
gem 'sniffles'
require 'sniffles'
require 'typhoeus' # Optional: Use your favorite HTTP client.
response = Typhoeus::Request.get("http://www.fastly.com/")
You can pass in a single sniffer:
Sniffles.sniff(response.body, :google_analytics)
# => {:google_analytics=>{:found=>true, :ua=>"UA-25770359-1"}}
Or multiple sniffers:
Sniffles.sniff(response.body, :google_analytics, :kissmetrics)
# => {:google_analytics=>{:found=>true, :ua=>"UA-25770359-1"}, :kissmetrics=>{:found=>false}}
Or an entire group of sniffers:
Sniffles.sniff(response.body, :analytics)
# => {:chartbeat=>{:found=>false},
# :clicky=>{:found=>false},
# :google_analytics=>{:found=>true, :ua=>"UA-185209-2"},
# :kissmetrics=>{:found=>false},
# :mixpanel=>{:found=>false},
# :quantcast=>{:found=>false}}
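Because the result is a plain Hash keyed by sniffer name, it can be filtered with ordinary Ruby. The sketch below keeps only the sniffers that reported a match and separates their metadata from the `:found` flag. The `results` literal is copied from the group example above so the snippet runs without the gem or a network call; the variable names `detected` and `metadata` are only for illustration.

```ruby
# In real use this hash would come from Sniffles.sniff(response.body, :analytics).
results = {
  :chartbeat        => { :found => false },
  :clicky           => { :found => false },
  :google_analytics => { :found => true, :ua => "UA-185209-2" },
  :kissmetrics      => { :found => false },
  :mixpanel         => { :found => false },
  :quantcast        => { :found => false }
}

# Keep only the sniffers that matched.
detected = results.select { |_name, data| data[:found] }

# For each match, drop the :found flag so only the extracted metadata remains.
metadata = {}
detected.each do |name, data|
  metadata[name] = data.reject { |key, _value| key == :found }
end

metadata.each do |name, details|
  puts "#{name}: #{details.inspect}"
end
```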
Here is a list of currently implemented sniffers, grouped by category. You can see a list of unimplemented sniffers by filtering issues by "sniffer".
For a complete list, see GitHub.