ezkl / sniffles

Detects popular CMS, Javascript libraries, and other items of interest.
http://documentup.com/ezkl/sniffles
MIT License

Sniffles

Description

Sniffles parses HTML pages and searches for common patterns that suggest a page is using a popular CMS, an advertising platform, or a CSS or JS library.

The master branch is continuously tested against Rubies 1.9.3, 2.0.0, and 2.1.0, thanks to the fantastic Travis-CI.


What is a sniffer?

This library uses the term sniffer to refer to a pattern that determines whether a page is using a particular platform or library. A sniffer may also include methods to extract other metadata once a platform or library has been identified.
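To make the idea concrete, here is a minimal sketch of what a sniffer could look like in plain Ruby. It does not use Sniffles itself: the module name `ExampleGoogleAnalyticsSniffer`, its `sniff` method, and the `UA_PATTERN` regular expression are all hypothetical and are not taken from the library's source. The sketch only assumes that a sniffer first detects a platform and then extracts metadata, returning a hash shaped like the results in the Usage section.

```ruby
# Hypothetical sniffer sketch; NOT the actual Sniffles implementation.
module ExampleGoogleAnalyticsSniffer
  # Assumed pattern for classic Google Analytics property IDs, e.g. "UA-12345-1".
  UA_PATTERN = /\bUA-\d+-\d+\b/

  # Detects the platform and, when found, extracts the property ID as metadata.
  def self.sniff(html)
    ua = html[UA_PATTERN] # String#[] with a Regexp returns the match or nil
    return { :found => false } if ua.nil?

    { :found => true, :ua => ua }
  end
end

html = %q{<script>_gaq.push(['_setAccount', 'UA-12345-1']);</script>}
p ExampleGoogleAnalyticsSniffer.sniff(html)
p ExampleGoogleAnalyticsSniffer.sniff("<p>No tracking here</p>")
```

Keeping detection and extraction in one method means a page is scanned once per sniffer, though a real sniffer would likely need more than one pattern per platform.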

Work in progress!

Sniffles should be considered a work in progress. Many of the matching patterns are little more than regular expressions matching commonly found "Powered by" text.
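The sketch below illustrates why "Powered by" matching is fragile. The pattern `POWERED_BY_WORDPRESS` and the helper `powered_by_wordpress?` are hypothetical and written only for this example; the patterns Sniffles actually ships may differ.

```ruby
# Hypothetical pattern for illustration; the patterns Sniffles ships may differ.
# It allows an optional opening <a> tag between "powered by" and the name.
POWERED_BY_WORDPRESS = /powered\s+by\s+(?:<a\b[^>]*>\s*)?wordpress/i

# Returns true when the markup contains a "Powered by WordPress" phrase.
def powered_by_wordpress?(html)
  !(html =~ POWERED_BY_WORDPRESS).nil?
end

footer  = '<p>Proudly powered by <a href="https://wordpress.org/">WordPress</a></p>'
article = '<p>Five reasons we left sites powered by WordPress behind</p>'
plain   = '<p>Hello, world</p>'

puts powered_by_wordpress?(footer)   # a genuine footer credit
puts powered_by_wordpress?(article)  # an article that merely mentions the phrase
puts powered_by_wordpress?(plain)    # no mention at all
```

Both the footer and the article match, so the second page would be reported as a WordPress site even though it only talks about one. Example URLs of misidentified pages help tighten patterns like this.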

If you find a bug or want to add a feature to a sniffer, open an issue! The URL of an example page that Sniffles misidentifies is helpful. Pull requests are, of course, greatly appreciated.

Installation

Rubygems:

gem install sniffles

Bundler:

gem 'sniffles'

Usage

require 'sniffles'
require 'typhoeus' # Optional: Use your favorite HTTP client.

response = Typhoeus::Request.get("http://www.fastly.com/")

You can pass in a single sniffer:

Sniffles.sniff(response.body, :google_analytics)
# => {:google_analytics=>{:found=>true, :ua=>"UA-25770359-1"}}

Or multiple sniffers:

Sniffles.sniff(response.body, :google_analytics, :kissmetrics)
# => {:google_analytics=>{:found=>true, :ua=>"UA-25770359-1"}, :kissmetrics=>{:found=>false}}

Or an entire group of sniffers:

Sniffles.sniff(response.body, :analytics)
# => {:chartbeat=>{:found=>false},
# :clicky=>{:found=>false},
# :google_analytics=>{:found=>true, :ua=>"UA-185209-2"},
# :kissmetrics=>{:found=>false},
# :mixpanel=>{:found=>false},
# :quantcast=>{:found=>false}}
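Since a group result is an ordinary Hash keyed by sniffer name, the detected platforms can be picked out with standard Hash methods. The sketch below uses a hand-written `results` hash that copies the output above, so it runs without Sniffles or a network connection; the variable names are illustrative only.

```ruby
# Hand-written copy of the group result shown above, so the example
# runs on its own without calling Sniffles.sniff.
results = {
  :chartbeat        => { :found => false },
  :clicky           => { :found => false },
  :google_analytics => { :found => true, :ua => "UA-185209-2" },
  :kissmetrics      => { :found => false },
  :mixpanel         => { :found => false },
  :quantcast        => { :found => false }
}

# Keep only the sniffers that reported a match.
detected = results.select { |_name, info| info[:found] }

detected.each do |name, info|
  # Everything other than :found is metadata extracted by the sniffer.
  metadata = info.reject { |key, _value| key == :found }
  puts "#{name}: #{metadata.inspect}"
end
```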

Sniffers (HEAD)

Here is a list of currently implemented sniffers, grouped by category. You can see a list of unimplemented sniffers by filtering issues by "sniffer".

Advertising

Analytics

CMS

Javascript

Contributors

For a complete list, see GitHub.

Special Thanks