twbell / GPLplanet

Open Source PHP library for creating and employing a local instance of Yahoo GeoPlanet

GPLplanet

Notice: See WOEPlanet (https://woeplanet.org/about/) for something more up-to-date and better.

INTRODUCTION

GPLplanet: an open source PHP library to assist in employing local instances of Yahoo GeoPlanet(tm) in production.

"Yahoo! GeoPlanet is a resource for managing all geo-permanent named places on Earth. It provides the geographic developer community with the vocabulary and grammar to describe the world's geography in an unequivocal, permanent, and language-neutral manner. Developers can geo-enable their applications by using GeoPlanet to traverse the global spatial hierarchy, identify the geography relevant to their users and their businesses, and in turn, unambiguously geotag, geotarget, and geolocate data across the Web" http://developer.yahoo.com/geo/geoplanet/

Yahoo makes GeoPlanet available as both a Web service and data download.
Due to call-per-day restrictions, latency concerns, or the need for offline use, small businesses and startups may be unwilling or unable to build their core geographic platforms on Yahoo Web services. This library works against a local copy of GeoPlanet to provide similar functionality, giving developers easy access to GeoPlanet through a PHP class and calling Web services only where required. In a nutshell:

These libraries are intended to make a local instance of GeoPlanet more accessible and easier to understand before implementing in production. If you are looking for a simple GeoPlanet Web-service-only wrapper, see Tyler Hall's http://github.com/tylerhall/php-geoplanet/. Also consider using Chris Heilmann's 'GeoPlanet Explorer' website for an easy, ad-hoc woeid and placename lookup: http://isithackday.com/geoplanet-explorer/

The GeoPlanet data dump (http://developer.yahoo.com/geo/geoplanet/data/) was recently taken offline by Yahoo. The most recent version of the file remains available, and always will be, via the Internet Archive (http://archive.org/details/geoplanet_data_7.10.0.zip). This repo includes an SQL version of GeoPlanet 7.10.0 optimised for this codebase.

Lastly: a reminder that the GeoPlanet data dump does not contain coordinates, which must be obtained from the GeoPlanet web service directly or from another third party.

GETTING STARTED

Create and Populate Database

gunzip [path/to/gplplanet]/import/gplplanet.sql.zip
mysql -u [username] -p -e "CREATE DATABASE geo"
mysql -u [username] -p --max_allowed_packet=1G geo < [path/to/]gplplanet.sql

Or import the GeoPlanet TSV files

(see below)

The default database name is 'geo'. You can choose any name, but make sure it matches the one configured in config.ini.

METHOD EXAMPLES

Require the geoengine class and get an instance thereof:

require_once('class.geoengine.php');            
$engine = geoengine::getInstance();             //geoengine is a factory singleton

Start geoplaneting:

$engine->getChildren(31278);                    //children of woeid 31278 (Oxford, UK)
$engine->getParent(31278);                      //parent of woeid 31278
$engine->getElevationByWOEID(2461928);          //elevation of woeid 2461928 (Northfield, MN) -- (uses web service)
$engine->getByName("springfield","UK");         //all places called "Springfield" in UK
$engine->disambiguate("springfield");           //the most likely 'Springfield'
$engine->geocode("1 infinity loop, Cupertino, ca"); //geocodes -- (uses web service)
$engine->reverseGeocode(0.151490,52.145329);    //reverse geocodes -- (uses web service)
$engine->getBbox(727232);                       //bounding box of Amsterdam -- (uses web service)
$engine->filterByType($engine->getDescendants(23424977),7); //all towns in US
$engine->getAdjacencies(2468964);               //all entities surrounding woeid 2468964 (Pasadena, CA)
$engine->getGeo(2468964);                       //get an object representation of woeid 2468964

GEO OBJECTS

Geo objects are lightweight entities that share a common geoengine singleton -- basically a factory class -- where the bulk of the code (and processing power) lies. They are intended to be employed as handy encapsulations but are not required. All methods return WOEIDs as INT or arrays of INT, which can be instantiated as geo objects.

    $engine->getGeo(INT);               //instantiates single woeid as geo object
    $engine->getMultiGeo(ARRAY);        //instantiates multiple woeids as array of geo objects

Geo objects contain mostly identical methods using the same naming convention as the geoengine factory class:

    $engine->getParent(31278);

serves the same purpose as:

    $geo = $engine->getGeo(31278);  
    $geo->getParent();

and is equally efficient.
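
The relationship between geo objects and the shared engine can be pictured with the sketch below. It is not GPLplanet's actual code: `SketchEngine`, `SketchGeo`, and the hard-coded parent table are hypothetical stand-ins, used only to show a lightweight object forwarding its calls to one shared singleton.

```php
<?php
// Hypothetical stand-in for the geoengine singleton; not the library's code.
final class SketchEngine
{
    private static ?SketchEngine $instance = null;

    // Illustrative child => parent table; real data would come from the database.
    private array $parents = [3 => 2, 2 => 1];

    private function __construct() {}

    public static function getInstance(): SketchEngine
    {
        if (self::$instance === null) {
            self::$instance = new SketchEngine();
        }
        return self::$instance;
    }

    public function getParent(int $woeid): ?int
    {
        return $this->parents[$woeid] ?? null;
    }

    public function getGeo(int $woeid): SketchGeo
    {
        return new SketchGeo($woeid, $this);
    }
}

// Lightweight wrapper: holds a WOEID and forwards every call to the engine.
final class SketchGeo
{
    private int $woeid;
    private SketchEngine $engine;

    public function __construct(int $woeid, SketchEngine $engine)
    {
        $this->woeid = $woeid;
        $this->engine = $engine;
    }

    public function getParent(): ?int
    {
        return $this->engine->getParent($this->woeid);
    }
}

$engine = SketchEngine::getInstance();
$geo = $engine->getGeo(3);
var_dump($engine->getParent(3) === $geo->getParent()); // both routes give the same answer
```

Because the wrapper stores only a WOEID and a reference to the engine, creating many geo objects in this sketch adds very little overhead beyond the engine itself.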

USE OF EXTERNAL WEB SERVICES

Some methods call Yahoo and other web services via YQL. The results of these calls are cached, so if (for example) you request the bounding box of an entity, the web service will not be called when you request it a second time. The general idea is to incorporate web service calls seamlessly, and only where required.
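
The cache-on-first-call behaviour described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `CachedLookup`, the in-memory array cache, and the fake bounding-box fetcher are all assumptions, since this README does not say where GPLplanet stores cached results.

```php
<?php
// Hypothetical memoizing wrapper; GPLplanet's real cache storage is not documented here.
final class CachedLookup
{
    /** @var callable */
    private $fetcher;
    private array $cache = [];

    public function __construct(callable $fetcher)
    {
        $this->fetcher = $fetcher;
    }

    public function get(int $woeid)
    {
        // Call the remote fetcher only when this WOEID has not been seen before.
        if (!array_key_exists($woeid, $this->cache)) {
            $this->cache[$woeid] = ($this->fetcher)($woeid);
        }
        return $this->cache[$woeid];
    }
}

$remoteCalls = 0;
// Stand-in for a web service request; returns made-up coordinates.
$fakeBboxService = function (int $woeid) use (&$remoteCalls): array {
    $remoteCalls++;
    return ['woeid' => $woeid, 'bbox' => [0.0, 0.0, 1.0, 1.0]];
};

$bboxes = new CachedLookup($fakeBboxService);
$first = $bboxes->get(727232);
$second = $bboxes->get(727232); // served from the cache
echo $remoteCalls, PHP_EOL;     // prints 1
```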

COMMAND LINE SCRIPTS

Example command line scripts live in the scripts folder:

    php children.php 12345  //returns children of woeid 12345 as JSON array
    php get.php 12345   //returns JSON hash representation of woeid 12345
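
Because the scripts emit JSON, their output can be consumed from another PHP program with `json_decode`. The sketch below decodes a hard-coded sample string rather than running the script; the sample WOEIDs are made up, and the exact shape of the real output is an assumption based on the comments above.

```php
<?php
// Made-up sample standing in for the output of `php children.php 12345`.
$sampleOutput = '[111, 222, 333]';

// JSON_THROW_ON_ERROR (PHP 7.3+) turns malformed output into an exception.
$children = json_decode($sampleOutput, true, 512, JSON_THROW_ON_ERROR);

// Keep only integer values, since the library documents WOEIDs as INT.
$woeids = array_values(array_filter($children, 'is_int'));

foreach ($woeids as $woeid) {
    echo $woeid, PHP_EOL;
}
```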

DIR OVERVIEW

REQUIRED LIBS

Not tested with version 4 of either one.

IMPORTING GEOPLANET DATA

  1. Add database vars to config.ini
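  2. Download GeoPlanet data from http://developer.yahoo.com/geo/geoplanet/data/ (now offline; use the Internet Archive copy at http://archive.org/details/geoplanet_data_7.10.0.zip)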
  3. Assign tsv filenames to the variables in import.php
  4. Run import.php from the command line (e.g. "php import.php")
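
To give a feel for the kind of data the import step works with, here is a sketch of parsing tab-separated rows in PHP. It is not `import.php` itself. The column names (WOE_ID, ISO, Name, Language, PlaceType, Parent_ID) and the sample rows are assumptions for illustration; check them against the header row of the downloaded files.

```php
<?php
// Illustrative sample; the header and rows are assumed, not copied from the real dump.
$sampleTsv = <<<TSV
"WOE_ID"\t"ISO"\t"Name"\t"Language"\t"PlaceType"\t"Parent_ID"
100\t"XX"\t"Example Town"\t"ENG"\t"Town"\t10
101\t"XX"\t"Example Suburb"\t"ENG"\t"Suburb"\t100
TSV;

$lines = explode("\n", $sampleTsv);

// First line is the header; parse it with a tab delimiter and double-quote enclosure.
$header = str_getcsv(array_shift($lines), "\t", '"', "\\");

$places = [];
foreach ($lines as $line) {
    if (trim($line) === '') {
        continue; // skip blank lines
    }
    $fields = str_getcsv($line, "\t", '"', "\\");
    $row = array_combine($header, $fields);
    // Index each row by its WOEID for quick lookup.
    $places[(int) $row['WOE_ID']] = $row;
}

echo $places[101]['Name'], ' has parent ', $places[101]['Parent_ID'], PHP_EOL;
```

Indexing rows by WOEID mirrors how the rest of the library refers to places, though the actual import writes to the database rather than to an in-memory array.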

Import Notes:

TODO

SOURCE

/**
 * @package gplplanet
 * @author Tyler Bell tylerwbell[at]gmail[dot]com
 * @copyright (C) 2009-2012 - Tyler Bell
 * @license GNU General Public License
 */