GNU General Public License v3.0

===================================
OpenRefine Python Client Library
===================================

The OpenRefine Python Client Library provides an interface for communicating with an `OpenRefine <http://openrefine.org/>`_ server.

Currently, the following API is supported:

Configuration

By default, the OpenRefine server URL is http://127.0.0.1:3333. The environment variables OPENREFINE_HOST and OPENREFINE_PORT override the host and port.
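
As a rough illustration of the override behaviour described above, the following standalone Python 3 sketch resolves a server URL from the two environment variables and falls back to the documented defaults. The function name ``resolve_refine_url`` is hypothetical and is not part of this library's API; it only mirrors the behaviour stated here.

```python
import os


def resolve_refine_url(environ=None):
    """Build the server URL from OPENREFINE_HOST / OPENREFINE_PORT.

    Hypothetical helper for illustration only; the library's own
    lookup code may differ.
    """
    # Allow a plain dict to be passed in so the lookup is easy to test.
    environ = os.environ if environ is None else environ
    host = environ.get("OPENREFINE_HOST", "127.0.0.1")  # documented default host
    port = environ.get("OPENREFINE_PORT", "3333")       # documented default port
    return "http://{}:{}".format(host, port)


if __name__ == "__main__":
    print(resolve_refine_url())
```

Passing an explicit mapping instead of reading ``os.environ`` directly keeps the sketch free of side effects.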

In order to run all tests, a live Refine server is needed. No existing projects are affected.
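
Before running the tests it can be handy to confirm that something is listening on the configured host and port. The sketch below is a standalone Python 3 check, not part of this library, and the function name ``refine_port_is_open`` is hypothetical. It only verifies that a TCP connection succeeds; it does not confirm that the listener is actually a Refine server.

```python
import socket


def refine_port_is_open(host="127.0.0.1", port=3333, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper for illustration; defaults match the
    documented server address.
    """
    try:
        # create_connection raises OSError (refused, timeout, DNS failure).
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if refine_port_is_open():
        print("A server is listening on 127.0.0.1:3333")
    else:
        print("Nothing is listening on 127.0.0.1:3333")
```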

Installation

(Someone more familiar with Python's byzantine collection of installation frameworks is very welcome to improve/"best practice" all this.)

1. Install the dependencies (currently only ``urllib2_file``):

   ``sudo pip install -r requirements.txt``

   (If you don't have ``pip``, visit `pip-installer.org <http://www.pip-installer.org/en/latest/installing.html#install-or-upgrade-pip>`_.)

2. Ensure you have a Refine server running somewhere and, if necessary, set the environment variables as above.

3. Run tests, build, and install:

   ``python setup.py test # to do a subset, e.g., --test-suite tests.test_facet``

   ``python setup.py build``

   ``python setup.py install``

There is a Makefile that will do this too, and more.

TODO

The API so far has been filled out by building a test suite that carries out the actions in `David Huynh's Refine tutorial <http://davidhuynh.net/spaces/nicar2011/tutorial.pdf>`_. The tutorial shows off a wide range of Refine features but doesn't cover the entire suite. Notable exceptions currently include:

Contribute

Pull requests with passing tests welcome! Source is at https://github.com/PaulMakepeace/refine-client-py

Useful Tools

One aspect of development is watching HTTP transactions. To that end, I found `Fiddler <http://www.fiddler2.com/>`_ on Windows and `HTTPScoop <http://www.tuffcode.com/>`_ invaluable. The latter neither URL-decodes nor nicely formats JSON, but the `Online JavaScript Beautifier <http://jsbeautifier.org/>`_ will.
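
If an offline alternative is preferred, the same decoding and formatting can be done with the Python 3 standard library. This is a minimal sketch, assuming the captured text is a form-encoded value (where ``+`` means a space); the function name ``decode_captured`` is hypothetical and unrelated to this library.

```python
import json
from urllib.parse import unquote_plus


def decode_captured(encoded):
    """URL-decode a captured string and pretty-print it if it is JSON.

    Hypothetical helper for illustration. Falls back to the plain
    decoded text when the content is not valid JSON.
    """
    decoded = unquote_plus(encoded)  # treats '+' as a space, as in form bodies
    try:
        return json.dumps(json.loads(decoded), indent=2, sort_keys=True)
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return decoded


if __name__ == "__main__":
    print(decode_captured("%7B%22a%22%3A+1%7D"))
```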

History

OpenRefine used to be called Google Refine, and this library used to be called the Google Refine Python Client Library.

Credits

Paul Makepeace, author, paulm@paulm.com

David Huynh, `initial cut <http://markmail.org/message/jsxzlcu3gn6drtb7>`_

`Artfinder <http://www.artfinder.com/>`_, inspiration

Some data used in the test suite has been taken from publicly available sources,