ckan / ideas

[DEPRECATED] Use the main CKAN repo Discussions instead:
https://github.com/ckan/ckan/discussions

Datastore extension for Bonobo #199

Open rossjones opened 7 years ago

rossjones commented 7 years ago

Bonobo is a data processing toolkit for building ETL graphs in Python.

It would be really neat if there were a Datastore extension, in a similar vein to the opendatasoft extension, so that users could build pipelines like:


# from https://github.com/python-bonobo/bonobo/blob/0.2/bonobo/examples/datasets/coffeeshops.py

from os.path import dirname, realpath, join

import bonobo
from bonobo.ext.opendatasoft import OpenDataSoftAPI

OUTPUT_FILENAME = realpath(join(dirname(__file__), 'coffeeshops.txt'))

graph = bonobo.Graph(
    OpenDataSoftAPI(dataset='liste-des-cafes-a-un-euro', netloc='opendata.paris.fr'),
    lambda row: '{nom_du_cafe}, {adresse}, {arrondissement} Paris, France'.format(**row),
    bonobo.FileWriter(path=OUTPUT_FILENAME),
)

if __name__ == '__main__':
    bonobo.run(graph)
    print('Import done, read {} for results.'.format(OUTPUT_FILENAME))
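
For illustration, here is a rough sketch of what the reading side of such a Datastore extension might look like. It is not an existing Bonobo or CKAN API: `datastore_records`, `fetch_json` and their parameters are hypothetical names, and the only thing taken from CKAN is the `datastore_search` action with its `resource_id`, `limit` and `offset` parameters. It pages through a resource and yields one row (a dict) at a time, which is the shape the lambda in the example above expects.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def fetch_json(url):
    """Default transport: GET the URL and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def datastore_records(netloc, resource_id, page_size=100, fetch=fetch_json):
    """Yield every record of a Datastore resource, page by page.

    Hypothetical helper (an assumption for this sketch), not part of
    Bonobo or CKAN. `fetch` is injectable so the paging logic can be
    exercised without network access.
    """
    offset = 0
    while True:
        query = urlencode({
            'resource_id': resource_id,
            'limit': page_size,
            'offset': offset,
        })
        url = 'https://{}/api/3/action/datastore_search?{}'.format(netloc, query)
        payload = fetch(url)
        if not payload.get('success'):
            raise RuntimeError('datastore_search failed for {}'.format(url))
        records = payload['result']['records']
        if not records:
            # An empty page means we have read past the last record.
            break
        for record in records:
            yield record
        offset += len(records)
```

Assuming Bonobo calls the first node of a graph with no arguments and consumes what it yields, something like `functools.partial(datastore_records, netloc, resource_id)` could stand in where `OpenDataSoftAPI(...)` sits in the example above. A real extension would probably also want authentication and filter support.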

loleg commented 1 year ago

See also #211

rossjones commented 1 year ago

FWIW I think this repo is closed now; most conversation has moved to https://github.com/ckan/ckan/discussions