Open rossjones opened 7 years ago
Bonobo is a data processing toolkit for building ETL graphs in Python.
It would be really neat if there were a Datastore extension, in a similar vein to the opendatasoft extension, so that users could build pipelines like:
```python
# from https://github.com/python-bonobo/bonobo/blob/0.2/bonobo/examples/datasets/coffeeshops.py
from os.path import dirname, realpath, join

import bonobo
from bonobo.ext.opendatasoft import OpenDataSoftAPI

OUTPUT_FILENAME = realpath(join(dirname(__file__), 'coffeeshops.txt'))

graph = bonobo.Graph(
    OpenDataSoftAPI(dataset='liste-des-cafes-a-un-euro', netloc='opendata.paris.fr'),
    lambda row: '{nom_du_cafe}, {adresse}, {arrondissement} Paris, France'.format(**row),
    bonobo.FileWriter(path=OUTPUT_FILENAME),
)

if __name__ == '__main__':
    bonobo.run(graph)
    print('Import done, read {} for results.'.format(OUTPUT_FILENAME))
```
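For illustration, here is a rough sketch of what a Datastore reader might look like. `DatastoreReader` and its parameters are hypothetical names, not an existing bonobo or CKAN API. The sketch only assumes CKAN's `datastore_search` action with its `resource_id`, `limit` and `offset` parameters, and that bonobo accepts any callable yielding rows as the first node of a graph.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def _fetch_json(url):
    """Download a URL and decode the JSON body (default transport)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))


class DatastoreReader:
    """Hypothetical extractor that pages through a CKAN Datastore resource."""

    def __init__(self, resource_id, netloc, limit=100, fetch=_fetch_json):
        self.resource_id = resource_id
        self.netloc = netloc
        self.limit = limit
        # `fetch` is injectable so the reader can be exercised without a network.
        self.fetch = fetch

    def __call__(self):
        offset = 0
        while True:
            query = urlencode({
                'resource_id': self.resource_id,
                'limit': self.limit,
                'offset': offset,
            })
            url = 'https://{}/api/3/action/datastore_search?{}'.format(self.netloc, query)
            records = self.fetch(url)['result']['records']
            if not records:
                # An empty page means every row has been read.
                break
            yield from records
            offset += len(records)
```

Under those assumptions, `DatastoreReader(resource_id=..., netloc=...)` would take the place of `OpenDataSoftAPI(...)` as the first node of the graph, with the resource id and host supplied by the user.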
See also #211
FWIW I think this repo is closed now, most conversation has moved to https://github.com/ckan/ckan/discussions