Gorcenski / women-streets-berlin

Exploring the women's history hidden in the street names of Berlin

Build an automated processing pipeline #6

Open Gorcenski opened 6 years ago

Gorcenski commented 6 years ago

The data processing pipeline should perform the following steps:

This can probably be accomplished with a shell script and a basic Python script. I haven't decided on the output format yet, so that remains to be determined.

Gorcenski commented 6 years ago

I've added a script that extracts the data from the OSM files and writes it out in a tab-delimited format. The next step is to write a Python script that does some data handling and outputs JSON or some other slightly more workable format, and then to integrate that script into the extraction pipeline.
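
As a rough sketch of what that Python step might look like, the snippet below converts tab-delimited text to JSON using only the standard library. The column names (`osm_id`, `name`), the function name `tsv_to_json`, and the sample rows are assumptions made for illustration; the actual columns produced by the extraction script are not specified here, and the output format is still undecided.

```python
import csv
import io
import json


def tsv_to_json(tsv_text, fieldnames=("osm_id", "name")):
    """Convert tab-delimited text to a JSON array of objects.

    `fieldnames` is a hypothetical column layout; adjust it to match
    whatever the extraction script actually emits.
    """
    # QUOTE_NONE keeps any quote characters in street names as literal text.
    reader = csv.reader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    records = []
    for row in reader:
        # Skip blank lines and rows that contain only whitespace.
        if not row or all(not cell.strip() for cell in row):
            continue
        records.append(dict(zip(fieldnames, (cell.strip() for cell in row))))
    # ensure_ascii=False keeps characters such as "ß" and "ü" readable.
    return json.dumps(records, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    # Made-up sample rows, not real extracted data.
    sample = "123\tRosa-Luxemburg-Straße\n\n456\tClara-Zetkin-Straße\n"
    print(tsv_to_json(sample))
```

In a pipeline, the shell script could write the tab-delimited file and then call a script like this on it, so both steps run from a single command.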

I've also decided to put both the source and the processed files into the data folder in the repo. With the source files included, users can run the pipeline themselves; with the processed files included, they don't have to.

Gorcenski commented 6 years ago

Updated this with a more comprehensive checklist of what this pipeline entails.