Gorcenski / women-streets-berlin

Exploring the women's history hidden in the street names of Berlin

Write a Python script to post-process street data and output it to JSON #7

Closed Gorcenski closed 6 years ago

Gorcenski commented 6 years ago

The pipeline is basically:

Download source geo and name-gender data > process geo data > merge with name-gender data > produce place-gender correlation data.

For this last step, a Python script will do the place-gender correlation and write the result in a more general data model: in this case, a JSON file that can later be used to generate a more presentable format, such as a Markdown file.
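A minimal sketch of what that last step could look like, not the script from this repository. It assumes the street names are already available as a plain list and the name-gender data as a dictionary from given name to a gender label; `correlate`, `to_json`, the `"F"`/`"M"` labels, and the record fields are all hypothetical names chosen for illustration.

```python
import json


def correlate(street_names, name_gender):
    """Attach a gender label to each street whose name contains a known given name."""
    records = []
    for street in street_names:
        # Split on spaces and hyphens so "Rosa-Luxemburg-Straße" yields "Rosa".
        tokens = street.replace("-", " ").split()
        # Take the label of the first token found in the lookup, else None.
        gender = next((name_gender[t] for t in tokens if t in name_gender), None)
        records.append({"street": street, "gender": gender})
    return records


def to_json(records):
    # ensure_ascii=False keeps characters such as "ß" readable in the output.
    return json.dumps(records, ensure_ascii=False, indent=2)


# Illustrative inputs only; the real data would come from the earlier pipeline steps.
streets = ["Rosa-Luxemburg-Straße", "Hauptstraße"]
name_gender = {"Rosa": "F", "Karl": "M"}

records = correlate(streets, name_gender)
print(to_json(records))
```

Matching on the first recognised token is a deliberate simplification; streets named after a surname only, or after a person whose given name is ambiguous, would need a richer lookup than this.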

This will be the final step in the automated extraction pipeline.

Gorcenski commented 6 years ago

OK, this step took a lot of digging into OSM data formats, which turned out to be necessary, but I was able to put together a Jupyter notebook that can do this. I'll move that code into a script.

Right now I'm just extracting street names and correlating them with gender info. Per-street geo data will be a separate task and slightly more complex, as the OSM data requires a bit more JB Weld to get it into the shape I want.
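As a rough illustration of the street-name extraction, here is a sketch that assumes the data is in OSM XML form, where streets are `way` elements carrying `highway` and `name` tags. The issue does not say which OSM format the notebook actually reads, so treat the format choice, the `street_names` helper, and the sample data (including the way ids) as assumptions.

```python
import xml.etree.ElementTree as ET

# Made-up sample in OSM XML layout; real extracts are far larger.
OSM_SAMPLE = """<osm version="0.6">
  <way id="1">
    <nd ref="10"/>
    <tag k="highway" v="residential"/>
    <tag k="name" v="Rosa-Luxemburg-Straße"/>
  </way>
  <way id="2">
    <tag k="highway" v="residential"/>
    <tag k="name" v="Rosa-Luxemburg-Straße"/>
  </way>
  <way id="3">
    <tag k="building" v="yes"/>
    <tag k="name" v="Rathaus"/>
  </way>
</osm>"""


def street_names(osm_xml):
    """Return the sorted, de-duplicated names of all ways tagged as highways."""
    root = ET.fromstring(osm_xml)
    names = set()
    for way in root.iter("way"):
        # Collapse the <tag k=... v=...> children into a plain dict.
        tags = {t.get("k"): t.get("v") for t in way.findall("tag")}
        # Keep only named ways that are streets, skipping buildings and the like.
        if "highway" in tags and "name" in tags:
            names.add(tags["name"])
    return sorted(names)


print(street_names(OSM_SAMPLE))
```

A single street is often split across several ways in OSM, which is why the names are collected into a set. That same splitting is part of what makes per-street geometry harder than name extraction.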

Gorcenski commented 6 years ago

For the initial work, this is completed in https://github.com/Gorcenski/women-streets-berlin/commit/7955c6df0d990e0feb644ff459f89e0e1964816e