evangambit / JsonOfCounties

A repo containing various data (demographics, employment, etc.) in JSON form.

Request to remove the "unemployment_rate" keys for kalawao county inside county.json #2

Open ejohnson-amerilife opened 2 years ago

ejohnson-amerilife commented 2 years ago

Within the "bls" field, Kalawao County still has "unemployment_rate" keys with None as the value. Here is the bls data for Kalawao County:

{'2004': {'labor_force': None, 'employed': None, 'unemployed': None, 'unemployment_rate': None},
 '2008': {'labor_force': None, 'employed': None, 'unemployed': None, 'unemployment_rate': None},
 '2012': {'labor_force': None, 'employed': None, 'unemployed': None, 'unemployment_rate': None},
 '2016': {'labor_force': None, 'employed': None, 'unemployed': None, 'unemployment_rate': None},
 '2020': {'labor_force': None, 'employed': None, 'unemployed': None, 'unemployment_rate': None}}
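Until the data itself changes, a consumer could strip these keys on their side. The following is only a sketch of one way to do that: `drop_none` is a hypothetical helper (it is not part of this repo), and the `name` field in the sample record is made up for illustration.

```python
def drop_none(value):
    """Recursively drop dict keys whose value is None.

    Dicts that end up empty after pruning are dropped as well.
    Hypothetical helper, not part of the repo.
    """
    if isinstance(value, dict):
        cleaned = {}
        for key, child in value.items():
            child = drop_none(child)
            if child is None or child == {}:
                continue
            cleaned[key] = child
        return cleaned
    return value


# Sample record shaped like the data quoted above; "name" is an assumed field.
county = {
    "name": "kalawao county",
    "bls": {
        "2004": {"labor_force": None, "employed": None,
                 "unemployed": None, "unemployment_rate": None},
        "2008": {"labor_force": None, "employed": None,
                 "unemployed": None, "unemployment_rate": None},
    },
}

pruned = drop_none(county)
```

Because every value under "bls" is None in this sample, the whole "bls" entry disappears from `pruned`; real zeros and other falsy values are left alone.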

evangambit commented 2 years ago

Presuming this is related to https://github.com/evangambit/JsonOfCounties/issues/1, it's worth pointing out that even if I remove these None values, I still get this error (and the Stack Overflow solution fails with its own error).

I have an alternate solution:

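import pandas as pd

def flatten_json(j, r=None, prefix=None, delimiter='/'):
    # Flatten nested dicts into one dict keyed by delimiter-joined paths.
    if r is None:
        r = {}
    if prefix is None:
        prefix = []
    for k in j:
        assert delimiter not in k, k
        if isinstance(j[k], dict):
            # Forward the delimiter so a non-default one survives the recursion.
            flatten_json(j[k], r, prefix + [k], delimiter)
        else:
            r[delimiter.join(prefix + [k])] = j[k]
    return r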

# ...
if __name__ == '__main__':
    # ...
    df = pd.json_normalize([flatten_json(county) for county in counties])
    df.to_csv('counties.csv', index=False)

which seems to work fine (Google Sheet link).

Wondering what you think of that.
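To make the flattening concrete, here is a small self-contained sketch of the same idea applied to a record shaped like the Kalawao data. The sample record and its `name` field are assumptions for illustration, and this copy of the function forwards `delimiter` explicitly on the recursive call.

```python
def flatten_json(j, r=None, prefix=(), delimiter="/"):
    """Flatten nested dicts into a single dict keyed by joined paths."""
    if r is None:
        r = {}
    for k, v in j.items():
        # Keys must not contain the delimiter, or paths would be ambiguous.
        assert delimiter not in k, k
        if isinstance(v, dict):
            flatten_json(v, r, prefix + (k,), delimiter)
        else:
            r[delimiter.join(prefix + (k,))] = v
    return r


# Illustrative record only; "name" is an assumed field.
sample = {
    "name": "kalawao county",
    "bls": {"2020": {"labor_force": None, "unemployment_rate": None}},
}

flat = flatten_json(sample)
dotted = flatten_json(sample, delimiter=".")
```

Each nested path becomes one column name, such as `bls/2020/unemployment_rate`, and None values are carried through unchanged rather than breaking the flattening.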