Open MaxGhenis opened 5 years ago
Here's my workaround: since aggregate geos are returned first, I get the index of the final aggregate geo ("World") and remove all geos with that index or lower.
Example:
import pandas as pd
import wbdata
df = wbdata.get_dataframe({'SP.POP.TOTL': 'pop'}).reset_index()
geos = pd.Series(df.country.unique())
world_index = geos[geos == 'World'].index[0]
aggs = geos[:world_index+1]
df[~df.country.isin(aggs)].head()
country | date | pop |
---|---|---|
Afghanistan | 2018 | NaN |
Afghanistan | 2017 | 35530081.0 |
Afghanistan | 2016 | 34656032.0 |
Afghanistan | 2015 | 33736494.0 |
Afghanistan | 2014 | 32758020.0 |
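The positional cut described above can also be written as a small helper that fails loudly when the marker geo is missing. This is only a sketch in plain Python: `drop_leading_aggregates` and the sample names are hypothetical and not part of wbdata, and it assumes the ordering observed in this thread (all aggregates before any country), which the API does not promise.

```python
def drop_leading_aggregates(names, last_aggregate="World"):
    """Return the names that come after `last_aggregate`, preserving order.

    Assumes every aggregate is listed before any country. That is the
    ordering observed in this thread, not a documented guarantee.
    """
    # dict.fromkeys de-duplicates while keeping first-seen order.
    unique = list(dict.fromkeys(names))
    if last_aggregate not in unique:
        raise ValueError(f"{last_aggregate!r} not found; cannot locate the cut point")
    cut = unique.index(last_aggregate)
    return unique[cut + 1:]


# Hand-written sample in the assumed order: aggregates first, then countries.
sample = ["Arab World", "Arab World", "Euro area", "World",
          "Afghanistan", "Afghanistan", "Albania"]
countries = drop_leading_aggregates(sample)
print(countries)
```

Raising on a missing marker is a deliberate choice here: if the ordering or the name ever changes, a silent cut at the wrong position would leave aggregates in the data.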
Not at the moment; that's how the WB API handles things. I suppose we could build that in manually without too much trouble by indicating a special code that means "actually just countries". Or we could have a constant. The difficulty there is that we'd ideally want to be able to identify which "countries" are aggregates at runtime. I'll noodle on that.
Another workaround is to use `[i for i in wbdata.get_country() if not i['incomeLevel']['value'] == "Aggregates"]`; that seems to be fairly comprehensive. I'll consider adding that as a utility in the next version.
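For anyone who wants to apply that filter to rows they have already fetched, here is a rough sketch in plain Python. It runs on hand-written sample records rather than a live `wbdata.get_country()` call. Only the `incomeLevel` -> `value` nesting is taken from the comprehension above; the `name` key, the row layout, and the helper names `is_aggregate` and `country_names` are assumptions made for illustration.

```python
def is_aggregate(record):
    """True if a country record is marked as an aggregate by income level."""
    return record["incomeLevel"]["value"] == "Aggregates"


def country_names(records):
    """Set of names for the records that are not aggregates."""
    # Assumes each record carries a "name" key (an assumption, see above).
    return {r["name"] for r in records if not is_aggregate(r)}


# Hand-written stand-ins for the records returned by the library.
records = [
    {"name": "World", "incomeLevel": {"value": "Aggregates"}},
    {"name": "Euro area", "incomeLevel": {"value": "Aggregates"}},
    {"name": "Afghanistan", "incomeLevel": {"value": "Low income"}},
]

# Data rows keyed by country name; the World value is left as a placeholder.
rows = [
    {"country": "World", "date": "2017", "pop": None},
    {"country": "Afghanistan", "date": "2017", "pop": 35530081.0},
    {"country": "Afghanistan", "date": "2016", "pop": 34656032.0},
]

keep = country_names(records)
filtered = [row for row in rows if row["country"] in keep]
print(filtered)
```

Unlike the positional cut, this does not depend on the order in which geos are returned, only on the income-level label being present on each record.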
I'd like to get data on all countries, and exclude aggregates. Is there a way to do this, e.g. with `get_dataframe`?