pydata / pandas-datareader

Extract data from a wide range of Internet sources into a pandas DataFrame.
https://pydata.github.io/pandas-datareader/stable/index.html

wb.download country: add option to exclude aggregates #754

Open data-coder opened 4 years ago

data-coder commented 4 years ago

It would be nice if we could pass an option to wb.download to exclude aggregates. When using country='all' I'm also getting aggregate data like 'World' and 'Euro area'.

wb.get_countries()['region'].unique()
array(['Latin America & Caribbean ', 'South Asia', 'Aggregates',
       'Sub-Saharan Africa ', 'Europe & Central Asia',
       'Middle East & North Africa', 'East Asia & Pacific',
       'North America'], dtype=object)

Or please let me know if this is already implemented. I took a look at https://github.com/pydata/pandas-datareader/blob/master/pandas_datareader/wb.py but couldn't find it.

MaxGhenis commented 4 years ago

My workaround is to keep everything from Afghanistan on, since the aggregate regions are at the top. It'd be great to avoid this!

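from pandas_datareader import wb

# Download every entry, aggregates included, and flatten the (country, year) index.
raw = wb.download(indicator='SP.POP.TOTL',
                  country='all', start=2016,
                  end=2016).reset_index()
# The aggregates come first, so slice from the first Afghanistan row onward.
first_non_agg_ix = raw.index[raw.country == 'Afghanistan'].tolist()[0]
dat = raw.iloc[first_non_agg_ix:]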
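
A workaround that does not depend on row order could use the 'Aggregates' region shown in the wb.get_countries() output above. The following is only a sketch: drop_aggregates is a hypothetical helper, not part of pandas-datareader, and it assumes that the 'name' column of wb.get_countries() matches the 'country' values returned by wb.download. Some region labels in the output above have trailing spaces, so the comparison strips whitespace first.

```python
import pandas as pd


def drop_aggregates(data: pd.DataFrame, countries: pd.DataFrame) -> pd.DataFrame:
    """Return the rows of `data` whose country is not listed as an aggregate.

    Hypothetical helper, not part of pandas-datareader.
    `data` needs a 'country' column, e.g. wb.download(...).reset_index().
    `countries` needs 'name' and 'region' columns, as in wb.get_countries().
    """
    # Region labels can carry trailing whitespace, so strip before comparing.
    is_aggregate = countries["region"].str.strip() == "Aggregates"
    aggregate_names = set(countries.loc[is_aggregate, "name"])
    # Keep only rows whose country name is not in the aggregate set.
    return data[~data["country"].isin(aggregate_names)]


# Intended usage (needs network access, so it is left commented out):
# from pandas_datareader import wb
# raw = wb.download(indicator="SP.POP.TOTL", country="all",
#                   start=2016, end=2016).reset_index()
# dat = drop_aggregates(raw, wb.get_countries())
```

Unlike slicing from Afghanistan onward, this should keep working if the World Bank API changes the order in which aggregates and countries are returned, provided the names in the two tables continue to match.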