Open glatterf42 opened 3 months ago
@danielhuppmann Feel free to close this whenever you consider this migration done. Maybe opening this issue was never needed in the first place, since #827 already migrated the read_worldbank() function.
Yes, my bad - I didn't get that you already fixed the issue in your PR. But let's leave this topic open anyway as a reminder to revisit the WorldBank-integration feature and see if the unit-issue can be fixed.
As it is, I'm not sure there is an officially supported way of retrieving the unit from the WorldBank data. For example:
>>> indicator = "NY.GDP.PCAP.PP.KD"
>>> new = wbdata.get_dataframe(indicators={indicator: "GDP"}, country=["CAN", "MEX", "USA"], date=("2003", "2005"))
>>> new
                             GDP
country       date
Canada        2005  44683.764981
              2004  43704.669134
              2003  42791.094678
Mexico        2005  19144.014627
              2004  19017.753814
              2003  18634.896456
United States 2005  54331.658336
              2004  52989.030694
              2003  51497.734688
>>> new.index
MultiIndex([(       'Canada', '2005'),
            (       'Canada', '2004'),
            (       'Canada', '2003'),
            (       'Mexico', '2005'),
            (       'Mexico', '2004'),
            (       'Mexico', '2003'),
            ('United States', '2005'),
            ('United States', '2004'),
            ('United States', '2003')],
           names=['country', 'date'])
>>>
>>> new.columns
Index(['GDP'], dtype='object')
>>> result = wbdata.get_indicators(indicator)
>>> result
id                 name
-----------------  ---------------------------------------------------
NY.GDP.PCAP.PP.KD  GDP per capita, PPP (constant 2017 international $)
>>> raw = wbdata.get_data(indicator, country=["CAN", "MEX", "USA"], date=("2003", "2005"))
>>> raw
[{'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'CA', 'value': 'Canada'}, 'countryiso3code': 'CAN', 'date': '2005', 'value': 44683.764981042, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'CA', 'value': 'Canada'}, 'countryiso3code': 'CAN', 'date': '2004', 'value': 43704.6691337093, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'CA', 'value': 'Canada'}, 'countryiso3code': 'CAN', 'date': '2003', 'value': 42791.0946777734, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'MX', 'value': 'Mexico'}, 'countryiso3code': 'MEX', 'date': '2005', 'value': 19144.014627364, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'MX', 'value': 'Mexico'}, 'countryiso3code': 'MEX', 'date': '2004', 'value': 19017.7538141902, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'MX', 'value': 'Mexico'}, 'countryiso3code': 'MEX', 'date': '2003', 'value': 18634.8964558406, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'US', 'value': 'United States'}, 'countryiso3code': 'USA', 'date': '2005', 'value': 54331.6583361399, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'US', 
'value': 'United States'}, 'countryiso3code': 'USA', 'date': '2004', 'value': 52989.0306944184, 'unit': '', 'obs_status': '', 'decimal': 0}, {'indicator': {'id': 'NY.GDP.PCAP.PP.KD', 'value': 'GDP per capita, PPP (constant 2017 international $)'}, 'country': {'id': 'US', 'value': 'United States'}, 'countryiso3code': 'USA', 'date': '2003', 'value': 51497.7346884645, 'unit': '', 'obs_status': '', 'decimal': 0}]
What we'd probably want to have as the unit, 'PPP (constant 2017 international $)', is only available as part of 'name' (in the get_indicators() output) or 'value' (in the 'indicator' entry of the raw data), while the 'unit' key that exists for the raw data is empty. It might be worth opening an issue with https://github.com/OliverSherouse/wbdata/tree/master if this is a feature we want to see.
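Until there is a supported way to do this, one possible workaround is to parse the unit out of the indicator name ourselves. The following is only a rough sketch: `split_name_and_unit` and `fill_missing_units` are hypothetical helpers (not part of wbdata or pyam), and the rule "the unit is the trailing parenthetical of the name" is an assumption that will not hold for every indicator. It also keeps only the text inside the parentheses, so a prefix like "PPP" stays in the label and would need an extra rule.

```python
import re

# Hypothetical helpers for illustration only; not part of wbdata or pyam.
# Assumption: the unit is the last parenthesised chunk at the end of the name.
_TRAILING_PARENS = re.compile(r"^(?P<label>.*?)\s*\((?P<unit>[^()]*)\)\s*$")


def split_name_and_unit(name):
    """Split an indicator name into (label, unit) on a trailing parenthetical."""
    match = _TRAILING_PARENS.match(name)
    if match is None:
        # No trailing "(...)": keep the name and report an empty unit.
        return name, ""
    return match.group("label"), match.group("unit")


def fill_missing_units(records):
    """Return copies of raw records with an empty 'unit' filled from the name."""
    filled = []
    for record in records:
        record = dict(record)  # shallow copy, so the input list is not modified
        if not record.get("unit"):
            _, record["unit"] = split_name_and_unit(record["indicator"]["value"])
        filled.append(record)
    return filled


# One record copied from the raw output above.
sample = [{
    "indicator": {"id": "NY.GDP.PCAP.PP.KD",
                  "value": "GDP per capita, PPP (constant 2017 international $)"},
    "country": {"id": "CA", "value": "Canada"},
    "countryiso3code": "CAN", "date": "2005", "value": 44683.764981042,
    "unit": "", "obs_status": "", "decimal": 0,
}]

print(fill_missing_units(sample)[0]["unit"])  # constant 2017 international $
```

Names without a trailing parenthetical come back with an empty unit, so they would still need manual handling.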
Possibly related to #815.
#827 migrates from pandas-datareader to wbdata and implements some changes needed for that.