OliverSherouse / wbdata

A python library for accessing world bank data
GNU General Public License v2.0

date kwarg does not seem to work for get_dataframe() #71

Open econstack opened 4 months ago

econstack commented 4 months ago

Thank you for the update! After changing the kwarg data_date to date for the get_dataframe() method, it does not seem to accept tuples of datetime objects or of strings. Both wbdata.get_dataframe({"NY.GDP.MKTP.CD": "value"}, date=["2000", "2022"]) and

from datetime import datetime
import wbdata

start_year = datetime(2000, 1, 1)
end_year = datetime(2022, 1, 1)
df = wbdata.get_dataframe({"NY.GDP.MKTP.CD": "value"}, date=[start_year, end_year])

raise TypeError: expected string or bytes-like object, with the traceback pointing to this line in the _parse_date method:

    if PATTERN_YEAR.fullmatch(date):
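
For context, this is the error Python's re module raises whenever a compiled pattern is matched against something that is not a string. A minimal sketch, assuming a stand-in four-digit-year pattern (the real PATTERN_YEAR in wbdata may differ) and a hypothetical helper named fullmatch_outcome:

```python
import re
from datetime import datetime

# Stand-in for wbdata's PATTERN_YEAR; the real pattern may differ.
PATTERN_YEAR = re.compile(r"\d{4}")


def fullmatch_outcome(value):
    """Hypothetical helper: report what PATTERN_YEAR.fullmatch does with value."""
    try:
        return "match" if PATTERN_YEAR.fullmatch(value) else "no match"
    except TypeError:
        # Raised when value is not a str or bytes-like object.
        return "TypeError"


print(fullmatch_outcome("2000"))                # a plain string can be matched
print(fullmatch_outcome(["2000", "2022"]))      # a list is not a string
print(fullmatch_outcome(datetime(2000, 1, 1)))  # neither is a datetime
```

If the list or datetime reaches the regex check unchanged, the TypeError above is the expected result.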

econstack commented 4 months ago

Additional information: I reverted to version 0.3, and it seems that the kwargs country and data_date no longer work there either.

OliverSherouse commented 4 months ago

Thank you for the report. I can reproduce this behavior and will investigate further. I'm surprised that you had any trouble with version 0.3, since no changes were made to that version. As a workaround, you should be able to pull the whole series as a dataframe and slice down to the countries and years you're looking for.
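
A minimal sketch of that slicing workaround, using pandas only. It assumes the frame is indexed by (country, date) with the years stored as strings, and it uses a small hand-built stand-in frame instead of a live wbdata.get_dataframe() call; the values are copied from output posted in this thread. The helper name slice_frame is hypothetical.

```python
import pandas as pd

# Stand-in for the frame returned by wbdata.get_dataframe({"NY.GDP.MKTP.CD": "value"}).
# Assumption: a (country, date) MultiIndex with years stored as strings.
index = pd.MultiIndex.from_tuples(
    [
        ("Africa Eastern and Southern", "2022"),
        ("Africa Eastern and Southern", "2021"),
        ("Zimbabwe", "2004"),
        ("Zimbabwe", "2003"),
        ("Zimbabwe", "2002"),
        ("Zimbabwe", "2001"),
        ("Zimbabwe", "2000"),
    ],
    names=["country", "date"],
)
values = [
    1.185138e12, 1.086531e12,
    5.805598e09, 5.727592e09, 6.342116e09, 6.777385e09, 6.689958e09,
]
df = pd.DataFrame({"value": values}, index=index)


def slice_frame(frame, countries, start_year, end_year):
    """Hypothetical helper: keep the given countries and an inclusive year range."""
    country = frame.index.get_level_values("country")
    year = frame.index.get_level_values("date").astype(int)
    mask = country.isin(countries) & (year >= start_year) & (year <= end_year)
    return frame[mask]


print(slice_frame(df, ["Zimbabwe"], 2001, 2003))
```

If the date level holds datetime values rather than strings (for example, when dates are parsed on load), the year comparison would need to use the .year attribute of that level instead of astype(int).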

OliverSherouse commented 4 months ago

Bit of an update here. It seems that some of the problem was upstream, and has now been fixed. Unfortunately, wbdata isn't properly handling the dates as a list. But if you specify them as a tuple, your examples work:

In [20]: wbdata.get_dataframe({"NY.GDP.MKTP.CD": "value"}, date=("2000", "2022"))
Out[20]: 
                                         value
country                     date              
Africa Eastern and Southern 2022  1.185138e+12
                            2021  1.086531e+12
                            2020  9.288802e+11
                            2019  1.006191e+12
                            2018  1.012521e+12
...                                        ...
Zimbabwe                    2004  5.805598e+09
                            2003  5.727592e+09
                            2002  6.342116e+09
                            2001  6.777385e+09
                            2000  6.689958e+09

[6118 rows x 1 columns]

So that is your new workaround until I can fix this.

OliverSherouse commented 4 months ago

In looking at this, I realized that it isn't exactly a bug: that parameter is only supposed to take a string, a datetime, or a 2-tuple of either. But it would probably be friendly to allow lists as well, since I doubt you're the only person who has tried this.
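
Until lists are accepted, callers can convert on their side. A minimal sketch, assuming a hypothetical helper named normalize_date_arg (not part of wbdata) that turns a two-item list into the 2-tuple the date parameter expects and leaves everything else untouched:

```python
from datetime import datetime


def normalize_date_arg(date):
    """Hypothetical helper (not part of wbdata): convert a 2-item list to a 2-tuple."""
    if isinstance(date, list):
        if len(date) != 2:
            raise ValueError("expected exactly two dates: [start, end]")
        return tuple(date)
    # Strings, datetimes, and tuples are passed through unchanged.
    return date


print(normalize_date_arg(["2000", "2022"]))
print(normalize_date_arg([datetime(2000, 1, 1), datetime(2022, 1, 1)]))

# Intended use (requires network access, so it is left commented out):
# import wbdata
# df = wbdata.get_dataframe(
#     {"NY.GDP.MKTP.CD": "value"},
#     date=normalize_date_arg(["2000", "2022"]),
# )
```

This keeps the conversion in user code, so it works the same way whether or not a later wbdata release starts accepting lists directly.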