mortada / fredapi

Python API for FRED (Federal Reserve Economic Data) and ALFRED (Archival FRED)
Apache License 2.0

FRED API anomalies #69

Open leaderanalytics opened 7 months ago

leaderanalytics commented 7 months ago

I wrote a client for FRED like fredapi, except that I used C#/.NET. While writing my client I came across some ambiguities in the FRED documentation. I also found cases where the data I received from the API was not consistent with the documented API.

I wrote to the FRED team several times, and their responses only raised more questions. I eventually wound up creating my own definitions and writing my code against my own spec. I have long wondered, however, whether I am misunderstanding the documentation or perhaps not using the API in the way it is intended to be used. After running a few queries with fredapi, I see some cases where the same anomalies I found in the API are apparent in the results returned by fredapi.

1.) Why does "get latest data known on a given date" return multiple rows for a given observation date? Since we are querying for the latest data, I would expect to see only one row per observation date, i.e. the latest one. For the example shown here, I would expect to see only these rows:

2013-10-01  2014-03-27  17089.6
2014-01-01  2014-05-29  17101.3
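The expectation above can be sketched in plain Python. This is only a minimal illustration of "one row per observation date, keeping the latest vintage"; it is not how fredapi or FRED computes its result. The helper name `latest_per_observation` is hypothetical, and the rows are limited to the values quoted in this issue.

```python
from datetime import date

# Vintage rows as (observation_date, realtime_start, value).
# Only values quoted in this issue are used.
rows = [
    (date(2013, 10, 1), date(2014, 1, 30), 17102.5),
    (date(2013, 10, 1), date(2014, 2, 28), 17080.7),
    (date(2013, 10, 1), date(2014, 3, 27), 17089.6),
    (date(2014, 1, 1), date(2014, 5, 29), 17101.3),
]


def latest_per_observation(rows):
    """Keep the row with the greatest realtime_start for each observation date.

    Hypothetical helper for illustration; not part of fredapi.
    """
    latest = {}
    for obs_date, realtime_start, value in rows:
        current = latest.get(obs_date)
        if current is None or realtime_start > current[1]:
            latest[obs_date] = (obs_date, realtime_start, value)
    # Return the surviving rows ordered by observation date.
    return [latest[d] for d in sorted(latest)]


latest_rows = latest_per_observation(rows)
for obs_date, realtime_start, value in latest_rows:
    print(obs_date, realtime_start, value)
```

With these inputs the two surviving rows are the 2014-03-27 vintage for 2013-10-01 and the 2014-05-29 vintage for 2014-01-01.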

2.) Not all vintage dates are returned for a realtime query that spans multiple vintage periods:

fredapi documentation:

get_series_all_releases(self, series_id, realtime_start=None, realtime_end=None)

Get all data for a Fred series id including first releases and all revisions. This returns a DataFrame with three columns: 'date', 'realtime_start', and 'value'. For instance, the US GDP for Q4 2013 was first released to be 17102.5 on 2014-01-30, and then revised to 17080.7 on 2014-02-28, and then revised to 17089.6 on 2014-03-27. You will therefore get three rows with the same 'date' (observation date) of 2013-10-01 but three different 'realtime_start' of 2014-01-30, 2014-02-28, and 2014-03-27 with corresponding 'value' of 17102.5, 17080.7 and 17089.6

This query:

fred.get_series('GDP', realtime_start='2014-01-28', realtime_end='2014-03-30', observation_start='2013-10-01', observation_end='2013-10-01')

returns only one result:

2013-10-01    17089.6

Based on the spreadsheet for GDP downloaded from Alfred, three vintages for observation period 2013-10-01 were released between 2014-01-28 and 2014-03-30:

2014-01-30  17102.5
2014-02-28  17080.7
2014-03-27  17089.6

Again, note that the fredapi documentation says "...including first releases and all revisions", so I would expect to see all three vintages. Also note that the date returned by the query is not a vintage date but instead appears to be the observation period.
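The two readings of a realtime window can be contrasted with a small sketch. It makes no claim about how FRED or fredapi actually interprets `realtime_start` and `realtime_end`; the helper names `released_between` and `value_as_of` are hypothetical, and the data are the three vintages listed above.

```python
from datetime import date

# (vintage_date, value) pairs for observation date 2013-10-01, as listed above.
vintages = [
    (date(2014, 1, 30), 17102.5),
    (date(2014, 2, 28), 17080.7),
    (date(2014, 3, 27), 17089.6),
]


def released_between(vintages, start, end):
    """Reading A: every vintage released inside [start, end] (hypothetical helper)."""
    return [(d, v) for d, v in vintages if start <= d <= end]


def value_as_of(vintages, as_of):
    """Reading B: the single most recent vintage released on or before as_of."""
    known = [(d, v) for d, v in vintages if d <= as_of]
    return max(known, key=lambda pair: pair[0]) if known else None


in_window = released_between(vintages, date(2014, 1, 28), date(2014, 3, 30))
as_of_end = value_as_of(vintages, date(2014, 3, 30))

print(len(in_window))  # number of vintages under reading A
print(as_of_end)       # the one vintage under reading B
```

Under reading A the window contains all three vintages, which is what I expected. Under reading B only the 2014-03-27 value of 17089.6 remains, which matches the single value the query returned.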

Is the result of this query correct? Based on the FRED documentation, the result of the query above demonstrates what I believe is a defect in the FRED API (not a defect in fredapi). I believe FRED has commingled the concepts of realtime periods and vintage dates. As a result, queries such as the one above return meaningless or inconsistent results. I posted a fairly in-depth analysis of my findings and will provide a link if requested.
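To make the distinction between vintage dates and realtime periods concrete, here is a rough sketch that derives a period for each vintage. It assumes a vintage stays in effect until the day before the next one is released, and that the last vintage in the list is open-ended. The `OPEN_END` sentinel and the helper `to_realtime_periods` are my own illustrative names, and only the three vintages quoted in this issue are considered, so the final period is open-ended only with respect to that list.

```python
from datetime import date, timedelta

# Assumption: a far-future sentinel marks a period that has not been superseded.
OPEN_END = date(9999, 12, 31)

# (vintage_date, value) pairs for observation date 2013-10-01, from this issue.
vintages = [
    (date(2014, 1, 30), 17102.5),
    (date(2014, 2, 28), 17080.7),
    (date(2014, 3, 27), 17089.6),
]


def to_realtime_periods(vintages):
    """Turn vintage dates into (realtime_start, realtime_end, value) periods.

    Hypothetical helper: each period ends the day before the next vintage.
    """
    ordered = sorted(vintages, key=lambda pair: pair[0])
    periods = []
    for i, (start, value) in enumerate(ordered):
        if i + 1 < len(ordered):
            end = ordered[i + 1][0] - timedelta(days=1)
        else:
            end = OPEN_END
        periods.append((start, end, value))
    return periods


periods = to_realtime_periods(vintages)
for start, end, value in periods:
    print(start, end, value)
```

In this model a vintage date is a single point in time (the start of a period), while a realtime period is an interval. A query window can therefore overlap several periods even when the caller only wants one value.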