Open fipelle opened 3 years ago
Hi Filippo, thanks for the idea! I welcome contributions, so yes, please do open a pull request and I can work with you to see how we can make this functionality available.
So if I understand correctly, a user would need to provide their own unrevised data from an external source (i.e. their own spreadsheet)? Could this data instead be constructed from one of the FRED API endpoints? One of the open issues in FredData.jl is to provide support for other endpoints (#13).
There is also a longstanding open issue to provide support for what my colleague has called "pseudo-vintages" (#11) and for which there is a linked MATLAB implementation. How close is this to what you are thinking of?
Hi Micah,
So if I understand correctly, a user would need to provide their own unrevised data from an external source (i.e. their own spreadsheet)?
Yes, that's correct. Currently, the user must provide:
I reckon this might not be the best way to implement it within a registered package. Indeed:
I think it might be best to implement it in such a way that:
Could this data instead be constructed from one of the FRED API endpoints? One of the open issues in FredData.jl is to provide support for other endpoints (#13).
While it might work for data available on FRED, this might be limiting for users. For instance, quite a few interesting unrevised surveys / indices are not available on FRED.
There is also a longstanding open issue to provide support for what my colleague has called "pseudo-vintages" (#11) and for which there is a linked MATLAB implementation. How close is this to what you are thinking of?
It is not super far, although the code does not currently support it. At the moment the code creates two DataFrames (one from FRED and one from the external source described above), transforms the data when needed (for instance, to remove the effect of a change in the base year) and merges them with an outer join on the release dates.
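The merge step described above can be sketched as follows. This is a minimal illustration in plain Python, not the Julia code discussed in this thread. The helper `outer_join_on_release`, the column names, and all dates and values are made up for the example. Keying on the reference period as well as the release date is an assumption, and the rebasing transformation is left out.

```python
from datetime import date


def outer_join_on_release(left, right, left_name, right_name):
    """Outer-join two tables keyed on (release_date, reference_date).

    Each table maps (release_date, reference_date) -> value. Keys that
    appear on only one side are kept, with None in the other column.
    """
    keys = sorted(set(left) | set(right))
    return [
        {
            "release_date": release,
            "reference_date": reference,
            left_name: left.get((release, reference)),
            right_name: right.get((release, reference)),
        }
        for release, reference in keys
    ]


# Hypothetical revised series: the same reference period is released twice.
fred = {
    (date(2020, 1, 30), date(2019, 10, 1)): 2.1,
    (date(2020, 2, 27), date(2019, 10, 1)): 2.3,  # later revision
}

# Hypothetical unrevised survey: one release per reference period.
external = {
    (date(2020, 1, 3), date(2019, 12, 1)): 47.2,
    (date(2020, 2, 3), date(2020, 1, 1)): 50.9,
}

merged = outer_join_on_release(fred, external, "fred_value", "external_value")
for row in merged:
    print(row)
```

Because the join is an outer one, every release date from either source survives in the result, which is what lets a vintage be cut at any release date afterwards.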
In order to allow for the pseudo-vintages, I suspect we would need to update the release dates column for the FRED DataFrame, using some external calendar. This should involve an additional keyword argument in the relevant function.
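One possible reading of the keyword-argument idea is sketched below, again in plain Python rather than Julia. The function `with_release_dates`, its `release_calendar` argument, and the sample rows are hypothetical. The sketch assumes the external calendar is a simple mapping from reference period to release date.

```python
from datetime import date


def with_release_dates(rows, release_calendar=None):
    """Return copies of rows with release dates taken from a calendar.

    rows: list of dicts with "reference_date", "release_date", "value".
    release_calendar: optional mapping reference_date -> release date.
    If it is None, or has no entry for a reference date, the original
    release date is kept.
    """
    if release_calendar is None:
        return [dict(row) for row in rows]
    return [
        {
            **row,
            "release_date": release_calendar.get(
                row["reference_date"], row["release_date"]
            ),
        }
        for row in rows
    ]


# Hypothetical rows where every observation carries the download date.
rows = [
    {"reference_date": date(2020, 1, 1), "release_date": date(2021, 6, 1), "value": 1.0},
    {"reference_date": date(2020, 2, 1), "release_date": date(2021, 6, 1), "value": 1.2},
]

# Hypothetical external calendar covering only the first reference period.
calendar = {date(2020, 1, 1): date(2020, 2, 14)}

adjusted = with_release_dates(rows, release_calendar=calendar)
```

Leaving the argument at its default keeps the existing behaviour, so the calendar could be added later without breaking current callers.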
Ideally, I should be able to re-write what I have in the form of a small package in a few days and we can start from there. Given personal time constraints -- I am finishing my PhD thesis -- we could release a first version without the pseudo-vintages support soonish (in 1-2 weeks?) and work on the pseudo-vintages support at some point after the summer break.
I forgot to ask: which branch should I fork?
For development, please see a few notes here: https://micahjsmith.github.io/FredData.jl/dev/contributing/ Forking happens at the level of the entire repository; once you have created a fork, you can create a branch in your own copy of the repository with a short descriptive name.
Okay, I think I have a better understanding now of the scope of what you propose. Before or as you get started, though, could you share some sample real-time datasets, with the inputs and outputs, that you have created using this method? You can email me, attach files directly to an issue comment, or paste a subset of the rows into a code block in an issue comment.
I think the functionality of merging the FRED output with unrevised data and list of release dates sounds super useful. But I'm thinking that it might actually be too general-purpose of a routine for this package? The goal of FredData.jl is pretty narrowly to expose the functionality provided by the FRED API within Julia. So I'm thinking that what you propose may be best as (1) an example committed under /docs/src and shown in the FredData.jl documentation site or (2) a separate package. But perhaps I'd have a better understanding after seeing some sample inputs.
Thanks. Will do! However, I need to write a simplified version of what I currently have first. I am using it for a series of specialised projects and it might be confusing as it is. It shouldn't take long though - just a few days.
But I'm thinking that it might actually be too general-purpose of a routine for this package?
While I agree in principle, I am not entirely convinced. At the end of the day, if you are working with real-time economic data and Julia, there is a high chance that you will also be looking into the FredData.jl routines first. Having the option of including external unrevised data (e.g., PMIs, stock price indices) into a real-time dataset would certainly be handy for researchers.
However, if you feel strongly it should not be included in FredData.jl, maybe creating a separate package might be best. We could name it in a way that recalls FredData.jl and consider it part of the FRED Data environment.
I am sending you a JLD output with an array of data vintages and the release dates for each vintage via email. I have structured the data vintages as a DataFrame at the end, so that it should be easier to understand what's inside.
Hi,
I have written a small piece of code that generates multivariate real-time vintages by merging FredData.jl output with unrevised data (stored in an Excel file). I am unsure whether I should register a new package or open a pull request. Would you be open to the latter?