tidyverts / tsibble

Tidy Temporal Data Frames and Tools
https://tsibble.tidyverts.org
GNU General Public License v3.0

dplyr-verbs & Building a superclass of tsibble #252

Closed by yogat3ch 3 years ago

yogat3ch commented 3 years ago

Hi @earowang and the {tsibble} crew!

{tsibble} is deeply integrated into the (nearly done) revision of {AlpacaforR}, and I'm grateful to you all for developing a package that perfectly suits the idiosyncrasies of time series data! My intention is to develop a superclass of the {tsibble} class that carries two new attributes: symbol, a character vector with a symbol (CUSIP) label, and query, which contains API query metadata. At present, these additional attributes are stripped when using dplyr verbs, while the tsibble attributes are retained. To retain the new attributes, I'll need to replicate much of the functionality found in tsibble's dplyr-verbs. Which of these approaches would you all be most amenable to?

  1. exporting the dplyr-verbs such that I can include them as dependencies?
  2. copy-pasting the existing code and modifying?

Thanks in advance for your advice!

mitchelloharawild commented 3 years ago

You may like to look at the {fabletools} package. The <fable> class sounds similar to what you are trying to achieve, where it holds onto some new attributes for identifying the distribution column.

The starting point would be to integrate your new dataset with {vctrs}, which has been done for <fable> here:
https://github.com/tidyverts/fabletools/blob/master/R/vctrs-fable.R

You may also need to integrate with some {dplyr} verbs, depending on how you want them to behave with your superclass:
https://github.com/tidyverts/fabletools/blob/master/R/dplyr-fable.R
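As a rough illustration of the {dplyr} side of this suggestion, the sketch below wraps a tsibble in a subclass and re-attaches the extra attributes after each of dplyr's extension generics has run. It is not taken from {fabletools} or {AlpacaforR}: the class name `alpaca_ts`, the helpers `new_alpaca_ts()` and `restore_alpaca()`, and the sample data are all hypothetical, and the {vctrs} methods (`vec_ptype2()`, `vec_cast()`) are not covered. Which verbs actually route through these generics depends on the installed dplyr and tsibble versions, so treat it as a starting point under those assumptions.

```r
library(dplyr)
library(tsibble)

# Hypothetical constructor: wraps an existing tsibble and attaches the two
# extra attributes discussed in this thread (symbol and query).
new_alpaca_ts <- function(x, symbol = character(), query = list()) {
  tsibble::new_tsibble(x, symbol = symbol, query = query, class = "alpaca_ts")
}

# Hypothetical helper: copy the extra attributes from `template` back onto
# `res` once the tsibble (or data frame) method has produced its result.
restore_alpaca <- function(res, template) {
  if (!tsibble::is_tsibble(res)) {
    return(res) # the result is no longer a tsibble, so leave it alone
  }
  new_alpaca_ts(
    res,
    symbol = attr(template, "symbol"),
    query = attr(template, "query")
  )
}

# Methods for dplyr's three extension generics. Each one defers to the next
# method in the class chain, then restores the extra attributes.
dplyr_row_slice.alpaca_ts <- function(data, i, ...) {
  res <- NextMethod()
  restore_alpaca(res, data)
}

dplyr_col_modify.alpaca_ts <- function(data, cols) {
  res <- NextMethod()
  restore_alpaca(res, data)
}

dplyr_reconstruct.alpaca_ts <- function(data, template) {
  res <- NextMethod()
  restore_alpaca(res, template)
}

# Illustrative data only; the values are made up.
prices <- tsibble(
  date = as.Date("2021-01-04") + 0:2,
  close = c(129.4, 131.0, 126.6),
  index = date
)
x <- new_alpaca_ts(prices, symbol = "AAPL", query = list(timeframe = "1Day"))

# Calling the generic directly exercises the method defined above.
sliced <- dplyr::dplyr_row_slice(x, 1:2)
attr(sliced, "symbol")
```

Inside a package, these methods would be registered as S3 methods (for example with roxygen's `@export`) rather than defined at the top level of a script. Verbs for which tsibble defines its own methods may still need dedicated `alpaca_ts` methods, which is the role the fabletools file linked above plays for <fable>.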

yogat3ch commented 3 years ago

Thanks for the guidance @mitchelloharawild! fable looks like it has functionality for holding onto attributes related to model building and forecasting, which I can see being useful for users of AlpacaforR. However, I'm quite far up the tsibble tree with the package at present, the only attributes I'm adding are informative rather than functional, and pivoting to fable feels daunting. The starting point and process described above seem like they could be followed just the same using the tsibble vctrs methods and dplyr verbs, so I suppose I'll start there. Does that sound appropriate, or do you think there's a strong case for revising the package to use fable instead?

mitchelloharawild commented 3 years ago

I'm not suggesting that you use fable for your work, rather it is a good example of how you could superclass tsibble. The code above is how I've added and held onto additional attributes with a tsibble, and I think some similar code would work well for your new data superclass.

yogat3ch commented 3 years ago

> I'm not suggesting that you use fable for your work, rather it is a good example of how you could superclass tsibble.

Ah! Ok, thank you for the clarification.

> The code above is how I've added and held onto additional attributes with a tsibble, and I think some similar code would work well for your new data superclass.

I've begun the process; we'll see how it goes! Thank you for the helpful suggestions!

yogat3ch commented 3 years ago

@mitchelloharawild Using the code in fabletools as a template made the process of adding dplyr functionality so much easier! Much appreciation for the suggestion!