jonathancornelissen / highfrequency

The highfrequency package contains an extensive toolkit for working with high-frequency financial data in R. It contains functionality to manage, clean and match high-frequency trades and quotes data. Furthermore, it enables users to easily calculate various liquidity measures, estimate and forecast volatility, and investigate microstructure noise and intraday periodicity.

POSIXct is very slow #65

Closed waynelapierre closed 3 years ago

waynelapierre commented 3 years ago

POSIXct is very slow. I strongly recommend that this package support nanotime in addition to POSIXct. https://cran.r-project.org/web/packages/nanotime/

emilsjoerup commented 3 years ago

Which function are you using in which as.POSIXct causes such a big bottleneck? Can you provide some code that suffers from such a slowdown?

I would expect the vast majority of time to be spent elsewhere in the code for most of our functions. Also, it seems fastPOSIXct and nanotime are faster than as.POSIXct with character input; is this also the case for numeric input (which is almost exclusively what we use)?

waynelapierre commented 3 years ago

My understanding is that when I supply a data.table to the functions in your package, it needs to contain a timestamp column in POSIXct format. I am converting a very long character column to POSIXct. Thanks for pointing out the fastPOSIXct function; I will try it.

emilsjoerup commented 3 years ago

If you have a character column for your timestamps, definitely go for fastPOSIXct; it's incredibly fast. Unfortunately, we can't use it in the package because we have to accommodate numeric input too, which as far as I know is not supported by fastPOSIXct.
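
For readers landing on this thread, the two input paths discussed above can be sketched roughly as follows. This is a minimal illustration, not code from the package: the vector names and sample timestamps are made up, UTC is assumed throughout, and the fasttime call is guarded because that package may not be installed.

```r
# Hypothetical sample data: timestamps stored as character strings (UTC assumed).
ts_chr <- c("2020-01-02 09:30:00.125", "2020-01-02 09:30:01.500")

# Character input: every string has to be parsed, which is the slow path.
# Supplying an explicit format spares as.POSIXct from guessing one.
ts_from_chr <- as.POSIXct(ts_chr, format = "%Y-%m-%d %H:%M:%OS", tz = "UTC")

# Numeric input (seconds since 1970-01-01 UTC): no string parsing is involved.
ts_num <- c(1577957400.125, 1577957401.5)
ts_from_num <- as.POSIXct(ts_num, origin = "1970-01-01", tz = "UTC")

# Optional: fasttime::fastPOSIXct for character input. It always treats the
# input as GMT/UTC; the tz argument only sets the time zone of the result.
if (requireNamespace("fasttime", quietly = TRUE)) {
  ts_fast <- fasttime::fastPOSIXct(ts_chr, tz = "UTC")
}
```

Converting the character column once, up front, and then passing the resulting POSIXct column to the package functions keeps the parsing cost out of any repeated calls.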