JuliaStats / TimeSeries.jl

Time series toolkit for Julia

Large performance improvement when parsing custom date formats #471

Closed jbaron closed 3 years ago

jbaron commented 3 years ago

By creating the DateFormat only once, parsing of custom date and datetime formats is sped up considerably (this is also in line with the Julia recommendations).

In my own tests, the time spent reading several CSV files (intraday stock quotes) went from over 400 seconds to just 8 seconds, a roughly 50x improvement.
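
For illustration, here is a minimal sketch of the pattern described above, using only the standard `Dates` library. It is not the actual diff of this PR; the format string, the sample timestamps, and the names `FMT` and `stamps` are assumptions made up for the example.

```julia
using Dates

# Hypothetical format and sample data, chosen only for this illustration.
const FMT = DateFormat("yyyy-mm-dd HH:MM:SS")
stamps = ["2020-01-02 09:30:00", "2020-01-02 09:31:00"]

# Slower: passing the format as a string means a DateFormat object
# is constructed again on every call.
slow = [DateTime(s, "yyyy-mm-dd HH:MM:SS") for s in stamps]

# Faster: the DateFormat object is built once and reused for every row.
fast = [DateTime(s, FMT) for s in stamps]
```

Both comprehensions produce the same values; the difference is only in how often the format is parsed. The `dateformat"..."` string macro is another way to build the format object a single time.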

jbaron commented 3 years ago

Fixed a compatibility issue with Julia 1.0.

iblislin commented 3 years ago

oh, by the way, is CSV.File suitable for your case? (https://juliastats.org/TimeSeries.jl/dev/tables/#Load-a-TimeArray-from-csv-file-via-CSV.jl-1)

jbaron commented 3 years ago

I guess I have to re-evaluate CSV.jl. When I tried it in the past, it was actually slower for my use case and pulled in the whole DataFrames package. I noticed, however, that the latest master branch no longer refers to DataFrames, and possibly some performance bottlenecks have been solved as well.

I will definitely check it out, since it is a flexible parser and could replace some of the custom code I currently use for parsing.