Open c0indev3l opened 1 year ago
Nice. Right now it's plain TSV files. I was thinking of indexing the files so that cached data can be reused when it overlaps the requested time window, but I'd gladly use the correct tools if they already exist.
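To make the caching idea concrete, here is a minimal sketch of the overlap check, assuming the index is just a list of half-open `(start, end)` time ranges already on disk. The name `missing_ranges` and the index layout are hypothetical, not something the tool currently has. It returns only the parts of the requested window that still need to be downloaded.

```python
from datetime import datetime
from typing import List, Tuple

# Half-open interval [start, end); this representation is an assumption.
Range = Tuple[datetime, datetime]


def missing_ranges(cached: List[Range], start: datetime, end: datetime) -> List[Range]:
    """Return the parts of [start, end) not covered by any cached range."""
    gaps: List[Range] = []
    cursor = start  # everything before the cursor is already accounted for
    for c_start, c_end in sorted(cached):
        if c_end <= cursor:
            continue  # cached range lies entirely before the cursor
        if c_start >= end:
            break  # cached range lies entirely after the requested window
        if c_start > cursor:
            gaps.append((cursor, c_start))  # uncovered stretch before this range
        cursor = max(cursor, c_end)
        if cursor >= end:
            break
    if cursor < end:
        gaps.append((cursor, end))  # uncovered tail of the window
    return gaps


if __name__ == "__main__":
    cached = [
        (datetime(2023, 1, 1), datetime(2023, 1, 10)),
        (datetime(2023, 1, 20), datetime(2023, 1, 25)),
    ]
    for gap in missing_ranges(cached, datetime(2023, 1, 5), datetime(2023, 1, 30)):
        print(gap)
```

Only the returned gaps would need a fresh download; everything else could be read from the existing files.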
Data pipelines / ETL (extract, transform, load) are probably the way to go. Unfortunately I'm not a DevOps person. Docker / docker-compose is probably also required (and I'm quite a beginner in that area too).
But the data source is already working. It's not optimized, but it is usable to some extent. There are many other aspects of the tool that need to be improved or fixed. Are you running/testing the tool?
I'm considering it, but I haven't used it yet.
Hello,
For storing historical data, you may be interested in using a time-series database.
Here is some code to download data from Binance
using https://pypi.org/project/binance-historical-data/
Store data into an InfluxDB database
using https://github.com/influxdata/influxdb-client-python
Retrieve data as Pandas DataFrame
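As a rough illustration of the store and retrieve steps (not the original snippets from this comment), something along these lines should work with `influxdb-client-python`. The measurement name `klines`, the `symbol` tag, and the helper names are assumptions made up for this sketch. The DataFrame is expected to have a `DatetimeIndex` and a `symbol` column, and the client imports are kept inside the functions so the query builder can be tried without the package installed.

```python
from datetime import datetime, timezone


def build_flux_query(bucket: str, measurement: str, symbol: str,
                     start: datetime, stop: datetime) -> str:
    """Build a Flux query that returns one row per timestamp."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # RFC 3339 timestamps, UTC assumed
    return (
        f'from(bucket: "{bucket}")\n'
        f'  |> range(start: {start.strftime(fmt)}, stop: {stop.strftime(fmt)})\n'
        f'  |> filter(fn: (r) => r._measurement == "{measurement}"'
        f' and r.symbol == "{symbol}")\n'
        '  |> pivot(rowKey: ["_time"], columnKey: ["_field"], valueColumn: "_value")'
    )


def write_klines(df, url: str, token: str, org: str, bucket: str,
                 measurement: str = "klines") -> None:
    """Write a DataFrame (DatetimeIndex, 'symbol' column) to InfluxDB."""
    from influxdb_client import InfluxDBClient
    from influxdb_client.client.write_api import SYNCHRONOUS

    with InfluxDBClient(url=url, token=token, org=org) as client:
        client.write_api(write_options=SYNCHRONOUS).write(
            bucket=bucket,
            record=df,
            data_frame_measurement_name=measurement,
            data_frame_tag_columns=["symbol"],  # hypothetical tag column
        )


def read_klines(url: str, token: str, org: str, bucket: str, symbol: str,
                start: datetime, stop: datetime, measurement: str = "klines"):
    """Query the stored data back as a pandas DataFrame."""
    from influxdb_client import InfluxDBClient

    query = build_flux_query(bucket, measurement, symbol, start, stop)
    with InfluxDBClient(url=url, token=token, org=org) as client:
        return client.query_api().query_data_frame(query)


if __name__ == "__main__":
    print(build_flux_query(
        "crypto", "klines", "BTCUSDT",
        datetime(2023, 1, 1, tzinfo=timezone.utc),
        datetime(2023, 2, 1, tzinfo=timezone.utc),
    ))
```

The `pivot` step is there so that each field (open, high, low, close, ...) comes back as its own column instead of one row per field.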
I'm still facing an issue: https://github.com/influxdata/influxdb-client-python/issues/592
Maybe another TSDB should be considered? TimescaleDB, for example.
Another approach could be to simply store the data as Parquet or Feather files (or another format) in a hierarchical directory.
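For the file-based approach, one possible layout is `root/symbol/interval/year/month.parquet`. The sketch below only builds the paths. The layout and the function names are assumptions for illustration, and the actual writing would be done separately, for example with pandas `DataFrame.to_parquet`.

```python
from datetime import datetime
from pathlib import Path
from typing import List


def partition_path(root: Path, symbol: str, interval: str, ts: datetime) -> Path:
    """Path of the monthly file that holds the candle at timestamp ts."""
    return root / symbol / interval / f"{ts.year:04d}" / f"{ts.month:02d}.parquet"


def paths_for_window(root: Path, symbol: str, interval: str,
                     start: datetime, end: datetime) -> List[Path]:
    """Monthly files touched by the window from start to end (both inclusive)."""
    paths: List[Path] = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        paths.append(partition_path(root, symbol, interval, datetime(year, month, 1)))
        month += 1
        if month > 12:  # roll over to January of the next year
            year, month = year + 1, 1
    return paths


if __name__ == "__main__":
    window = paths_for_window(Path("data"), "BTCUSDT", "1m",
                              datetime(2022, 11, 15), datetime(2023, 2, 3))
    for p in window:
        print(p)
```

With a layout like this, a request for a time window only has to open the few monthly files it touches, and a missing file tells you directly which month still needs to be downloaded.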
Kind regards