Open dougransom opened 2 years ago
It would be even better if I could download a pandas DataFrame for all symbols, as one file, via torrent or HTTP, with a separate file for each interval (15m, 30m, etc.).
Thank you for the suggestion, we will review it.
As I think about this, it could work really well with your existing scripts. Publish the torrent every quarter for each interval (i.e. 1m, 5m, etc.). To get the most current data, we would just have to run the existing scripts, which would bring our local copy up to date.
Since redirect downloading is good enough for now, we will not consider the torrent solution for the time being.
S3 is fast enough. All zip files are hosted in AWS S3, from where users can easily download them at a reasonable speed. They can either download specific files manually or write scripts to download them programmatically (we provide the example scripts here). If clients choose to download to their own S3 bucket, the speed is pretty good and can be better than torrent.
Downloading new files can also be convenient. For now, we have:
Thanks for your suggestion anyway.
The experience has been painful for me. I have been trying for days to download 15m kline data for all symbols, and I have a fast pipe. The smallest interruption anywhere across the internet means I have to run the scripts again, and they start from the beginning, of course, reporting many "file not found" or "file already exists" errors.
Also, Amazon S3 has some torrent features built in, so you would just have to set them up and document them.
https://docs.aws.amazon.com/AmazonS3/latest/userguide/uploading-downloading-objects.html
Then we just run the scripts as you suggest to update our local copy.
So we could prime our local copy with a torrent (which would at least make sure the whole tree downloads) and then fill in the gaps.
@dougransom One solution might be to turn the checksum download option on, add some code to the scripts to check for an existing file, if it exists, verify the checksum, and if it matches, skip downloading that file. There will be some delay to this verification, but in most cases the verification would be faster than the download (especially if you're able to store the data on an SSD).
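The check described above could be sketched roughly as follows. This is only an illustration, not code from the official scripts: the function names `sha256_of` and `should_download` are hypothetical, and it assumes the checksum file holds a SHA-256 hex digest as its first whitespace-separated field (for example `<digest>  <file name>`).

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_download(data_file: Path, checksum_file: Path) -> bool:
    """Return True unless data_file exists and matches its checksum file."""
    if not data_file.exists() or not checksum_file.exists():
        return True
    # Assumed format: "<hex digest>  <file name>"; only the first field is used.
    fields = checksum_file.read_text().split()
    if not fields:
        return True
    return sha256_of(data_file) != fields[0].lower()
```

A download loop would call `should_download(zip_path, checksum_path)` before fetching each file and skip the ones that return `False`, so an interrupted run resumes where it stopped instead of starting over.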
Hi guys. How can I download all files at once? For example, all CSV files of the klines section? Thanks.
It's recommended to download one file at a time; it should be easier.
When I use download-kline.py to download multiple kline files at the same time, I often encounter this error: urllib.error.URLError: <urlopen error [WinError 10054]. Is this error caused by the download frequency being too high? Do I need to add a sleep to the code, or are there parameters I should add to the command line?
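The thread does not confirm the cause of this error, but one common way to cope with dropped connections is to retry with an increasing delay. The sketch below is an assumption-laden illustration, not part of download-kline.py: `fetch_with_retry` is a hypothetical helper, and the `opener` and `sleep` parameters exist only so the function can be exercised without a network.

```python
import time
import urllib.error
import urllib.request


def fetch_with_retry(url, attempts=5, base_delay=1.0,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch url, retrying on connection errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as exc:
            # A 4xx response (e.g. 404 for a missing file) will not improve on retry.
            if exc.code < 500 or attempt == attempts - 1:
                raise
        except (urllib.error.URLError, ConnectionError):
            # Covers resets such as WinError 10054; give up after the last attempt.
            if attempt == attempts - 1:
                raise
        # Wait 1x, 2x, 4x, ... base_delay before the next attempt.
        sleep(base_delay * 2 ** attempt)
```

With the defaults, a file that keeps failing is attempted five times over roughly fifteen seconds before the error is raised to the caller.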
Requesting a torrent download, updated each month, for each combination of (spot, futures) and (agg, kline, trade), containing the data for all symbols for (all time, the previous month). Then users could quickly have a copy of the data for backtesting.