danpaquin / coinbasepro-python

The unofficial Python client for the Coinbase Pro API
MIT License

Coinbase Pro Candles / Historical Data Endpoint Unreliable #338

Closed luyongxu closed 3 years ago

luyongxu commented 5 years ago

I am looking to download historical data using Coinbase Pro's REST API candles endpoint:

GET /products/<product-id>/candles

I have found that the response from this endpoint is unreliable. Sometimes the response contains data and sometimes the response does not contain data.

Steps to reproduce:

  1. Copy the following endpoint:

https://api.pro.coinbase.com/products/BTC-USD/candles?start=2014-07-20T00%3A00%3A00.0Z&end=2015-02-05T00%3A00%3A00.0Z&granularity=86400

  2. Open in browser of choice

  3. Refresh several times

  4. Observe that roughly 20 percent of the time, the response contains no data.
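
The manual steps above can also be scripted, which makes the failure rate easier to measure. This is a minimal sketch, assuming the endpoint returns a JSON array of candles and that "no data" shows up as an empty array. `fetch_candles` and `count_empty_responses` are hypothetical helper names, not part of coinbasepro-python.

```python
import json
import time
import urllib.request

# Same URL as in step 1 of the reproduction.
URL = ("https://api.pro.coinbase.com/products/BTC-USD/candles"
       "?start=2014-07-20T00%3A00%3A00.0Z&end=2015-02-05T00%3A00%3A00.0Z"
       "&granularity=86400")


def fetch_candles(url=URL):
    """Fetch the URL and return the decoded JSON body (hypothetical helper)."""
    request = urllib.request.Request(url, headers={"User-Agent": "candles-repro"})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def count_empty_responses(fetch, attempts=20, delay=0.0, sleep=time.sleep):
    """Call fetch() `attempts` times and count how many results are empty."""
    empty = 0
    for i in range(attempts):
        if not fetch():
            empty += 1
        if i < attempts - 1:
            sleep(delay)  # pause between requests, not after the last one
    return empty
```

Calling `count_empty_responses(fetch_candles, attempts=20, delay=2.0)` would then report how many of 20 spaced-out requests came back empty. Note that an HTTP error status raises `urllib.error.HTTPError` here instead of being counted as an empty response.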

Can you explain what is happening?

yiwensong commented 5 years ago

This isn't really a client library issue; it's more of a Coinbase API issue. Coinbase support will probably have a better answer than the random people contributing to this Python client library.

TheKeyboardKowboy commented 5 years ago

The Coinbase API does state that it cannot always guarantee historical data:

https://docs.pro.coinbase.com/#get-historic-rates

Historical rate data may be incomplete. No data is published for intervals where there are no ticks.

yiwensong commented 5 years ago

@TheKeyboardKowboy I think the issue here is slightly different from the one the docs describe. This issue is about querying the endpoint with the same parameters and only sometimes getting data.

noah222 commented 5 years ago

Your connection will be throttled if you refresh more than a few times per second. I find 1.5 seconds is a good wait time between requests to avoid a temporary ban. It is advised not to poll this endpoint too often. When I was only waiting 1 second between requests and trying to grab a year of data, it would fail often. I typically grab the data once, store it in a text file and in memory, and then only update it with new data from the feed.
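
The wait-between-requests approach can be sketched as a small download loop. This is an illustration under stated assumptions, not the code I actually run: `chunk_ranges` and `download_history` are hypothetical names, `fetch` is any callable you supply that returns a list of candles for one time window, and the 300-candle cap per request is an assumption to check against the API docs.

```python
import time
from datetime import datetime, timedelta, timezone

MAX_CANDLES = 300  # assumed per-request cap; verify against the API docs


def chunk_ranges(start, end, granularity, max_candles=MAX_CANDLES):
    """Yield (chunk_start, chunk_end) pairs that together cover [start, end)."""
    step = timedelta(seconds=granularity * max_candles)
    cursor = start
    while cursor < end:
        chunk_end = min(cursor + step, end)
        yield cursor, chunk_end
        cursor = chunk_end


def download_history(fetch, start, end, granularity, wait=1.5, sleep=time.sleep):
    """Fetch every chunk in order, pausing `wait` seconds between requests."""
    candles = []
    for i, (chunk_start, chunk_end) in enumerate(chunk_ranges(start, end, granularity)):
        if i > 0:
            sleep(wait)  # throttle: 1.5 s between requests by default
        candles.extend(fetch(chunk_start, chunk_end, granularity))
    return candles
```

With this library, `fetch` could plausibly wrap `PublicClient.get_product_historic_rates`, passing the chunk boundaries as ISO 8601 strings. Saving the returned list to a file afterwards avoids having to repeat the download.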

noah222 commented 5 years ago

The Coinbase API does state that it cannot always guarantee historical data:

https://docs.pro.coinbase.com/#get-historic-rates

Historical rate data may be incomplete. No data is published for intervals where there are no ticks.

All this means is that there are some spots in the data with no candle for that moment in time. By scanning through the dataset you can find 'skips', places where the timestamp jumps ahead by more than one interval. For example, if you are grabbing 1-minute data, a skip is anywhere the timestamp jumps forward by more than 60 seconds. It means no trades happened in that interval and the price stayed the same. You can easily find these spots and 'interpolate', which here basically means copying data from the last candle and inserting a data point, so you end up with a complete dataset with no gaps. I find this is important if you are doing any kind of moving average analysis, since gaps in the data will skew the average a little bit.

Here is some sample code attached that loads history with very reliable output, dumped to a text file. Load_History_Coinbase.txt
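
Separately from the attached file, the gap-filling idea described above can be sketched in a few lines. This assumes each candle is a list of `[time, low, high, open, close, volume]` with `time` in epoch seconds. `fill_gaps` is a hypothetical name, and using the previous close for every price with zero volume is one possible reading of "copy data from the last candle".

```python
def fill_gaps(candles, granularity):
    """Return candles sorted by time, with flat candles inserted for skips.

    Each candle is assumed to be [time, low, high, open, close, volume],
    with time in epoch seconds and granularity in seconds.
    """
    ordered = sorted(candles, key=lambda c: c[0])  # oldest first
    filled = []
    for candle in ordered:
        if filled:
            prev_time, prev_close = filled[-1][0], filled[-1][4]
            missing = prev_time + granularity
            # Insert one flat candle per missing interval before this candle.
            while missing < candle[0]:
                filled.append([missing, prev_close, prev_close,
                               prev_close, prev_close, 0.0])
                missing += granularity
        filled.append(list(candle))
    return filled
```

For 1-minute data, `fill_gaps(candles, 60)` returns a list in which consecutive timestamps are never more than 60 seconds apart, provided the input timestamps are aligned to the minute.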

mcardillo55 commented 3 years ago

There have been some good responses in here. I did notice a couple of years ago that the historical data didn't seem accurate. However, my issue was never that "the response contains no data". As @noah222 said, that sounds like rate limiting.

Beyond that, as @yiwensong said, the integrity of the data is more of a Coinbase issue, as our library is just wrapping the call. Closing for now.