Open patrick300100 opened 1 year ago
Hi @patrick300100,
First, thank you for digging into this issue and creating this very detailed write-up. Could you please respond with the exact function call (i.e., exact arguments) so I can reproduce it?
I'm guessing that the zero entries may be due to trading halts that have happened with Solana in the past, see this news article, for example. We'd have to go back and look if the dates align. During that period, there may be no price data because no trades were fulfilled.
To be honest with you, I'm not doing much work on this library these days anymore. This was one of the first Python libraries I wrote and there's a lot of room for improvement. If you feel like trying to fix it yourself and opening a PR, I'll take a look. Other than that, I probably can't give you an estimate on when this will be fixed.
Yeah sure! The function call that I used is get_historical_price_hour_from(coin='SOL', currency='EUR', fromTs=datetime(2010, 1, 1), toTs=datetime(2023, 7, 1))
.
I looked at the dates whether the Solana network outages described in your news article align with the zero entries. That's not the case. The Unix timestamp 1587780000 (one of the zero entries) corresponds to April 25th 2020. The news article you sent talks about outages at January 2022, December 2021, and September 2021.
It's also not only a thing that happens at Solana, but for multiple coins. Take for example ADA (Cardano) and look at entry 242:
Or Axie Infinity (AXS) at entries 415 and 416:
And for Render (RNDR) it's even messier:
It's not for all coins though! But definitely for some as you can see above. It's not only a Solana-thing.
It could be very much the case that something happened with each coin's network and hence having no price information at times, but actually that doesn't matter. The fact is that sometimes there is no price information and that causes the code to terminate the while-loop too early.
If you're now convinced that this is a bug, I'll open a PR soon enough!
I do wonder though, have you tried other exchanges and if yes, do the timestamps of zero entries align between them? Maybe the exchange halted trading even though the coin was still online.
I just tried it out and it indeed seems that it is exchange-related. I took AXS and at the CCCAGG exchange it just gives a price
But at the Coinbase exchange it doesn't:
Hm, interesting, it may not be a bug after all. Now I'm thinking it coincides with exchange downtime.
That could definitely be a thing, but even if there was a temporary exchange downtime, do you mean that if I try to get price history from January 1st to January 31st and there is exchange downtime on January 15, that its correct behaviour is to only get price history from January 15th to January 31st even though there is more price history from January 1st to January 14th?
*Just for the ease of explaining I took the downtime of a whole day
No, you're right that it should still return the whole time frame. I initially assumed that the fact that it includes a zero value by itself is already proof that there's a bug.
To sum up, I think the returned zero values are legit (for better or worse), but the library should definitely return the entire time frame like you described.
I do remember there was some weirdness with limit
though, so maybe that's another factor. It's been a while since I touched this library.
Yeah limit
also impacts this behaviour. Take for example again the example I described earlier of getting price history between January 1st and January 31st with downtime on January 15th.
If the limit for example is 6 days, it would retrieve data from January 26th until January 31st (26, 27, 28, 29, 30, 31), then it would again get 6 days, in which it would get data from January 20th until January 25th (20, 21, 22, 23, 24, 25). And then it would again get data for 6 days from January 14th until January 19th (14, 15, 16, 17, 18, 19). Here we would break out of the while loop because January 15th would have zero entries because of the downtime, so it would filter those out, your code detects that at if len(validHist) < len(p): break
and it would return an array with data from January 14th 'till 31st excluding the 15th.
If we put the limit to 10 days, we would first get data from January 22nd until January 31st. Then we would again get data for 10 days from January 12th until January 21st. Now we would break out of the while loop in this iteration, removing the zero entries that occurred at January 15th, and then return the full array with price history from January 12th until January 31st excluding the 15th.
I don't know whether that is what you meant, but limit
does also impact this behaviour. That's why I proposed a fix to not stop terminating when we remove at least 1 zero entry, but keep fetching until all the entries returned by the cryptocompare API are zero entries. I fixed this behaviour locally with just a single line-change from if len(validHist) < len(p): break
to if len(validHist) == 0: break
. Then we stop if there is no valid history anymore instead of stopping when we removed at least 1 zero entry.
If you don't want to have the library changed because you're not convinced of my issue that is of course also fine :). I understand it can be a bit challenging / hard to dive into your code / topics / concepts from years ago especially if you haven't touched the whole code base in a long time
No, I think you're absolutely correct on this. Actual unit tests would also help to catch issues like this, but like I said, this was my first library and hasn't been touched in a while. The code in question was also contributed by someone else, which makes it even more difficult to follow.
You're more than welcome to suggest the changes in a PR. I'd love to fix this but I'm not sure when I'll have time.
I think I found a bug that occurs in
get_historical_price_hour_from
(and I think inget_historical_price_day_from
as well, but I only usedget_historical_price_hour_from
so I will be talking about this function).So basically if I understand your code correctly, you keep fetching the price history until we are at
fromTs
, but basically if price history is not available (because we go beyond the time where price history is stored) it returns only zeroes:{'time': xxxxxxx, 'high': 0, 'low': 0, 'open': 0, 'volumefrom': 0, 'volumeto': 0, 'close': 0, 'conversionType': 'direct', 'conversionSymbol': ''}
and you remove those from the list invalidHist = [elem for elem in p if elem["time"] >= fromTs_i and elem["open"] != 0 and elem["close"] != 0]
. So for example I would get this response if I try to get the price history of Ethereum on January 1st 2000. I would get only zeroes because Ethereum didn't exist back then.And I think you also assumed that. Because a few lines later at
if len(validHist) < len(p): break
, you check if we removed some zero entries (andelem["time"] >= fromTs_i
but that is not important for this story) and if we did, you assume that we reached the end of valid price histories and we break out of the while loop. However, when fetching the price history of Solana (but it applies to other coins as well), I got this:For whatever reason, the price history sometimes has zero entries in it, even though we haven't reached the end yet! In my example, I was fetching from January 1st 2010 (just a lower bound to indicate that I want all the possible price history available) until July 1st 2023. But it prematurely terminated in this loop, even though there is more price history to fetch!
A possible fix could be to not terminate when we removed at least 1 entry, but to keep going until all 2000 (or to how much
limit
is set) entries are zero entries. In that case you are absolutely sure that you reached the very beginning of a coin's price history.