eprbell / dali-rp2

DaLI (Data Loader Interface) is a data loader and input generator for RP2 (https://pypi.org/project/rp2), the privacy-focused, free, open-source cryptocurrency tax calculator: DaLI removes the need to manually prepare RP2 input files. Just like RP2, DaLI is also free, open-source and it prioritizes user privacy.
https://pypi.org/project/dali-rp2/
Apache License 2.0

Some issues with various exchange price apis (CCXT) #95

Open topcoderasdf opened 1 year ago

topcoderasdf commented 1 year ago

This has nothing to do with the DaLI code base itself, but rather with the external APIs that DaLI depends on. I have found two issues, one with the Binance API and one with the Kraken API, and I believe there could be more undiscovered ones.

The Binance price-fetching API doesn't restrict its dates: it returns wrong pricing data for dates prior to the exchange's existence (I believe Binance was launched in 2017), such as 2015, 2016, etc. So I think there needs to be a per-exchange date restriction filter to prevent such issues. You can run the test code below to see the issue.

from datetime import datetime
from dateutil import tz
from ccxt import Exchange, binance

_ONE_DAY: str = "1d"
_TEN_DAYS_MS: int = 864000000  # step between requests, in milliseconds

def test() -> None:
    # 2016-06-23 UTC, i.e. before Binance existed
    start_ms_timestamp: int = 1466649796000
    from_asset: str = "BTC"
    to_asset: str = "USDT"
    symbol: str = f"{from_asset}/{to_asset}"
    exchange: Exchange = binance()

    for i in range(10):
        ms_timestamp: int = start_ms_timestamp + _TEN_DAYS_MS * i
        timestamp: str = datetime.fromtimestamp(ms_timestamp / 1000.0, tz=tz.tzutc()).strftime("%Y-%m-%d %H:%M:%S.%f%z")
        print(timestamp)
        # Request a single daily candle starting at ms_timestamp
        print(exchange.fetch_ohlcv(symbol, _ONE_DAY, ms_timestamp, 1))

test()
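
A per-exchange date restriction filter could look roughly like the sketch below. The table `_EARLIEST_TRUSTED_DATE` and the function `is_timestamp_supported` are hypothetical names, not part of DaLI or CCXT, and the Binance cutoff is only an approximation of its 2017 launch that would need to be verified before use.

```python
from datetime import datetime, timezone
from typing import Dict, Optional

# Hypothetical table: earliest date for which each exchange's prices are trusted.
# The Binance value approximates its 2017 launch and is an assumption.
_EARLIEST_TRUSTED_DATE: Dict[str, datetime] = {
    "binance": datetime(2017, 7, 1, tzinfo=timezone.utc),
}


def is_timestamp_supported(exchange_id: str, ms_timestamp: int) -> bool:
    """Return False if ms_timestamp predates the exchange's earliest trusted date."""
    earliest: Optional[datetime] = _EARLIEST_TRUSTED_DATE.get(exchange_id)
    if earliest is None:
        # No restriction is known for this exchange, so let the request through.
        return True
    requested: datetime = datetime.fromtimestamp(ms_timestamp / 1000.0, tz=timezone.utc)
    return requested >= earliest
```

A caller would check `is_timestamp_supported("binance", ms_timestamp)` before calling `fetch_ohlcv` and skip to the next exchange when it returns False.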

The Kraken API returns wrong data for various currency pairs, including USDT/USD. I have iterated through various dates and the OHLCV values stay the same, including the trading volume.

from datetime import datetime
from dateutil import tz
from ccxt import Exchange, kraken

_ONE_DAY: str = "1d"
_TEN_DAYS_MS: int = 864000000  # step between requests, in milliseconds

def test() -> None:
    # 2019-08-24 UTC
    start_ms_timestamp: int = 1566649796000
    from_asset: str = "USDT"
    to_asset: str = "USD"
    symbol: str = f"{from_asset}/{to_asset}"
    exchange: Exchange = kraken()

    for i in range(10):
        ms_timestamp: int = start_ms_timestamp + _TEN_DAYS_MS * i
        timestamp: str = datetime.fromtimestamp(ms_timestamp / 1000.0, tz=tz.tzutc()).strftime("%Y-%m-%d %H:%M:%S.%f%z")
        print(timestamp)
        # Request a single daily candle starting at ms_timestamp
        print(exchange.fetch_ohlcv(symbol, _ONE_DAY, ms_timestamp, 1))

test()
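
One way to catch this kind of bad response automatically is to sanity-check what comes back. The sketch below assumes the CCXT candle layout `[timestamp, open, high, low, close, volume]`. The helper names `candle_matches_request` and `all_candles_identical` are hypothetical, and the one-timeframe tolerance is my own choice rather than anything an exchange specifies.

```python
from typing import List, Sequence

_MS_PER_DAY: int = 86_400_000


def candle_matches_request(
    candle: Sequence[float], requested_ms: int, timeframe_ms: int = _MS_PER_DAY
) -> bool:
    """Return True if the candle's open time is within one timeframe of the request."""
    candle_start_ms: int = int(candle[0])
    return abs(candle_start_ms - requested_ms) < timeframe_ms


def all_candles_identical(candles: List[Sequence[float]]) -> bool:
    """Return True if several requests all produced exactly the same candle."""
    if len(candles) < 2:
        return False
    first = list(candles[0])
    return all(list(candle) == first for candle in candles[1:])
```

If either check fails for a batch of requests spread over different dates, the resolver could treat that exchange's data as unusable for the pair and move on to another source.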

The current transaction resolver relies on Kraken to fetch the USDT/USD pair, so we probably need to look for an alternative. For the USDT/USD pair, I think we might need to consider using the CoinGecko API. I think CoinGecko integration into CCXT is still pending, and it seems like it will take quite a long time for it to actually get integrated.

Thus I think this project should add an independent CoinGecko integration alongside CCXT.
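
As a rough idea of what a standalone CoinGecko lookup might involve, here is a minimal sketch built around CoinGecko's public `/coins/{id}/history` endpoint, which takes a `date` in `dd-mm-yyyy` form. The helper names are hypothetical, `"tether"` is assumed to be the CoinGecko coin id for USDT, and rate limits or how far back the free tier goes would need to be checked. The sketch only builds the URL and parses a response, it does not perform the HTTP request.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional
from urllib.parse import urlencode

_COINGECKO_BASE_URL: str = "https://api.coingecko.com/api/v3"


def build_history_url(coin_id: str, ms_timestamp: int) -> str:
    """Build the history URL for the UTC day containing ms_timestamp."""
    day: str = datetime.fromtimestamp(ms_timestamp / 1000.0, tz=timezone.utc).strftime("%d-%m-%Y")
    query: str = urlencode({"date": day, "localization": "false"})
    return f"{_COINGECKO_BASE_URL}/coins/{coin_id}/history?{query}"


def extract_price(payload: Dict[str, Any], quote: str = "usd") -> Optional[float]:
    """Pull the quote-currency price out of a history response, or None if absent."""
    price = payload.get("market_data", {}).get("current_price", {}).get(quote)
    return float(price) if price is not None else None
```

For example, `build_history_url("tether", 1566649796000)` points at the 24-08-2019 snapshot, and `extract_price` returns None when CoinGecko has no market data for that day.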

CoinGecko Integration

macanudo527 commented 1 year ago

I think the CoinGecko API can be integrated fairly easily as a backup. The only issue I can see is that CoinGecko doesn't actually publish how it arrives at its pricing, right? So it is not as verifiable in an audit. However, as long as users know that, it shouldn't be an issue, especially if it is used only for limited pairs like USDT/USD.
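
To keep the backup auditable, the resolver could record which source each price came from. The following is only a sketch of that idea: `PricedQuote`, `PriceFetcher` and `resolve_price` are hypothetical names, and each fetcher is assumed to return None when it has no trustworthy price for the requested timestamp.

```python
from typing import Callable, List, NamedTuple, Optional, Tuple


class PricedQuote(NamedTuple):
    price: float
    source: str  # name of the source that supplied the price, kept for audits


# A fetcher takes (symbol, ms_timestamp) and returns a price or None.
PriceFetcher = Callable[[str, int], Optional[float]]


def resolve_price(
    symbol: str, ms_timestamp: int, fetchers: List[Tuple[str, PriceFetcher]]
) -> Optional[PricedQuote]:
    """Try each (source, fetcher) pair in order and return the first price found."""
    for source, fetch in fetchers:
        price: Optional[float] = fetch(symbol, ms_timestamp)
        if price is not None:
            return PricedQuote(price=price, source=source)
    return None
```

With the exchanges listed first and the backup last, a backup price is only used when every exchange comes up empty, and the `source` field lets users see exactly which prices were not exchange-derived.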

topcoderasdf commented 1 year ago

Yes, while we don't know how CoinGecko comes up with its normalized price data, I think it is actually quite necessary, especially to resolve transactions prior to 2017. CoinGecko is the only free API I am aware of that can provide accurate data prior to 2017. Binance currently provides wrong data prior to its launch date, while Kraken's data is unreliable. I also tried Coinbase, GateIO, Kucoin, etc., and none of them provide data for older dates.