tardis-dev / tardis-python

Python client for tardis.dev - historical tick-level cryptocurrency market data replay API.
https://tardis.dev
Mozilla Public License 2.0

Frequent ConnectionResetErrors and EOFErrors #7

Closed shahp98 closed 2 years ago

shahp98 commented 3 years ago

Hi. I'm trying to replay the Deribit "trades" and "book" channels for all symbols active on '2021-05-31'. I hit a ConnectionResetError or an EOFError every time I run the replay and write the messages to a file. These errors only appear when I replay all symbols. When replaying only ETH-PERPETUAL and BTC-PERPETUAL, I haven't faced this issue.

Here is the stack trace for EOFError: EOFError

Here is the stack trace for ConnectionResetError:

  1. aiohttp - ConnectionReset - 1
  2. aiohttp - ConnectionReset - 2

python==3.8.3 tardis-client==1.2.12

I'm assuming the EOFError comes when the caching is not done properly. Not sure about the ConnectionResetError. Can you suggest ways in which these can be avoided?

thaaddeus commented 3 years ago

Hi, I'm not sure. Is there a chance you could share a code snippet that causes this issue? Then I'd be able to investigate. Thanks!

shahp98 commented 3 years ago

Sorry for the delayed response. Here's a snippet of the code:

import json

from tardis_client import TardisClient, Channel


async def replay(from_date: str, to_date: str):
    tardis_client = TardisClient(api_key="API KEY HERE", cache_dir="/data/backups/tardis-cache-dir", http_timeout=360)
    # Empty symbols lists: replay every symbol on each channel
    messages = tardis_client.replay(
        exchange="deribit",
        from_date=from_date,
        to_date=to_date,
        filters=[
            Channel(name="trades", symbols=[]),
            Channel(name="book", symbols=[]),
        ],
    )
    # Some file operations happen here (the output file f is opened in this omitted part)
    async for local_timestamp, message in messages:
        # Synchronous write: runs on the event loop thread for every message
        f.write(f"{int(local_timestamp.timestamp() * 10**9)}," + json.dumps(message).replace(' ', '') + '\n')
    f.close()

thaaddeus commented 3 years ago

Can you try replacing the file operations with async versions, for example https://github.com/Tinche/aiofiles? Writing to files synchronously may be blocking Python's event loop, which is used under the hood to fetch the data for the replay. While the loop is busy with file writes, data from the network is not read, and that can cause the connection to be reset. Please also clear the local .tardis-cache, just in case.
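
A minimal sketch of the idea, using only the standard library instead of aiofiles: lines are buffered and each batch is handed to a worker thread with `loop.run_in_executor`, so the event loop stays free to read from the network. The names `fake_messages`, `format_line`, `write_replay`, and `batch_size` are hypothetical; `fake_messages` stands in for the iterator returned by `tardis_client.replay(...)`. This has not been verified against the failing replay, so treat it as a starting point rather than a confirmed fix.

```python
import asyncio
import json
from datetime import datetime, timezone


async def fake_messages(count):
    """Hypothetical stand-in for tardis_client.replay(...): yields (local_timestamp, message)."""
    for i in range(count):
        yield datetime(2021, 5, 31, tzinfo=timezone.utc), {"channel": "trades", "seq": i}
        await asyncio.sleep(0)  # give control back to the event loop between messages


def format_line(local_timestamp, message):
    # Nanosecond timestamp, then compact JSON. Compact separators are used here
    # so that spaces inside string values are left untouched.
    ns = int(local_timestamp.timestamp() * 10**9)
    return f"{ns}," + json.dumps(message, separators=(",", ":")) + "\n"


async def write_replay(messages, path, batch_size=1000):
    """Buffer lines and write each batch in a worker thread (assumed batch size)."""
    loop = asyncio.get_running_loop()
    written = 0
    with open(path, "w") as f:
        batch = []
        async for local_timestamp, message in messages:
            batch.append(format_line(local_timestamp, message))
            if len(batch) >= batch_size:
                # The blocking write runs in the default thread pool executor.
                await loop.run_in_executor(None, f.writelines, batch)
                written += len(batch)
                batch = []
        if batch:
            # Flush whatever is left after the stream ends.
            await loop.run_in_executor(None, f.writelines, batch)
            written += len(batch)
    return written
```

With the real client, `fake_messages(...)` would be replaced by the `messages` iterator from the snippet above, for example `asyncio.run(write_replay(messages, "deribit.csv"))`, where the output path is a placeholder.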