ethereum / web3.py

A python interface for interacting with the Ethereum blockchain and ecosystem.
http://web3py.readthedocs.io
MIT License

Why does event_filter.get_new_entries() miss pending transactions? #3365

Closed fridary closed 3 months ago

fridary commented 3 months ago

What happened?

I am testing websockets and get_new_entries(), and I am trying to understand how polling behaves.

Here is the code:

from web3 import Web3
import asyncio

web3 = Web3(Web3.HTTPProvider('http://127.0.0.1:8545/'))

async def log_loop(event_filter, poll_interval):
    while True:
        gne = event_filter.get_new_entries()
        for ev in gne:
            print(f"len(gne)={len(gne)}", Web3.to_json(ev))
        await asyncio.sleep(poll_interval)

if __name__ == "__main__":
    tx_filter = web3.eth.filter('pending')
    loop = asyncio.get_event_loop()
    try:
        poll_seconds = 0.1
        # poll_seconds = 3
        loop.run_until_complete(
            asyncio.gather(
                log_loop(tx_filter, poll_seconds)
            )
        )
    finally:
        loop.close()

So, let's gather pending transactions. If I set poll_seconds=0.1, I get output like this:

len(gne)=1 "0x87efc41298abe9c6d92158a2e4e1064e1d1a80b5da0e3ba0dd21093537026896"
len(gne)=3 "0x5503d9e538d55fe117cf6554bace84f9c6ce153fb8c56bf3433dcf0cf542fce4"
len(gne)=3 "0xaacd15b0dd76b2a763b27ae2fe9d68aa223749090bf67ec89fdaac80e3aa839c"
len(gne)=3 "0xee7b1665bfc3a94e4a2fb3caab79fe45b674b5b80ecab3f86e7b7358246b804d"
len(gne)=1 "0xe132db2aa50f815c673bb4d5d373bd6bbd5fa3847910465387fb3da4c0c8bd3e"
len(gne)=1 "0x69c17e6d0658f3598ab7786634bb736aa4576c6c459ab4cec6bc684400f7aebf"
len(gne)=2 "0x8c96c082a20506c836cc3cb127c6129ce272b4c3f72454c0a3b7cc4b9b9bbc91"

Entries arrive fairly quickly. If I understand correctly, len(gne) is the number of pending transactions that appeared since the previous request to the node. I never see this number go above 5.

Now let's increase the interval to poll_seconds=3. As expected, new lines appear only every 3 seconds, so the output is slow:

len(gne)=1 "0xe30c996fd3260da71b240706e28ada46941adb0b3e40d6871107290f6c305881"
len(gne)=3 "0x697f55f6325cb6275e3ba0f475cd18f1bc7d1a42c9cdab14cb3026fa04fc94f8"
len(gne)=3 "0x8b357168205096dec3dab30292c439a242228e0390783ddbfddd1033f88442be"
len(gne)=3 "0xc5732a84046c373c4c780886d95f912996f3c40be1134fa3dc0932e915deedfb"
len(gne)=2 "0x30606354ef3275ffd5377b16759e0d6a6fca300bddb6178d9fa812d80cea4f85"
len(gne)=2 "0x95292b8120a60ebd9b6505d51162e6add6c6177472952f47d6bac0f2d3457987"
len(gne)=1 "0x2bc87295750ede653c546f7b301fca0596a8c61e042d58893a0568450920da8e"
len(gne)=2 "0x7928ba27120ec07333e132c5cd5f7c0e29a6f1613775cd770cd7b1ef4b98371d"
len(gne)=2 "0xbdbcbbc945394e293de482028f9c176eb0b625b2a49289c3c068251633113b67"
len(gne)=3 "0x32e926084d6df534652cfbb397359af38217c77d4197265397af27fa60c6c62e"
len(gne)=3 "0x34b0c5a1d5442f34875113ec7cf21c724dd482ef36816dd6980287a2335b9a28"

We can see that len(gne) stays the same! The filter does not accumulate more transactions between polls. How is that possible? It means we are simply missing a lot of transactions. What if we also lose transactions with the 0.1 interval?
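A note on reading the output above: the print call sits inside the for loop, so len(gne) is printed once per entry. Three consecutive len(gne)=3 lines are therefore one poll that returned three hashes, not three separate polls. The following is a minimal sketch, assuming a filter object that exposes get_new_entries() as in the code above, that records the batch size once per poll instead. The names drain_batches and FakeFilter are hypothetical and exist only for this illustration; FakeFilter stands in for a real web3 filter so the example runs without a node.

```python
import time


def drain_batches(event_filter, polls, poll_interval, sleep=time.sleep):
    """Poll `event_filter` a fixed number of times and record each batch.

    Returns a list of (batch_size, entries) tuples, one per poll, so the
    batch size is reported once per poll rather than once per entry.
    """
    batches = []
    for _ in range(polls):
        entries = list(event_filter.get_new_entries())
        batches.append((len(entries), entries))
        sleep(poll_interval)
    return batches


class FakeFilter:
    """Hypothetical stand-in for a web3 filter, used only for this demo."""

    def __init__(self, queued):
        self._queued = list(queued)

    def get_new_entries(self):
        # Hand back everything queued since the previous call, then reset.
        entries, self._queued = self._queued, []
        return entries


# Demo: three hashes are waiting at the first poll, none at the second.
fake = FakeFilter(["0xaa", "0xbb", "0xcc"])
batches = drain_batches(fake, polls=2, poll_interval=0, sleep=lambda _s: None)
for size, entries in batches:
    print(f"batch of {size}: {entries}")
```

Logging per batch this way makes it easier to compare how many entries each poll returns at different intervals.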

Let's run an experiment. Here is the Uniswap v2 Router: https://etherscan.io/address/0x7a250d5630B4cF539739dF2C5dAcb4c659F2488D It receives a new transaction roughly every 30 seconds. I rewrote the log_loop() function to catch Uniswap transactions and set poll_interval=0.1:

async def log_loop(event_filter, poll_interval):
    while True:
        gne = event_filter.get_new_entries()
        for ev in gne:
            tx = web3.eth.get_transaction(Web3.to_json(ev).replace('"',''))
            print(f"len(gne)={len(gne)}", tx['to'])
            UNISWAPV2ROUTER = "0x7a250d5630B4cF539739dF2C5dAcb4c659F2488D"
            if web3.to_checksum_address(tx['to']) in [UNISWAPV2ROUTER]:
                exit('Found')
        await asyncio.sleep(poll_interval)

Result: I waited at least 2 minutes to catch the first Uniswap transaction, although I saw several on Etherscan during that time. That means I miss transactions even with short polling. Why does this happen, and how can I fix it?
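One detail in the check above may affect the experiment independently of the filter: contract-creation transactions have their `to` field set to None, and passing None to to_checksum_address is likely to raise instead of simply not matching. The following is a minimal sketch of a more defensive comparison, assuming the transaction is a dict-like object with a "to" key. The helper name targets_router is hypothetical and not part of web3.py.

```python
UNISWAP_V2_ROUTER = "0x7a250d5630B4cF539739dF2C5dAcb4c659F2488D"


def targets_router(tx, router=UNISWAP_V2_ROUTER):
    """Return True if the transaction's `to` field is the router address.

    `tx` is assumed to be dict-like with a "to" key. Contract-creation
    transactions carry `to` = None, so that case is handled first.
    """
    to = tx.get("to") if tx is not None else None
    if to is None:
        return False
    # Compare case-insensitively so checksummed and lowercase forms match.
    return str(to).lower() == router.lower()


# Demo with plain dicts standing in for transaction objects.
print(targets_router({"to": UNISWAP_V2_ROUTER.lower()}))
print(targets_router({"to": None}))
```

This only addresses the address comparison; it does not explain or change which hashes the node returns from the pending filter.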

I run a local Erigon node with a powerful CPU, plenty of RAM, a fast Internet connection, and a fast NVMe disk. The latest block on my node matches the latest block on Etherscan.

Code that produced the error

No response

Full error output

No response

Fill this section in if you know how this could or should be fixed

No response

web3 Version

6.16.0

Python Version

3.11.7

Operating System

Ubuntu 20.04.6 LTS

Output from pip freeze

aiohttp==3.9.3
aiosignal==1.3.1
alchemy-sdk-py==0.2.0
annotated-types==0.6.0
anyio==4.3.0
attributedict==0.3.0
attrs==23.2.0
bitarray==2.9.2
blessings==1.7
cachetools==5.3.3
certifi==2024.2.2
chardet==5.2.0
charset-normalizer==3.3.2
codecov==2.1.13
colorama==0.4.6
coloredlogs==15.0.1
colour-runner==0.1.1
coverage==7.4.4
cytoolz==0.12.3
deepdiff==6.7.1
dill==0.3.8
distlib==0.3.8
distro==1.9.0
eth-abi==4.2.1
eth-account==0.11.0
eth-hash==0.7.0
eth-keyfile==0.8.0
eth-keys==0.5.0
eth-rlp==1.0.1
eth-typing==4.0.0
eth-utils==4.0.0
filelock==3.13.3
frozendict==2.3.10
frozenlist==1.4.1
h11==0.14.0
hexbytes==0.3.1
httpcore==1.0.5
httpx==0.27.0
humanfriendly==10.0
idna==3.6
inspecta==0.1.3
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
lru-dict==1.2.0
markdown-it-py==3.0.0
mdurl==0.1.2
moralis==0.1.45
multidict==6.0.5
multiprocess==0.70.16
numpy==1.26.4
openai==1.23.4
ordered-set==4.1.0
packaging==24.0
pandas==2.2.1
parsimonious==0.9.0
platformdirs==4.2.0
pluggy==1.4.0
protobuf==5.26.1
psycopg2==2.9.9
py4j==0.10.9.7
pycryptodome==3.20.0
pydantic==2.7.1
pydantic_core==2.18.2
Pygments==2.17.2
pyproject-api==1.6.1
pyspark==3.5.1
python-dateutil==2.8.2
python-dotenv==1.0.1
pytz==2024.1
pyunormalize==15.1.0
referencing==0.34.0
regex==2023.12.25
requests==2.31.0
rich==13.7.1
rlp==4.0.0
rootpath==0.1.1
rpds-py==0.18.0
six==1.16.0
sniffio==1.3.1
tabulate==0.9.0
termcolor==2.4.0
toolz==0.12.1
tox==4.14.2
tqdm==4.66.2
typing_extensions==4.11.0
tzdata==2024.1
urllib3==1.26.18
virtualenv==20.25.1
web3==6.16.0
web3-input-decoder==0.1.11
websockets==12.0
yarl==1.9.4

fselmo commented 3 months ago

We prefer to use issues to track our work. If you have implementation or usage questions, please refer to our documentation and/or check out the Ethereum Python community on Discord.

If you find a bug somewhere, please be more specific and either ask to re-open this or raise a more targeted issue so we can track it. Best of luck.