openethereum / parity-ethereum

The fast, light, and robust client for Ethereum-like networks.

eth_getLogs slow or even crash #11663

Open tonik-ru opened 4 years ago

tonik-ru commented 4 years ago

Actual behavior: this call takes ~70 seconds, and sometimes it crashes the parity process. In the console I see "killed".

Expected behavior: no crash, fast response.

Steps to reproduce:

curl -X POST --data '{"id":1,"jsonrpc":"2.0","method":"eth_getLogs","params":[{"fromBlock":"0x3e8","toBlock":"latest","topics":["0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef",null,["0x00000000000000000000000031a0a43978171be41ea7c5d60b0a3afb475fbb8a"]]}]}' -H "Content-Type: application/json" http://127.0.0.1:8545

adria0 commented 4 years ago

Hi @tonik-ru, the "killed" output in the console usually means the host OS ran out of memory and killed the process. Could you check that, please? On the other hand, the openethereum (formerly parity) database is very optimized, but queries of this kind, which scan all the available blocks, are sometimes heavy. Is it possible for you to split the query by paginating over the blocks? (You have fromBlock and toBlock for that.)
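The pagination suggested above could look roughly like the following. This is a minimal sketch, not part of the client: the helper names (`block_ranges`, `get_logs_payload`, `fetch_logs`, `post_json`) are hypothetical, and it assumes the caller has already resolved "latest" to a concrete block number (for example via `eth_blockNumber`), since a range cannot be split without a numeric upper bound.

```python
import json
from typing import Callable, Iterator, List, Tuple
from urllib import request


def block_ranges(start: int, end: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield inclusive (from_block, to_block) pairs covering start..end."""
    if step <= 0:
        raise ValueError("step must be positive")
    lo = start
    while lo <= end:
        hi = min(lo + step - 1, end)
        yield lo, hi
        lo = hi + 1


def get_logs_payload(from_block: int, to_block: int, topics: list,
                     request_id: int = 1) -> dict:
    """Build an eth_getLogs JSON-RPC request for one block range."""
    return {
        "id": request_id,
        "jsonrpc": "2.0",
        "method": "eth_getLogs",
        "params": [{
            "fromBlock": hex(from_block),  # e.g. 1000 -> "0x3e8"
            "toBlock": hex(to_block),
            "topics": topics,
        }],
    }


def post_json(url: str, payload: dict) -> dict:
    """Send one JSON-RPC request over HTTP and decode the reply."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())


def fetch_logs(url: str, start: int, end: int, step: int, topics: list,
               send: Callable[[str, dict], dict] = post_json) -> List:
    """Query start..end in chunks of `step` blocks and concatenate the logs."""
    logs: List = []
    for i, (lo, hi) in enumerate(block_ranges(start, end, step), start=1):
        reply = send(url, get_logs_payload(lo, hi, topics, request_id=i))
        if "error" in reply:
            raise RuntimeError(f"eth_getLogs failed for {lo}..{hi}: {reply['error']}")
        logs.extend(reply["result"])
    return logs
```

With the topics from the report and a known head block number `head`, a call such as `fetch_logs("http://127.0.0.1:8545", 0x3e8, head, 10_000, topics)` would issue one request per 10,000 blocks. This bounds the work and memory of each request; it does not reduce the total number of blocks scanned.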

tonik-ru commented 4 years ago

Of course I can work around it by splitting into small queries. But WHY? This is a simple query: "filter by topic" (removing the contract address doesn't change anything). The output is small, only a few records. The same call is instant on infura. Even if I split it into small parts (10K blocks), this doesn't solve the problem, it only works around it. And I still have to scan ALL blocks to get only a few records, instead of running a single instant query.

You can easily reproduce the call and check that this is not what we expect. Even 10K-block queries load the CPU heavily. I'm testing on a 32-core server, and this query is terribly slow.

adria0 commented 4 years ago

The reason the same call is instant on infura is, AFAIK, that they dump all the contents of the ethereum clients into Redis (or something similar). You are not querying an ethereum node that is very busy validating and propagating blocks; you are querying a Redis database ultra-optimized for queries like yours.

tonik-ru commented 4 years ago

Redis? I don't think so. So your "official" position is: we can run bloom queries filtered by topic ONLY on small block ranges, and anything larger causes a db scan. Right? There is nothing in the docs about this limitation.

adria0 commented 4 years ago

At this moment the query runs OK, but it takes some time: 45 seconds on my server (8 cores, 16 GB RAM). So it is possible to run it.

There's no "official position", @tonik-ru. openethereum is maintained by its core developers and is open to new devs and contributors, so there is no problem if you want to join the team and be one of them. Please feel free to optimize the database and the documentation as much as you want.

tonik-ru commented 4 years ago

45 seconds in production, assuming I need this to be done ~1000 times... this means it doesn't work :) OK, I got you. Thank you. I'll check whether geth gives me different results (just out of curiosity, because I'm bound to parity).

adria0 commented 4 years ago

It would be really interesting to know how geth performs. It would be great if you could share the results.