dfuse-io / dfuse-eosio

dfuse for EOSIO
https://dfuse.io
Apache License 2.0

4-6s search transaction query time with http API #51

Open theblockstalk opened 4 years ago

theblockstalk commented 4 years ago

We are running dfuse in a docker container: https://github.com/Conscious-Cities/eosio-react-app/blob/a738d786d716f4ce9994794aa715b06ea79758fd/blockchain/Dockerfile-dfuse

When we request a transaction search using the https://docs.dfuse.io/reference/eosio/rest/search-transactions API, the response typically takes 3-6s, which is higher than expected. This is on a fresh chain that has been running for only 5-10 minutes.

In the container, nodeos produces blocks and runs the state history plugin and the state history API plugin, and dfuseeos runs all services. I'm running this in a VirtualBox Ubuntu 18.04 machine with 9500 MB of RAM and 6 processors. We are calling the dfuse API from a Node.js Docker container using node-fetch.

What can we do to speed up the request?

theblockstalk commented 4 years ago

Perhaps it would be useful to know which dfuse services depend on each other, and then I can disable some of them.

maoueh commented 4 years ago

I don't think there is much to disable; the two components doing the bulk of the work are the search engine and the trxdb (library code).

The request enters through the api-proxy app, which simply acts as a reverse HTTP proxy to the eosws app. eosws handles the HTTP request lightly and forwards the call to the search-router app via a gRPC streaming call. The search engine only streams back the matching transactions; the eosws app then queries the database to retrieve the results.

How many results do you get in your response? The GraphQL version of Search Transactions is optimized to perform batch retrieval of transactions from the database, so that could be something to try. I would start with this and see how it goes from there.
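As a rough sketch of the GraphQL suggestion above, the snippet below builds the JSON body for a forward transaction search. The `searchTransactionsForward` field and the `results { trace { id } }` selection are written from memory of the dfuse EOSIO GraphQL schema and should be checked against the schema of the running instance; the helper name `build_graphql_search` and the limit of 10 are illustrative assumptions.

```python
import json


def build_graphql_search(query: str, limit: int) -> str:
    """Return a JSON request body for a GraphQL forward search.

    Assumption: the field and argument names below match the dfuse
    EOSIO GraphQL schema; verify them against your instance.
    """
    # json.dumps yields a double-quoted, escaped string literal that is
    # also a valid GraphQL string literal for plain queries like these.
    document = (
        "query { searchTransactionsForward("
        f"query: {json.dumps(query)}, limit: {int(limit)}"
        ") { results { trace { id } } } }"
    )
    return json.dumps({"query": document})


# Same search terms as the REST call discussed in this thread.
body = build_graphql_search("auth:thenewfork OR receiver:thenewfork", 10)
```

The body would then be sent as a POST to the GraphQL endpoint of the dfuse instance (the exact path and port depend on the deployment).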

Something else: the search engine has what we call a negative caching optimization. This feature requires a memcache instance to work. When configured properly, the search engine records the shards for which there were no results and saves that information in the memcache instance.

The second run of the exact same query will then only inspect shards that have results in them, avoiding loading from disk shards that are sure to contain no results. This is really useful because the irreversible shards cannot change. Let's say all your 6 shards contain no results: the search engine will then not load them from disk, avoiding a lot of work. That is also something that could be beneficial in your setup. Since you are using Docker, you could probably spin up a memcache instance and configure the search engine to use it.

This would not help for the very first call, but it improves the dev experience when running all of that in some "dev loop".
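The idea behind negative caching can be illustrated with a toy model. This is not dfuse's implementation: the `NegativeCache` class is an in-memory stand-in for memcache, and the shard tuples, `load_shard` callback, and `search` function are hypothetical names used only to show why a repeated query touches fewer shards.

```python
from typing import Callable, List, Set, Tuple

Shard = Tuple[int, int]  # (first_block, last_block), hypothetical shard identity


class NegativeCache:
    """In-memory stand-in for memcache: remembers (query, shard) pairs with no results."""

    def __init__(self) -> None:
        self._empty: Set[Tuple[str, Shard]] = set()

    def is_known_empty(self, query: str, shard: Shard) -> bool:
        return (query, shard) in self._empty

    def mark_empty(self, query: str, shard: Shard) -> None:
        self._empty.add((query, shard))


def search(
    query: str,
    shards: List[Shard],
    load_shard: Callable[[Shard, str], List[str]],
    cache: NegativeCache,
) -> Tuple[List[str], int]:
    """Return (matches, number of shards actually loaded)."""
    results: List[str] = []
    loads = 0
    for shard in shards:
        if cache.is_known_empty(query, shard):
            continue  # irreversible shards never change, so skipping is safe
        loads += 1
        matches = load_shard(shard, query)
        if matches:
            results.extend(matches)
        else:
            cache.mark_empty(query, shard)
    return results, loads


# Six shards of 200 blocks; only the last one contains a match.
shards = [(start, start + 199) for start in range(1, 1201, 200)]


def fake_load(shard: Shard, query: str) -> List[str]:
    return ["tx-abc"] if shard == (1001, 1200) else []


cache = NegativeCache()
first_results, first_loads = search("receiver:thenewfork", shards, fake_load, cache)
second_results, second_loads = search("receiver:thenewfork", shards, fake_load, cache)
```

In this model the first run loads all six shards and the second run loads only the one shard that had a match, which mirrors the "second run" behaviour described above.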

Another thing is to play with the shard size, for example shards of 500 blocks instead of 200 blocks. This is harder to play with, though, because it has an impact on the "reversible segment" of the search, which needs to grow bigger (the live part syncs with the archive part, so if there are shards of 500 blocks, there must be 500 blocks in the live segment to "fill the gap" with the archive). We know from experience that a live segment that is too big is also not that good when the query runs in the "live" segment, but it's a tradeoff.

So trying shards of 500 blocks and 1000 blocks could also be worth a try.
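To make the tradeoff concrete, here is a back-of-the-envelope calculation. It assumes the standard EOSIO block interval of 0.5 seconds and a deliberately simplified model in which the live segment must hold at least one shard's worth of blocks; `shard_layout` and its keys are illustrative names, not dfuse settings.

```python
BLOCK_INTERVAL_SECONDS = 0.5  # standard EOSIO block time


def shard_layout(head_block: int, shard_size: int) -> dict:
    """Simplified model: full shards go to the archive, and the live
    segment must cover at least one shard's worth of blocks."""
    full_shards = head_block // shard_size
    return {
        "archive_shards": full_shards,
        "archived_blocks": full_shards * shard_size,
        "min_live_blocks": shard_size,
        "min_live_seconds": shard_size * BLOCK_INTERVAL_SECONDS,
    }


# A chain that has been running for 10 minutes has about 1200 blocks.
head = int(10 * 60 / BLOCK_INTERVAL_SECONDS)
layouts = {size: shard_layout(head, size) for size in (200, 500, 1000)}
```

Under these assumptions, 200-block shards give six archive shards after 10 minutes with a live segment of at least 100 seconds of blocks, while 1000-block shards give a single archive shard but a live segment of at least 500 seconds.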

theblockstalk commented 4 years ago

for this query there are 4 tx results

theblockstalk commented 4 years ago

sometimes this is taking 10+ seconds to respond. This is happening more when the blockchain has been left for 10+ minutes.

In the few minutes after the dfuse container starts a node for the first time, the response times are very short.

sduchesneau commented 4 years ago

do you see an improvement if you do the same query again ?

theblockstalk commented 4 years ago

I have turned off the state history plugin and the state history API plugin. These are the measured response times for a simple search query.

Request (the request and response are constant across all the response times below): http://localhost:8081/v0/search/transactions?q=(auth%3Athenewfork%20OR%20receiver%3Athenewfork)&block_count=2147483647
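For anyone wanting to reproduce these measurements, a minimal timing harness might look like the following. It is written in Python rather than the Node.js/node-fetch setup used in this thread; `build_search_url`, `fetch`, and `time_requests` are hypothetical helper names, and the fetcher is injectable so the timing loop can be exercised without a running dfuse instance.

```python
import time
import urllib.request
from typing import Callable, List
from urllib.parse import quote


def build_search_url(base: str, query: str, block_count: int) -> str:
    """Build the REST search URL, percent-encoding the query but keeping parentheses."""
    encoded = quote(query, safe="()")
    return f"{base}/v0/search/transactions?q={encoded}&block_count={block_count}"


def fetch(url: str) -> bytes:
    """Perform a blocking GET and return the raw response body."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def time_requests(url: str, runs: int, fetcher: Callable[[str], bytes] = fetch) -> List[float]:
    """Call the URL `runs` times back to back and return each duration in seconds."""
    timings: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        fetcher(url)
        timings.append(time.perf_counter() - start)
    return timings


url = build_search_url(
    "http://localhost:8081",
    "(auth:thenewfork OR receiver:thenewfork)",
    2147483647,
)
```

Calling `time_requests(url, 3)` against a running instance would give three consecutive response times, comparable to the "several calls right after each other" rows below.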

Response

Response times when querying the production dfuse node straight after boot:

0 mins after boot (new chain): 2.41s

1 min: 0.225s, 0.05s

2 mins: 0.02s, 0.02s

3 mins: 2.07s, 0.238s

4 mins: 0.023s, 0.290s

5 mins: 0.166s, 0.274s

5 mins, several calls made right after each other: 0.020s, 0.17s, 0.018s

Response times when querying the production dfuse node more than 10 mins after boot. This is a fresh chain (compared to the results above):

12 mins after boot: 7.92s (this was the first search query to the chain)

13 mins: 8.34s, 0.432s, 1.26s, 2.55s

14 mins: 2.74s

15 mins: 2.07s, 0.238s

15 mins, several calls made right after each other: 2.4s, 8.91s, 3.40s
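To compare the two runs at a glance, the measurements listed above can be summarized with a few lines of Python. The numbers are copied from this comment (values without a unit are assumed to be seconds); the `summarize` helper and the 1-second "slow" threshold are arbitrary choices made for illustration.

```python
from statistics import median
from typing import Dict, List

# Response times in seconds, copied from the two runs above.
early_run: List[float] = [
    2.41, 0.225, 0.05, 0.02, 0.02, 2.07, 0.238,
    0.023, 0.290, 0.166, 0.274, 0.020, 0.17, 0.018,
]
late_run: List[float] = [
    7.92, 8.34, 0.432, 1.26, 2.55, 2.74, 2.07, 0.238, 2.4, 8.91, 3.40,
]


def summarize(samples: List[float], slow_threshold: float = 1.0) -> Dict[str, float]:
    """Return count, median, max, and how many samples reached the threshold."""
    return {
        "count": len(samples),
        "median": median(samples),
        "max": max(samples),
        "slow": sum(1 for s in samples if s >= slow_threshold),
    }


early_summary = summarize(early_run)
late_summary = summarize(late_run)
```

With these samples, 2 of the 14 early requests took a second or more, against 9 of the 11 requests made after the chain had been running for more than 10 minutes.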