This works fine in nodeos 3.1, so this is a regression.
Setting
http-max-response-time-ms = 2500
works in 3.2.
Seems to be a deliberate but poorly documented change in https://github.com/AntelopeIO/leap/pull/96
Probably. Although PR #96 doesn't list get_block as one of the affected APIs?
I don't know why we would want to have limits on get_block. If the block is stored and requested we want to send it.
My (limited) understanding is that if the node is used for block production, we may not want it failing to do so because its main thread is busy serving long http requests. If the node is used mainly for serving data requests, yes probably these limits should be relaxed.
block production nodes should never be serving general http requests anyway.
So just set a very high limit on the api node. Or do you want to avoid long running get_table calls?
In an ENF backlog refinement session, we discussed this issue at length, but didn’t come to a clear consensus on the best path forward. We did all agree that the current behavior is undesirable and that more comprehensive thought is required around api architecture and where serialization should be handled. For now, we will simply communicate a workaround.
The workaround: Ensure that max-http-response-time is configured to be greater than max-abi-serialization-time to prevent timeouts on GetBlock. To keep a lower maximum allowed http response time for endpoints other than GetBlock, run multiple nodeos instances with the required configurations, and use a proxy to direct to them as desired.
A more complete statement with context: API endpoints often require ABI serialization. For large blocks requested with the GetBlock endpoint, this serialization can take quite a long time. To minimize the chance that a request requiring a long-running serialization fails because it exceeds the maximum allowed http response time, our suggested workaround is to ensure that max-http-response-time is configured to be greater than max-abi-serialization-time.
If a node operator would like to keep a lower maximum allowed http response time for endpoints that don't have this problem, one option is to use a proxy to direct GetBlock requests to a dedicated nodeos instance with a higher max-http-response-time, and all other requests to a nodeos instance with your preferred max-http-response-time.
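To make the relationship concrete (the corresponding nodeos options are http-max-response-time-ms and abi-serializer-max-time-ms; the values below are illustrative assumptions, not recommendations), the dedicated GetBlock instance would keep the http response limit above the serialization limit, for example:
abi-serializer-max-time-ms = 15000
http-max-response-time-ms = 30000
while the general-purpose instance behind the proxy keeps a much lower http-max-response-time-ms.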
Time for the abi serializer to_variant vs the conversion to json for EOS mainnet block 291953778 on my i9-12900K:
to_variant calls: 1, total: 884855us, avg: 884855us
to json calls: 1, total: 48551us, avg: 48551us
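The stopwatch.hpp used to produce the "calls/total/avg" lines in this thread is not shown here, so the following is only a minimal sketch of an equivalent accumulate-and-report timer (an assumption, not the actual implementation). Wrapping the abi_serializer to_variant step and the to-json step in two separate instances would yield two lines like those above.

```cpp
// Minimal accumulate-and-report timer producing "calls/total/avg" style output.
#include <chrono>
#include <cstdio>

struct stopwatch {
   const char* name;
   unsigned    calls    = 0;
   long long   total_us = 0;

   // Time one value-returning call and fold it into the running totals.
   template <typename F>
   auto time(F&& f) {
      auto start  = std::chrono::steady_clock::now();
      auto result = f();
      total_us += std::chrono::duration_cast<std::chrono::microseconds>(
                     std::chrono::steady_clock::now() - start).count();
      ++calls;
      return result;
   }

   // Report on destruction, in the same shape as the log lines quoted above.
   ~stopwatch() {
      std::printf("%s calls: %u, total: %lldus, avg: %lldus\n",
                  name, calls, total_us, calls ? total_us / calls : 0);
   }
};
```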
Well, I guess it shows that the issue, if there is one, is in to_variant!
Yeah, that is as expected. Although that to json is rather terrible as well.
For reference, the same block without exploding it via abi_serializer (what a /v1/chain/get_raw_block would be):
info 2023-02-07T00:40:21.876 nodeos stopwatch.hpp:41 report ] get_block calls: 1, total: 1460us, avg: 1460us
info 2023-02-07T00:40:21.878 http-0 stopwatch.hpp:41 report ] convert to variant calls: 1, total: 1692us, avg: 1692us
info 2023-02-07T00:40:21.883 http-1 stopwatch.hpp:41 report ] to json calls: 1, total: 4546us, avg: 4546us
Move most of the abi_serialization off the main thread:
info 2023-02-07T01:12:34.801 nodeos stopwatch.hpp:41 report ] get_block calls: 1, total: 1817us, avg: 1817us
info 2023-02-07T01:12:34.802 nodeos stopwatch.hpp:41 report ] gather abis calls: 1, total: 742us, avg: 742us
info 2023-02-07T01:12:34.987 http-0 stopwatch.hpp:41 report ] abi_serializer calls: 1, total: 184769us, avg: 184769us
info 2023-02-07T01:12:35.036 http-1 stopwatch.hpp:41 report ] to json calls: 1, total: 46043us, avg: 46043us
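As a rough sketch of the "off the main thread" shape described above (placeholder types only; the actual leap types and the PR code are not reproduced here, so this is an assumption about the general pattern, not the real implementation):

```cpp
// Gather the cheap, chainbase-dependent inputs on the main thread, then post the
// expensive abi serialization + json conversion to the http thread pool.
#include <boost/asio/post.hpp>
#include <boost/asio/thread_pool.hpp>
#include <map>
#include <memory>
#include <string>

struct signed_block {};                            // placeholder for the real block type
struct abi_def {};                                 // placeholder for an account's ABI
using abi_map = std::map<std::string, abi_def>;    // account name -> ABI

void handle_get_block(boost::asio::thread_pool& http_pool,
                      std::shared_ptr<const signed_block> block) {
   // Main thread: only the fast steps (the get_block and "gather abis" lines above).
   auto abis = std::make_shared<abi_map>();        // filled with the ABIs referenced by *block

   // Http thread: the expensive abi_serializer work and the to-json conversion run here,
   // so the main thread stays free to keep applying and producing blocks.
   boost::asio::post(http_pool, [block, abis]() {
      // build the exploded variant using the gathered ABIs, convert to json, send the response
   });
}
```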
Wow, did that speed up the abi serialization a lot? Above you reported that to_variant took 884855us.
Yes. This uses an abi cache so that an abi is only pulled out of chainbase and an abi_serializer created once per account. In the existing implementation that is done over and over again each time the account is referenced in the block.
OK, that makes more sense, looks like a great speedup, over 4x faster. So the speedup is because of the cache, not because it is done on a different thread.
I'll have to check out the cache you implemented. Is it an LRU cache? And does it need to support concurrent access from multiple threads?
I believe so, yes, the speedup is from the cache rather than from it being done on a different thread.
Nothing that complicated is used. The cache only lives for the life of the get_block call. Since the gather-abis step which creates the cache takes less than 1ms in this example, I think that is fine as a final solution.
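To illustrate the kind of per-request cache being described (names and types here are assumptions for illustration only; the real code lives in PR #696):

```cpp
// Per-request cache: each account's ABI is pulled out of chainbase and its
// abi_serializer constructed at most once per get_block call. Because the cache
// only lives for that one call, no LRU eviction or locking is needed.
#include <map>
#include <memory>
#include <string>

struct abi_serializer {};                    // stand-in for the real serializer type

struct get_block_abi_cache {
   std::map<std::string, std::shared_ptr<abi_serializer>> by_account;

   // MakeSerializer: account name -> shared_ptr<abi_serializer>
   template <typename MakeSerializer>
   std::shared_ptr<abi_serializer> get_or_create(const std::string& account, MakeSerializer&& make) {
      auto it = by_account.find(account);
      if (it == by_account.end())
         it = by_account.emplace(account, make(account)).first;   // first reference: build once
      return it->second;                                          // later references in the same block: reuse
   }
};
```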
Should we consider this closed by #696 ?
Yes, I think so, but maybe @matthewdarwin thinks otherwise? With your cache the request is significantly faster, so that should lessen the problem.
Also, I was wondering if you checked whether avoiding the extra string copies did speed up the abi_serializer part at all?
Yes, on my machine it cut it in half for the test block.
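The exact change isn't shown in this thread, but the general idea of avoiding extra string copies when populating a serializer's lookup tables looks something like the following (purely illustrative, not the leap code):

```cpp
// Moving the strings out of the parsed ABI instead of copying them when building
// the serializer's type map avoids one allocation and copy per name.
#include <map>
#include <string>
#include <utility>
#include <vector>

struct type_def { std::string new_type_name; std::string type; };

std::map<std::string, std::string> build_typedef_map(std::vector<type_def> types /* by value: we own it */) {
   std::map<std::string, std::string> m;
   for (auto& t : types)                                           // non-const so we can move from it
      m.emplace(std::move(t.new_type_name), std::move(t.type));    // move, don't copy
   return m;
}
```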
Wow, that's great. So for that block, the total request processing time is about 8x faster than before your changes?
The abi_serializer creation was only a small part of that.
info 2023-02-07T01:12:34.802 nodeos stopwatch.hpp:41 report ] gather abis calls: 1, total: 742us, avg: 742us
Now that is ~350us on my machine.
To get a good idea of the progress made in the PR, I think it would be very helpful if you provided the updated times for the following two measurements:
to_variant calls: 1, total: 884855us, avg: 884855us
to json calls: 1, total: 48551us, avg: 48551us
Since the abi_serializer creation was only a small part of that, this must mean that the cache saves us a lot of duplicate abi_serializer creations. It also suggests that having a persistent cache, as we discussed, probably would not provide significant benefits, since creating the cache on demand costs little even for this big block (~350us).
Current PR (https://github.com/AntelopeIO/leap/pull/696/commits/34baed929b36c13c36db3cb1c55b06d87f785f2d) perf:
info 2023-02-08T16:33:11.344 nodeos stopwatch.hpp:50 report ] get_block calls: 1, total: 1351us, avg: 1351us
info 2023-02-08T16:33:11.345 nodeos stopwatch.hpp:50 report ] create cache calls: 1, total: 340us, avg: 340us
info 2023-02-08T16:33:11.533 http-1 stopwatch.hpp:50 report ] to variant calls: 1, total: 188137us, avg: 188137us
info 2023-02-08T16:33:11.577 http-0 stopwatch.hpp:50 report ] to json calls: 1, total: 40403us, avg: 40403us
The to_variant number before would include both create cache and to variant. Nothing in the PR should make any difference in to json, so any time difference there is not real.
The exciting thing about this change is how much less time is spent on the main thread.
Great, so we are 5x faster on to_variant, 4x faster overall (including to_json). And almost all off the main thread. That is a great PR!
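For reference, working purely from the numbers quoted in this thread: to_variant went from 884855us to 188137us (roughly 4.7x), and the serialization path as a whole from about 884855 + 48551 = 933406us to about 340 + 188137 + 40403 = 228880us (roughly 4.1x).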
(Seen in the eosq block explorer.)
The message mentions json.hpp:63, which is in a string to_string() function. Maybe we have an issue where creating json output takes excessively long for large trees?