Closed: edmundnoble closed this 10 months ago
@edmundnoble Thank you very much for opening this PR! I've been waiting for the corresponding CW-N endpoints to go into a release, and now that they have, I'll look into updating, verifying, and merging this PR.
@edmundnoble, I've used a CW-D instance from this branch to backfill the database overnight. Today I've noticed that it currently leaves a lot of gaps of blocks ~10-20 heights tall. This seems to be caused by the fact that we're trying to fetch chunks of 360 blocks by asking for `maxheight - minheight = 360` ranges, but CW-N also seems to limit the `block` endpoint response at exactly 360 items max. The end result is that, in the presence of orphans, CW-N sends multiple blocks for some heights and fills up the 360-item limit short of fulfilling the range we asked for. Is this the intended behavior for this endpoint?
I'm planning to fix the issue on the CW-D side by shrinking our chunks; chunks of 360 are unnecessarily large anyway, and I think they also inflate memory consumption unnecessarily.
Unfortunately, there is no paginated endpoint that includes orphans, which would be the natural way to deal with this issue. You can sort of get around this by asking for `maxheight - minheight = 120`, because there are almost never more than 2 orphans at a given height.
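If I'm reading the limits right, the back-of-the-envelope arithmetic is: with at most 1 canonical block plus 2 orphans per height, a 120-height range produces at most roughly 360 items, right at the response cap, whereas a 360-height range overflows the cap as soon as even a few heights carry an orphan.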
I also opted for a chunk size of 120, which solved the problem of gaps as expected. I'll leave this fill running for a little while longer and look for any issues; otherwise, this PR seems very close to completion.
I think this PR is ready to be merged. I've been running a `chainweb-data` process from this PR in `server` mode as well as in `fill` mode for the last 24 hours or so against an (initially) empty Postgres `SCHEMA`, inside a database where a `chainweb-data` instance from the last release populates another `SCHEMA`, so that I could compare the populated data.
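For reference, in the comparison queries below the unqualified table names (`blocks`, `transactions`, `events`) resolve to the schema populated by this PR's instance, while the release instance's data is referenced explicitly through the `cwd` schema. A minimal sketch of that session setup, with `cwd_new` as a hypothetical name for the new schema:

```sql
-- Hypothetical session setup: unqualified names resolve to the new instance's schema,
-- while the release instance's data stays addressable as cwd.<table>.
SET search_path TO cwd_new, public;
```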
During those 24 hours, `chainweb-data` managed to backfill roughly 50% of current mainnet. The machine I ran this on has other services running on it, so it's hard to measure `fill` performance differences, but the populated data doesn't seem to have any issues based on the following checks:
I've run the following query to check if there are any gaps in the populated data:
```sql
SELECT chainid, height, next_height
FROM (
  SELECT
    chainid,
    height,
    LEAD(height) OVER (PARTITION BY chainid ORDER BY height DESC) AS next_height
  FROM blocks
) AS subquery
WHERE (height - next_height) > 1
ORDER BY chainid ASC, height DESC
LIMIT 20;
```

```
 chainid | height | next_height
---------+--------+-------------
(0 rows)
```
(Essentially the same query performed by `chainweb-data` for discovering the gaps in the data.)
```sql
SELECT count(*)
     , count(*) FILTER (WHERE (SELECT to_jsonb(b) = to_jsonb(b2) FROM cwd.blocks b2 WHERE b2.hash = b.hash))
     , count(*) FILTER (WHERE NOT (SELECT to_jsonb(b) = to_jsonb(b2) FROM cwd.blocks b2 WHERE b2.hash = b.hash))
FROM blocks b
;
```

```
  count   |  count   | count
----------+----------+-------
 34508643 | 34498688 |     0
(1 row)
```
It seems we've ended up with some additional blocks (I think the new endpoint provides all the orphans it has seen), but all the blocks we've copied that are also present in the release CW-D data have all of their fields identical.
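As a follow-up sanity check (just a sketch, I haven't run this against the backfilled data), a grouped count along these lines would show whether the surplus rows really are extra blocks at already-covered heights rather than anything unexpected:

```sql
-- Heights that carry more than one block, i.e. candidate orphans.
SELECT chainid, height, count(*) AS blocks_at_height
FROM blocks
GROUP BY chainid, height
HAVING count(*) > 1
ORDER BY chainid, height
LIMIT 20;
```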
```sql
SELECT count(*)
     , count(*) FILTER (WHERE (SELECT to_jsonb(t) = to_jsonb(t2) FROM cwd.transactions t2 WHERE t2.block = t.block AND t2.requestkey = t.requestkey))
     , count(*) FILTER (WHERE NOT (SELECT to_jsonb(t) = to_jsonb(t2) FROM cwd.transactions t2 WHERE t2.block = t.block AND t2.requestkey = t.requestkey))
FROM transactions t
WHERE t.height > 4100000
;
```

```
  count   |  count   | count
----------+----------+-------
 22884000 | 22825320 |     0
(1 row)
```
A similar query for a subset of the transactions; no discrepancy found.
The query was a little trickier for the events table, because that table contains the `paramtext` column, which is rendered from JSON to a string on the Haskell side, and the field order changed between versions. That's why I had to remove that column from the JSON representation used for the comparison, in order to avoid false negatives:
```sql
SELECT count(*)
     , count(*) FILTER (WHERE (SELECT to_jsonb(e) - 'paramtext' = to_jsonb(e2) - 'paramtext' FROM cwd.events e2 WHERE e2.block = e.block AND e2.requestkey = e.requestkey AND e2.idx = e.idx))
     , count(*) FILTER (WHERE NOT (SELECT to_jsonb(e) - 'paramtext' = to_jsonb(e2) - 'paramtext' FROM cwd.events e2 WHERE e2.block = e.block AND e2.requestkey = e.requestkey AND e2.idx = e.idx))
FROM events e
WHERE e.height > 4300000
;
```

```
  count  |  count  | count
---------+---------+-------
 7695345 | 7674529 |     0
(1 row)
```
This PR uses `chainweb-node`'s new block headers with payloads endpoint during the history fill operations. Doing so simplifies the code and presumably reduces the CPU overhead on both the CW-N and CW-D sides of the process. The next step is to migrate the listen process from the header stream to the header-with-payloads stream, but that is not included in this PR.