Closed AshtonStephens closed 7 months ago
@AshtonStephens @wileyj GM. I can check this issue as well, since I've worked on this event-replay implementation. Thanks!
It's not the end of the world that the event replay takes 16 GB, but it would be really nice if it didn't. I think you could have pandas write the files out incrementally as it goes, as opposed to in one fell swoop.
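The incremental-write suggestion could look something like the sketch below. This is illustrative only, not the API's actual code: the function name, the row-dict input shape, and the CSV output format are all assumptions; the point is that each chunk is flushed to disk and dropped, so peak memory is bounded by the chunk size rather than the full dataset.

```python
# Hypothetical sketch: stream rows to disk in fixed-size chunks instead of
# building one giant DataFrame in memory and writing it all at once.
import pandas as pd


def export_in_chunks(row_iter, out_path, chunk_size=100_000):
    """Append rows from an iterable of dicts to a CSV, flushing every chunk_size rows."""
    first = True  # write the header only for the first chunk
    buffer = []
    for row in row_iter:
        buffer.append(row)
        if len(buffer) >= chunk_size:
            pd.DataFrame(buffer).to_csv(out_path, mode="a", header=first, index=False)
            first = False
            buffer.clear()  # free the chunk before reading more rows
    if buffer:  # flush the final partial chunk
        pd.DataFrame(buffer).to_csv(out_path, mode="a", header=first, index=False)
```

The same pattern works for parquet via `pyarrow.parquet.ParquetWriter`; either way, memory stays proportional to `chunk_size` instead of the whole archive.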
The goal behind this event-replay implementation was to make it fast, so there is a tradeoff between compute-resource usage and speed. Previous versions took days to finish.
Some improvements that were made:
- The insert batch size for the `txs` table was reduced. This reduces the number of parameters being passed to PostgreSQL.

To validate those changes, the file https://archive.hiro.so/testnet/stacks-blockchain-api/testnet-stacks-blockchain-api-latest.gz was used, and the event-replay process finished successfully on an Apple M1 Max with 64 GB of RAM.
The suggestions above will be taken into consideration for future improvements to the event-replay process. Thanks.
@AshtonStephens @wileyj please feel free to reach out if anything else is needed.
Describe the bug
The Stacks API event-replay procedure cannot complete.
To Reproduce
Steps to reproduce the behavior:
Below is a script that does the majority of what I did, minus some initial installation. I did not check whether this script runs as-is, but it is fully representative of what I did, including the environment. The only difference is that I used a separate fork with two changes, which I elaborate on below.
What you'll likely see:
One bug I found in this process is here: https://github.com/hirosystems/stacks-blockchain-api/blob/develop/src/event-replay/parquet-based/importers/new-block-importer.ts#L87, where the API should not be batching 1400 txs. Each tx appears to expand into more than 46 parameters to the SQL database, so the batch exceeds PostgreSQL's maximum parameter count of 65534. Below is the error; changing the batch size on the line I highlighted to 500 fixes it for the time being.
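As a sanity check on that arithmetic, assuming 47 parameters per tx row (one more than the "46" above, which is enough to overflow):

```python
# PostgreSQL caps a single prepared statement at 65534 bind parameters
# (the parameter count is a signed 16-bit field in the wire protocol).
PG_MAX_PARAMS = 65534
params_per_row = 47  # assumed: "more than 46" parameters per inserted tx

# Largest batch that still fits under the cap.
max_safe_batch = PG_MAX_PARAMS // params_per_row
print(max_safe_batch)         # 1394 -- so a batch of 1400 cannot fit
print(1400 * params_per_row)  # 65800, over the 65534 cap
print(500 * params_per_row)   # 23500, comfortably under it
```

This is why dropping the batch size to 500 makes the error go away: any batch of at most 1394 rows would fit, and 500 leaves headroom even if the per-row parameter count grows.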
There are some other issues as well, some with duckdb, but those went away after upgrading to version 0.10.0 (latest) of duckdb. Maybe that's still a problem, but the program progressed after I upgraded, so I suspect it's fine. But once all the other errors went away, we now get this error:
The program has 29 GB of RAM available to it on a 32 GB machine. I could get a 64 GB machine going, but at this point I suspect something else is going wrong. It might make sense to include the event-replay procedure in some local testing steps so that this is easier to run when Nakamoto releases.
Expected behavior
We should be able to run the API and ingest the event archive in the way listed in the docs.
Additional context
This is needed to run part of a potential Nakamoto debugging environment, and is the only current part of the network that is failing to start up. It would be great if we could get this fixed in the very near future.