Open mindrunner opened 3 months ago
Yes, health checks and metrics are on the roadmap and would be great to have. I've coordinated some changes to the Metaplex DAS that'll allow us to use its backfilling module. The changes are in review at Metaplex's end; once they're finalized, I'll add them to LightDAS.
Your changes give LightDAS a CLI-based approach, which is fine. However, since we use Metaplex DAS's Docker containers, I'll dockerize LightDAS as well for uniformity and convenience from a deployment standpoint. Please keep that in mind while working on your proposed changes.
@KartikSoneji can comment more on the commands you want to add
The work I am currently doing is only a PoC, so I haven't cleaned everything up yet. Currently I bake LightDAS into my Metaplex image to deploy the API as well as LightDAS. I am currently testing the setup with LightDAS split up into 4 different services. Will report back / create a PR as soon as I am successful.
It turned out that (as long as we use an internal queue) we rely on having transactions (the queue processor) active for both backfill and subscribe.
I kinda seem to rely on backfill running regularly. Had some issues with the subscribe (see https://github.com/WilfredAlmeida/LightDAS/issues/10). Am I missing something here?
We're working on separating the backfill and ingestion. Earlier we used a single queue for all txs, which would only get processed after all transactions were fetched. This is slow and not scalable, so it's being redone. Separating the backfill and ingestion also gives users more options, which we've come to realize might be good.
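To illustrate why draining the queue only after every transaction has been fetched scales poorly, here is a minimal sketch of the alternative: a bounded queue that is processed while fetching is still in progress. It is not LightDAS code. `run_pipeline`, `process`, the fake `sig-N` signatures, and the queue capacity are all hypothetical stand-ins, and it uses only the Rust standard library.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for indexing one transaction.
fn process(signature: &str) -> usize {
    signature.len()
}

/// Fetches and processes concurrently; returns how many items were processed.
fn run_pipeline(total: usize, queue_capacity: usize) -> usize {
    // Bounded queue: the fetcher blocks when the processor falls behind,
    // so memory use stays capped instead of growing with the backlog.
    let (tx, rx) = sync_channel::<String>(queue_capacity);

    let fetcher = thread::spawn(move || {
        for i in 0..total {
            // Hypothetical stand-in for a signature fetched from an RPC node.
            let sig = format!("sig-{}", i);
            // send() fails only if the receiver is gone; stop fetching then.
            if tx.send(sig).is_err() {
                break;
            }
        }
        // `tx` is dropped here, which closes the channel and ends the loop below.
    });

    let mut processed = 0;
    for sig in rx {
        // Processing starts as soon as the first signature arrives.
        let _ = process(&sig);
        processed += 1;
    }
    fetcher.join().expect("fetcher thread panicked");
    processed
}

fn main() {
    let processed = run_pipeline(1_000, 64);
    println!("processed {} transactions", processed);
}
```

With this shape, the fetching side and the processing side only share the channel, which is also what makes it possible to run them as separate services later.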
Hi @mindrunner, I've merged some breaking changes into the main branch. If you're working on this, please take a pull.
Also, keep an eye on the readme for an explanation of how the new changes work.
Thanks for letting me know! I have that on my plate already! Hope I can find some time next week.
I finally managed to upgrade to the latest code. The previous implementation was unusable at the scale we are using it. I just deployed (as a single instance) the new code and it looks very promising. Backfilling seems to be faster and the load on the database is reduced by a huge factor. I probably need to wait some hours/days now until we have some proper usability result with this.
I wonder if there are ways to tweak/optimize the performance. I am currently running the process on 8 CPU / 32Gi, using 15% CPU and 40% memory while backfilling. The Postgres DB is 8 CPU / 16Gi and maxes out the RAM no matter how much I give it. I saw BubblegumBackfillArgs. That seems like a good place to start playing with the numbers, right?
I am currently working on separating the different app subsystems from an operational perspective:
I added clap for command line parsing and made everything optional. I also added a simple HTTP server, which I need as a health check in my deployments. Is there any interest in getting this merged here?
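For readers wondering what such a health-check endpoint might look like, below is a minimal sketch using only the Rust standard library. It is not the code from the proposed change. The `health_response` and `handle` functions, the `healthy` flag, and port 8080 are all assumptions made for illustration, and a real deployment would more likely use an HTTP crate and report actual subsystem state.

```rust
use std::io::{Read, Write};
use std::net::{TcpListener, TcpStream};

/// Builds a minimal HTTP/1.1 response. `healthy` is a hypothetical flag that
/// the running subsystems would be responsible for updating.
fn health_response(healthy: bool) -> String {
    let (status, body) = if healthy {
        ("200 OK", "ok")
    } else {
        ("503 Service Unavailable", "unhealthy")
    };
    format!(
        "HTTP/1.1 {}\r\nContent-Type: text/plain\r\nContent-Length: {}\r\nConnection: close\r\n\r\n{}",
        status,
        body.len(),
        body
    )
}

/// Answers a single connection with the current health status.
fn handle(mut stream: TcpStream, healthy: bool) -> std::io::Result<()> {
    // Read and ignore the request; every path gets the same answer in this sketch.
    let mut buf = [0u8; 1024];
    let _ = stream.read(&mut buf)?;
    stream.write_all(health_response(healthy).as_bytes())
}

fn main() -> std::io::Result<()> {
    // Port 8080 is an arbitrary choice for illustration.
    let listener = TcpListener::bind("0.0.0.0:8080")?;
    for stream in listener.incoming() {
        // Always reports healthy here; a real check would inspect the subsystems.
        if let Err(e) = handle(stream?, true) {
            eprintln!("health check connection failed: {}", e);
        }
    }
    Ok(())
}
```

An orchestrator's liveness or readiness probe could then poll this port and restart the service when it gets a 503 or no response.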