Another attempt - Githubissues

opsecx commented 2 months ago

Daniel encouraged me to keep posting my issues. I note that I haven't had much in the way of a reply for a month, and tbh it's dragging me down a bit.

I'm still trying to get undexer running properly under a fairly standard setup:

- self-hosted postgres
- a specified database for undexer to use (as previously discussed)

When running the newest branch against housefire-head, it starts well, syncs up to approximately block 27k (there are approximately 237k blocks at the moment), then crashes.

When I query the db through the undexer cli, it seems to have all the relevant information up to approximately block 27k.

When I restart the indexer, however, it does what it has done for me all along: it wipes the db and starts from scratch.

So there are two issues:

1. It crashes somewhere along the indexing.
2. (More seriously) Restarting the indexer wipes the tables and restarts the sync from scratch.

Would be great if these issues could be fixed.

egasimus commented 2 months ago

Hey, just seeing this now. Thank you for staying in touch!

The crashes are known to us and seem to be due to some intermittent race condition within the node, which we have no control over; we paper over those by restarting the indexer when it crashes. I agree this might not be ideal, but in practice it seems to work well enough.
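As an illustration only, the restart-on-crash workaround described above could be sketched like this. It is a minimal sketch, not Undexer's actual supervisor: `restartDelayMs`, `runWithRestarts`, and all of their parameters are hypothetical names chosen for this example.

```javascript
// Hypothetical helpers: none of these names come from the Undexer source.

// Delay before restart number `attempt` (0-based): doubles each time, capped at maxMs.
function restartDelayMs(attempt, baseMs = 1000, maxMs = 60000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Runs `task` until it resolves, restarting it after each crash.
// After `maxRestarts` restarts it gives up and rethrows the last error.
async function runWithRestarts(task, { maxRestarts = 5, baseMs = 1000, wait = sleep } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task(attempt);
    } catch (error) {
      if (attempt >= maxRestarts) throw error;
      console.error(`crashed (${error.message}); restart #${attempt + 1}`);
      await wait(restartDelayMs(attempt, baseMs));
    }
  }
}
```

A wrapper like this only helps if the restarted process resumes from the last indexed block; if a restart also resets the database, every crash costs the whole sync.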

Unfortunately we haven't been able to reproduce your main problem, where it wipes the database. Currently we're working on a setup with a self-hosted full node (turns out some data is erased after a few epochs, so the only way to fetch it retroactively is while a fresh node is syncing... yeah go figure :grimacing: https://github.com/anoma/namada/issues/3810).

That has made it necessary to rework some of the indexing logic and, hopefully relevant to your case, to update the Docker Compose configuration. Maybe that'll be where we catch the table wipe happening; and if we don't, I wouldn't have much else to advise you besides giving the Docker Compose setup a shot once the next version is out (this week or, worst case, early next).

egasimus commented 2 months ago

Hmm, something just crossed my mind: what happens if you keep the empty tables, but comment out this db.sync call[^0]? Does it still delete stuff?

I always had this nagging feeling that the Sequelize sync method actually doesn't always work as expected. Maybe that's what's happening differently for you? Since we've relied on reindexing from scratch a whole lot, instead of doing proper migrations (sorry, node operators!), we haven't really had the opportunity to get to the bottom of that - should've just used slonik or something... :grin:

[^0]: Edited 2024-09-24: Update link to permalink.

opsecx commented 2 months ago

I feel it would be great if you could try to replicate a setup similar to mine and see whether you get the same errors. It's a fairly standard config (postgres 14, single database given in the connection url). It's a little hard for me to debug from here, not being familiar with the inner workings of the program source.

egasimus commented 2 months ago

That's exactly why we provide docker-compose.yml: to have an actually standard setup (not simply a fairly standard one)—which would allow you to treat the application as a black box without needing to peer into its inner workings.

Anything else is entirely in your hands.

We provide Undexer to the community free of charge. One of the implicit contracts of open source software is that users are able to contribute improvements, and lend a hand in solving problems. It is actively harmful to understand this as an invitation to request free labor, and I'm sure by now you have heard many people speaking out about how this endangers the health of the ecosystem.

Still, if you would be so kind as to provide a virtual machine image in an open format, containing the system in which the bug can be replicated, it would make it possible for us to look into your problem in situ, sometime in between making Undexer index vote power correctly (massively complicated by how the Namada node prunes data) and the rest of the improvements requested by the funder of this project.

Alternatively, I've already provided an exact pointer to a single line in the source code—which you can comment out after initial DB sync, to see if this is what deletes the database on crash. (Since we provide no pre-packaged build artifact, I'm allowing myself to assume here that you're running from source?)

And the inner workings of sequelize (which does our DB setup, and where I suspect the root cause of your problem to be) are as much a mystery to us as they are to you—which is why I talk of wanting to replace it. :sweat_smile:

opsecx commented 1 month ago

Would it be helpful if I simply replicated the issue on an isolated vps that I give you access to? Then you can see it hands-on. (It will probably take me a couple of days to set up.)

opsecx commented 1 month ago

As for the rest of the comments, I don't know how much time I have spent trying to fix this alongside you guys, so that feels unfair tbh.

opsecx commented 1 month ago

Ok, so I just figured out what's causing this thing with resetting the db. I've been carrying over the .env file, where the original (from former versions) had a value start_from_scratch set to true. Sorry about that. Now it no longer starts from scratch.
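For readers hitting the same symptom, here is a rough sketch of how a leftover flag in a carried-over .env file can turn every restart into a full wipe. It assumes, without confirmation from the Undexer source, that the flag ends up as the `force` option of a Sequelize-style `sync` call; `parseBooleanFlag` and `syncOptionsFromEnv` are hypothetical names, and the upper-case spelling of the variable is a guess alongside the spelling used in this thread.

```javascript
// Hypothetical sketch: how Undexer actually reads this flag is an assumption.

// Interpret common truthy spellings of an environment value; unset means false.
function parseBooleanFlag(value) {
  if (value === undefined || value === null) return false;
  return ["1", "true", "yes", "on"].includes(String(value).trim().toLowerCase());
}

// Map the flag to Sequelize-style sync options.
// In Sequelize, sync({ force: true }) drops existing tables before recreating them,
// while a plain sync() only creates tables that are missing.
function syncOptionsFromEnv(env) {
  const startFromScratch = parseBooleanFlag(
    env.START_FROM_SCRATCH ?? env.start_from_scratch
  );
  if (startFromScratch) {
    console.warn("start_from_scratch is set: existing tables will be dropped on startup");
  }
  return startFromScratch ? { force: true } : {};
}
```

Logging a warning whenever the destructive path is taken, as in the sketch, would make a stale setting like this much easier to spot.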

opsecx commented 1 month ago

I'll let it run through the blockchain of current housefire, and see how it does.

mradkov commented 1 month ago

Closing due to inactivity. Please file a new issue if problems persist.

hackbg / undexer

Another attempt #9