Closed Xunnamius closed 8 months ago
Update: said fix solved the issue and the database now rebuilds successfully. Maybe I try my hand at speeding up the rebuild process next :)
Hi, thanks for your investigation.
However I can't accept this fix for now. This shouldn't happen and this fix only works around some other underlying issue. What applyTagSiblings
does is it copies post lookup tables from better/preferred siblings to other siblings so they can be used for searching. The other siblings shouldn't have any lookup tables by this point because normalizeTagRelations
is supposed to move all mappings to better siblings, leaving worse ones "empty". Maybe there is a possibility that a tag could have two better siblings, which I thought isn't allowed by hydrus but all kinds of shit happens there so it's better to improve hybooru tolerance to hydrus fuckery.
I will have to investigate this issue. It is possible that we might need to drop double-better siblings situations, or arguably, better, join lookup tables together on conflict. There is also possibility that the loop resolution in during tag normalization has some bug in it.
Could you try uploading your database(without images) for sake of testing? It could greatly help me debug this issue.
Ah, gotcha. I'm not at liberty to upload the hydrus db (it's also ~100gb). But if Hybooru has a verbose debug mode or crash dump / log-generating option, I'm more than willing to rebuild the database on another drive and send you the outputs.
Does that 100GB includes images or not?
As of hybooru, there is really nothing to log here, it just runs an SQL query and something goes wrong in the database. You can try to make a minimal reproduction for this bug or upload just the part of your database that is causing that issue, ie without caches, images and stuff. Otherwise, there is little I can do without any database to debug on.
If you need a place to upload to I can give you an account on my server so you can just scp/rsync/ftp into it.
That 100GB (~89GB) database does not include images or thumbnails, it's just that massive. It ballooned to that size after downloading the PTR. I'll see if I can reproduce it just rebuilding off the PTR and, if so, I'll upload the result.
Update: I've been assuming you want the client.mappings.db
(~60GB), client.master.db
(~20GB), client.caches.db
(~6GB), and client.db
. If you only want client.db
, that's only 2GB. Or if I'm misunderstanding entirely, please let me know :)
I would need at least mappings and client.db, but all would be preferable. Otherwise, I will have to fake the rest of the database or something.
You can try with PTR but AFAIK there are other people using hybooru with large databases synced with PTR without this particular issue.
Should be fixed as of v1.10.0 Let me know if you are still experiencing this issue.
Hi there. First: thank you for your work on this project.
Unfortunately, I've been encountering an error when rebuilding the database this past week. Log output is below. Might be a consequence of excessive tag sibling/parent loops.
I addressed it by adding
ON CONFLICT DO NOTHING
toapplyTagSiblings
, which matches all of the otherINSERT INTO ... SELECT ... FROM ...
statements in tag-related functions in that file.In a couple days, if this turns out to be a sufficient fix, I can make a PR if you'd like.
Log output
``` Posts (153553/153553) [####################] Done in 3.24s Posts (3299/3299) [####################] Done in 45ms Posts (2470/2470) [####################] Done in 61ms Posts (628/628) [####################] Done in 25ms Urls (601716/601716) [####################] Done in 2.02s Notes [####################] Done in 0ms Tags (39961003/39961003) [####################] Done in 2:04 Mappings (2031885282/2031885282)[####################] Done in 53:36 Mappings (1205433/1205433) [####################] Done in 19.84s Mappings (45176/45176) [####################] Done in 253ms Tag parents (29235/29235) [####################] Done in 156ms Tag siblings (302637/302637) [####################] Done in 1.59s Importing options... [####################] Done in 12ms Resolving file relations... [####################] Done in 103ms Normalizing tags (6/6) [####################] Done in 46:12:14 Warning! Detected 71 loops in tag siblings! Warning! Detected 46 loops in tag parents! Resolving tag parents... [####################] Done in 36:40:41 Indexing (24/24) [####################] Done in 3:25:06 Resolving tag siblings... [--------------------] Unable to initialize DB! error: duplicate key value violates unique constraint "tag_postids_tagid_idx" at Parser.parseErrorMessage (/repos/hydribooru/node_modules/pg-protocol/src/parser.ts:369:69) at Parser.handlePacket (/repos/hydribooru/node_modules/pg-protocol/src/parser.ts:188:21) at Parser.parse (/repos/hydribooru/node_modules/pg-protocol/src/parser.ts:103:30) at Socket.