Closed drzraf closed 1 year ago
Hello,
I think the index (tag_name, and maybe tag_lower_name) is broken. Can you try to:
You can check the index collation using:
WITH defcoll AS (
SELECT datcollate AS coll
FROM pg_database
WHERE datname = current_database()
)
SELECT icol.pos,
CASE WHEN c.collname = 'default'
THEN defcoll.coll
ELSE c.collname
END AS collation
FROM pg_index AS i
CROSS JOIN unnest(i.indcollation) WITH ORDINALITY AS icol(coll, pos)
CROSS JOIN defcoll
LEFT JOIN pg_collation AS c ON c.oid = icol.coll
WHERE i.indexrelid = 'tag_name'::regclass
ORDER BY icol.pos;
Regarding index collation:
pos | collation
-----+-------------
1 | en_US.UTF-8
Regarding REINDEX:
REINDEX (VERBOSE) INDEX tag_lower_name;
INFO: index "tag_lower_name" was reindexed
DETAIL: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.04 s
REINDEX
REINDEX (VERBOSE) INDEX tag_name;
ERROR: could not create unique index "tag_name"
DETAIL: Key (name)=(sacrifici animali) is duplicated.
SELECT * from tag where name like '%sacrifici animal%';
id | name
------+-------------------
1334 | sacrifici animali
2036 | sacrifici animali
(2 rows)
SELECT * from tag where name = 'sacrifici animali';
id | name
------+-------------------
1334 | sacrifici animali
(1 row)
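One plausible reading of the two results above: a LIKE '%...%' pattern cannot use a B-tree index, so it scans the table and sees both rows, while the equality lookup goes through the suspect tag_name index and finds only one. A way to check this is to rerun the equality query with index access discouraged and compare the counts. The sketch below assumes this reading is right; the function name tag_count_check_sql is hypothetical, while enable_indexscan, enable_indexonlyscan and enable_bitmapscan are standard PostgreSQL planner settings.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints a psql script that counts the rows matching a
# tag name twice, once with the planner's default choice and once with index
# access discouraged, so the two counts can be compared.
tag_count_check_sql() {
  local name=$1
  local q="'"
  # Double any single quote so the name is a valid SQL string literal.
  local quoted=${name//$q/$q$q}
  cat <<SQL
SELECT count(*) AS default_plan FROM tag WHERE name = '${quoted}';
BEGIN;
-- These settings only penalize index plans; with a plain table scan
-- available, the planner is expected to fall back to it.
SET LOCAL enable_indexscan = off;
SET LOCAL enable_indexonlyscan = off;
SET LOCAL enable_bitmapscan = off;
SELECT count(*) AS seq_scan FROM tag WHERE name = '${quoted}';
ROLLBACK;
SQL
}
```

The printed script can be pasted into the psql session. If the two counts differ, the index is returning incomplete results, which would be consistent with the REINDEX failure.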
I have tons of duplicates... wtf?
SELECT name, count(1) as c from tag group by name HAVING count(1) > 1 ORDER BY c DESC;
name | c
--------------------------+---
animali | 8
Animali | 6
galline | 5
Vegan | 5
carne | 5
cina | 5
pollo | 4
crudeltà | 4
polli | 4
animal | 3
...
I think your index was broken. I suggest removing the duplicates and recreating the index.
Yes, you can try
DELETE FROM "tag" v1 USING (SELECT MIN(id) as id, "name" FROM "tag" GROUP BY "name" HAVING COUNT(*) > 1) v2 WHERE v1."name" = v2."name" AND v1.id <> v2.id
Thank you.
I ended up using a combo of
SELECT CONCAT(id, ',', name) FROM tag WHERE name IN (SELECT name FROM tag GROUP by name HAVING count(name) > 1) ORDER BY name, id ASC
piped to
pname= pid=; while IFS=, read id name; do [[ $pname = $name ]] && echo "UPDATE \"videoTag\" SET \"tagId\" = $pid WHERE \"tagId\" = $id;" && continue; pname=$name; pid=$id; done
to generate UPDATE SQL statements and do the videoTag deduplication.
I then rebuilt the INDEX and the next import attempts succeeded.
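The one-liner above can be unrolled into a more readable form. This is a minimal sketch, not the exact command that was run: the function name gen_videotag_updates is hypothetical, and it assumes input lines of the form id,name sorted by name and then id (as the SELECT CONCAT query produces), so the first id seen for each name is the one kept.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "id,name" lines (sorted by name, then id) on
# stdin and prints one UPDATE per duplicate tag, repointing "videoTag" rows
# to the first id seen for that name.
gen_videotag_updates() {
  local prev_name="" keep_id="" id name
  # With IFS=, the first field goes to "id" and "name" receives the rest of
  # the line, so a comma inside a tag name does not split it further.
  # -r keeps backslashes literal.
  while IFS=, read -r id name; do
    if [[ -n "$keep_id" && "$name" == "$prev_name" ]]; then
      # Same name as the previous line: this id is a duplicate.
      printf 'UPDATE "videoTag" SET "tagId" = %s WHERE "tagId" = %s;\n' "$keep_id" "$id"
    else
      # New name: remember its id as the one to keep.
      prev_name=$name
      keep_id=$id
    fi
  done
}
```

It would be fed the query output in unaligned, tuples-only form (for instance from psql -At), and the generated statements reviewed before being run. Quoting the right-hand side of the comparison makes it a literal string match rather than a glob pattern match.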
Describe the current behavior
I'm experiencing a problem similar to #5072 (although the error is slightly distinct) using v5.0.0 when retrying the synchronization of a YouTube channel (multiple videos failed during previous attempts).
During import I get the below SequelizeUniqueConstraintError
In a first attempt to work around the issue I did
But the next import attempt triggered an identical error on another video/tag tuple (tag = asia), which stopped the sync again. This time it shows symptoms of a collation issue (at first glance at least).
Using
docker run -it --rm --network peertube_default postgres psql -U peertube -h postgres peertube_prod
SELECT count(1) FROM tag where name = 'asia' ;
// 0
SELECT count(1) FROM tag where unaccent(name) = 'asia' ;
// 1
SELECT count(1) FROM tag where name = 'asia' COLLATE "C";
// 1
The existing tag is associated with a successfully imported video, but resuming the import leads to the SequelizeUniqueConstraintError
(I may be missing something (I'm not used to pgsql), but even after UPDATEing the value to plain ASCII, I still can't SELECT it using a simple WHERE name = 'asia' clause)
Steps to reproduce
(Please note that the logs give no clue about which video is the cause)
Describe the expected behavior
SequelizeUniqueConstraintError (and continue with the next one)
Fully synchronize the channel (This fetches any missing videos on the local channel): this should recreate jobs only for the failed ones (not create tons of new jobs)
Additional information
PeerTube instance:
Browser name, version and platforms on which you could reproduce the bug: Firefox
Link to browser console log if relevant:
Link to server log if relevant (journalctl or /var/www/peertube/storage/logs/):
SHOW SERVER_ENCODING;
UTF8
\encoding
UTF8
\dc
empty
\dO
empty