rommapp / romm

A beautiful, powerful, self-hosted rom manager
https://romm.app
GNU Affero General Public License v3.0

[Bug] IGDB metadata with unicode characters causes errors #943

Closed: Casuallynoted closed this issue 1 month ago

Casuallynoted commented 3 months ago

Unicode characters (Japanese characters, special European characters, combining characters) produce SQL errors when IGDB items with those characters in their metadata are scanned.

Examples:

Scan failed: (mariadb.OperationalError) Incorrect string value: '\xCE\xB2 Wor...' for column romm.roms.summary at row 1 [SQL: INSERT INTO roms (igdb_id, sgdb_id, moby_id, file_name, file_name_no_tags, file_name_no_ext, file_extension, file_path, file_size_bytes, name, slug, summary, igdb_metadata, moby_metadata, path_cover_s, path_cover_l, url_cover, revision, regions, languages, tags, path_screenshots, url_screenshots, multi, files, platform_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?, ?)] [parameters: (11394, None, None, 'Steins;Gate 0 (USA)', 'Steins;Gate 0', 'Steins;Gate 0 (USA)', '', 'psvita/roms', 1630759715, 'Steins;Gate 0', 'steins-gate-0', 'Steins;Gate 0 is a Japanese visual novel. It is the fifth game in the Science Adventure series and the sequel to Steins;Gate. Like Steins;Gate, the game is described as a "hypothetical science ADV". The player assumes the role of Okabe Rintaro in the β World Timeline.', '{"total_rating": "85.84", "aggregated_rating": "87.18", "first_release_date": 1449705600, "genres": ["Adventure", "Visual Novel"], "franchises": ["St ... (3129 characters truncated) ... "//images.igdb.com/igdb/image/upload/t_thumb/co6bl6.jpg"}, "name": "Steins;Gate: Heni Kuukan no Octet", "slug": "steins-gate-heni-kuukan-no-octet"}]}', '{}', 'psvita/Steins%3BGate%200/cover/small.png', 'psvita/Steins%3BGate%200/cover/big.png', 'https://images.igdb.com/igdb/image/upload/t_thumb/co24ua.jpg', '', '["USA"]', '[]', '[]', '["psvita/Steins%3BGate%200/screenshots/0.jpg", "psvita/Steins%3BGate%200/screenshots/1.jpg", "psvita/Steins%3BGate%200/screenshots/2.jpg", "psvita/Steins%3BGate%200/screenshots/3.jpg"]',

For this one, the issue is this bit:

"The player assumes the role of Okabe Rintaro in the β World Timeline."

Specifically, the "β"

Same error with a different game, where this bit causes the problem: "\xEF\xAC\x81rst"

The problematic character is in this passage: "Day of the Tentacle was Tim Schafer’s first game as co-project lead". The "fi" in "first" is actually a single combined character (the "ﬁ" ligature, matching the \xEF\xAC\x81 bytes in the error) rather than a separate "f" and "i".

This description:

Guru Guru Onsen 3 (ぐるぐる温泉3) is a table game for the Sega Dreamcast.

Produces the same error because of the Japanese characters.
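
For anyone reproducing this, the byte sequences quoted in the errors above can be decoded to confirm which characters are involved. This is a minimal sketch using only the Python standard library. The `describe` and `fits` helpers are hypothetical names, not part of RomM. The `latin-1` check is only an illustration of a non-Unicode character set rejecting these characters; the thread never states which character set the database actually used.

```python
import unicodedata

# Byte sequences quoted in the MariaDB errors, plus the first
# Japanese character from the Guru Guru Onsen 3 description.
SAMPLES = [
    b"\xce\xb2",          # from "\xCE\xB2 Wor..."
    b"\xef\xac\x81",      # from "\xEF\xAC\x81rst"
    "ぐ".encode("utf-8"),  # first character of ぐるぐる温泉3
]


def describe(raw: bytes) -> str:
    """Decode one UTF-8 encoded character and report its code point and name."""
    ch = raw.decode("utf-8")
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} ({len(raw)} bytes)"


def fits(text: str, codec: str) -> bool:
    """Return True if `text` can be represented in the given Python codec."""
    try:
        text.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


if __name__ == "__main__":
    for raw in SAMPLES:
        ch = raw.decode("utf-8")
        # latin-1 is an assumed stand-in for "a non-Unicode column charset".
        print(describe(raw), "| fits latin-1:", fits(ch, "latin-1"))
```

All three characters are valid UTF-8, so the input data itself is fine; the failure happens when the column's character set cannot represent them.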

gantoine commented 3 months ago

@Casuallynoted This might be related to your filesystem, or maybe your MariaDB version? I just made a fake ZIP for the game and uploaded it, and it fetched the summary correctly.

[Screenshot 2024-06-22 at 10 18 11 AM]

Casuallynoted commented 3 months ago

Interesting, I haven't had this issue on this machine previously so maybe it's mariadb? I'm using this image: ghcr.io/mariadb/mariadb:10.4.34-focal

Casuallynoted commented 1 month ago

Forgot to mention this issue is fixed: the DB was created with the wrong collation.
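
The thread does not say which collation was wrong or how it was corrected. As a rough sketch for readers who hit the same error, the helper below builds the MariaDB statements one might use to inspect and convert an existing database. The `romm` database and `roms` table names come from the error message above. The `utf8mb4` / `utf8mb4_unicode_ci` target and the `conversion_statements` helper are assumptions, not the confirmed fix. Back up the database before running anything like this.

```python
# Query to inspect the current defaults of a schema (standard information_schema view).
INSPECT_SQL = (
    "SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME "
    "FROM information_schema.SCHEMATA WHERE SCHEMA_NAME = 'romm';"
)


def conversion_statements(
    database: str,
    tables: list[str],
    charset: str = "utf8mb4",                # assumed target character set
    collation: str = "utf8mb4_unicode_ci",   # assumed target collation
) -> list[str]:
    """Build ALTER statements that switch a database and its tables to `charset`."""
    statements = [
        f"ALTER DATABASE `{database}` CHARACTER SET {charset} COLLATE {collation};"
    ]
    for table in tables:
        # CONVERT TO also rewrites existing text columns, not just the table default.
        statements.append(
            f"ALTER TABLE `{database}`.`{table}` "
            f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    print(INSPECT_SQL)
    for statement in conversion_statements("romm", ["roms"]):
        print(statement)
```

The printed statements can be pasted into a `mariadb` client session. Only `roms` is listed here because it is the only table named in the error; other tables would need the same treatment.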