sharkusmanch / playnite-pcgamingwiki-metadata-provider

Plugin for Playnite that retrieves game metadata from PCGamingWiki
MIT License

Apostrophe in title occasionally breaks search #40

Closed: chocolatechipcats closed this issue 7 months ago

chocolatechipcats commented 1 year ago

Possibly related: #28

Steps to replicate:

  1. Create a game with title Sid Meier's Civilization II.
  2. Use Download Metadata > PCGamingWiki. Zero results.
  3. Remove Sid Meier's from the search. The first result is... Sid Meier's Civilization II.

I can also replicate this with Sid Meier's Railroads! and Sid Meier's Alpha Centauri.

This does not always occur; Assassin's Creed got results as expected. Perhaps the search has something against Sid Meier.

sharkusmanch commented 1 year ago

I was able to reproduce this with the games mentioned.

Acru commented 1 year ago

Oh dear, I came across this issue as well and am sad to see it's been known for a while with no fix.

Let's see if we can correct that. I just noticed that some of the affected titles use a single quote (’)[U+2019] instead of an apostrophe (')[U+0027] when searching, which causes the search to fail. The others aren't so clear, but replacing the apostrophe with a quotation mark (")[U+0022] usually seems to help the search.

Fail due to single quote:

  - Line 15: Assassin’s Creed Chronicles India
  - Line 16: Assassin’s Creed Chronicles Russia
  - Line 128: "Please, Don’t Touch Anything"
  - Line 168: Shadow Tactics: Blades of the Shogun - Aiko’s Choice
  - Line 170: Shadowrun: Dragonfall - Director’s Cut
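To make the U+2019 versus U+0027 distinction easier to spot in a batch of titles, here is a minimal Python sketch (the plugin itself is C#; this is only an illustration). The helper names `find_curly_apostrophes` and `to_ascii_apostrophes` are hypothetical and do not exist in the plugin.

```python
import unicodedata

# Typographic single quotes mapped to the ASCII apostrophe (U+0027).
# Hypothetical table for illustration; not taken from the plugin.
CURLY_TO_ASCII = {"\u2018": "'", "\u2019": "'"}


def find_curly_apostrophes(title):
    """Return (index, Unicode name) for each typographic single quote in title."""
    return [
        (i, unicodedata.name(ch))
        for i, ch in enumerate(title)
        if ch in CURLY_TO_ASCII
    ]


def to_ascii_apostrophes(title):
    """Replace typographic single quotes with the ASCII apostrophe."""
    return title.translate(str.maketrans(CURLY_TO_ASCII))


title = "Assassin\u2019s Creed Chronicles India"
print(find_curly_apostrophes(title))
print(to_ascii_apostrophes(title))
```

Running a title list through `find_curly_apostrophes` separates the titles that fail because of the curly quote from those that fail for some other reason.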

The rest of the affected titles (results are from a batch process, so some titles may simply not exist in PCGW's database):

  - Line 6: "8Doors: Arum's Afterlife Adventure"
  - Line 7: "A Mortician's Tale"
  - Line 8: "Alan Wake's American Nightmare"
  - Line 9: "Alder's Blood Prologue"
  - Line 30: "Broken Sword: Director's Cut"
  - Line 48: "Discovery Tour by Assassin's Creed: Ancient Egypt"
  - Line 58: "Fiendish Freddy's Big Top o' Fun"
  - Line 84: "Ken Follett's The Pillars of the Earth"
  - Line 88: "King Arthur's Gold"
  - Line 89: "LEGO Builder's Journey"
  - Line 92: "Lone Survivor: The Director's Cut"
  - Line 112: "No Man's Sky"
  - Line 114: "Oddworld: Abe's Oddysee"
  - Line 115: "Ollie & Bollie's Outdoor Estate"
  - Line 123: "Penny Arcade's On the Rain-Slick Precipice of Darkness 3"
  - Line 124: "Penny Arcade's On the Rain-Slick Precipice of Darkness 4"
  - Line 131: "Q.U.B.E.: Director's Cut"
  - Line 135: "Recettear: An Item Shop's Tale"
  - Line 138: "Retired Men's Nude Beach Volleyball"
  - Line 171: "Shadowrun: Dragonfall - Director's Cut"
  - Line 173: "Shantae and the Pirate's Curse"
  - Line 174: "Shantae: Risky's Revenge - Director's Cut"
  - Line 177: "Sid Meier's Civilization III: Complete"
  - Line 178: "Sid Meier's Civilization VI"
  - Line 204: "Super Lucky's Tale"
  - Line 211: "Teenage Mutant Ninja Turtles: Shredder's Revenge"
  - Line 212: "The Beginner's Guide"
  - Line 215: "The Lion's Song"
  - Line 223: "Tiny and Big: Grandpa's Leftovers"
  - Line 224: "Tiny Tina's Assault on Dragon Keep: A Wonderlands One-shot Adventure"
  - Line 226: "Tom Clancy's Ghost Recon"
  - Line 227: "Tom Clancy's Ghost Recon Wildlands"
  - Line 228: "Tom Clancy's Splinter Cell Chaos Theory"
  - Line 241: "Viscera Cleanup Detail: Santa's Rampage"

Acru commented 1 year ago

Some more things I noticed:

Just about any symbol or space can be replaced by any number of symbols (!@#$%^&*;: and others) or spaces and the search will still work, implying that the API treats any such sequence as a single whitespace.

For the apostrophe games that are found successfully, there can be any number of spaces or symbols instead of an apostrophe (or any other space or symbol), and the name is still found.
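As a rough model of that observation, the Python sketch below collapses every run of symbols, underscores, or spaces into one space, so two queries that differ only in their separators compare equal. This is an assumption about the observed behaviour and not PCGW's actual implementation. The function name `collapse_separators` is hypothetical.

```python
import re


def collapse_separators(query):
    """Model of the observed behaviour: any run of non-alphanumeric
    characters (including underscores) counts as a single space."""
    # \W matches anything that is not a letter, digit, or underscore;
    # accented letters such as "ö" count as letters and are preserved.
    return re.sub(r"[\W_]+", " ", query).strip()


a = collapse_separators("Assassin's   Creed")
b = collapse_separators("Assassin!@#$%^&*;:s Creed")
print(a == b)
```

Under this model the apostrophe is just another separator. That fits the titles that are found and does not explain the ones that are not.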

For the apostrophe games that are not found in a search, however, the search will only work if there is a quotation mark somewhere before where the apostrophe appears.

E.g., `Alan"Wake s American Nightmare` is found when searching manually with the plugin, but `Alan Wake s American"Nightmare` is not.

There is something different about the PCGW data for just these unmatched games that is messing up the search. I am guessing that the apostrophe is stored directly instead of as a whitespace marker (an underscore). In any case, it may be simplest to fix this by internally quoting all searches, e.g. "Alan Wake's American Nightmare", which worked in all the cases I tested and found more matches in general for some titles.

("Oddworld: New 'n' Tasty" found three matches whereas unquoted it only found two matches, for example.)

Acru commented 1 year ago

I tried implementing this myself, changing the uses of `client.SearchGames(X)` to `client.SearchGames("\"" + X + "\"")`, and it seems to work. It would probably have issues if the name itself contains quotation marks, though, so internal quotes should be replaced by spaces. Will test further.

Edit: I also noticed that `SearchGames()` already tries to add quotes to the search string, though I'm not familiar with that syntax: `$"\"{NormalizeSearchString(searchName)}\""`. Removing both those and my added quotes also fixes apostrophe searches, though surely the quotes were added for a particular reason in the first place?
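The idea of quoting the whole search while replacing internal quotes with spaces can be sketched as follows. This is a Python illustration only, not the plugin's C# code, and `quote_search_term` is a hypothetical name.

```python
def quote_search_term(name):
    """Wrap a title in double quotes for a phrase search.

    Embedded double quotes are replaced by spaces first so that they
    cannot terminate the phrase early.
    """
    cleaned = name.replace('"', " ")
    # Collapse any doubled or leading/trailing spaces left by the replacement.
    cleaned = " ".join(cleaned.split())
    return f'"{cleaned}"'


print(quote_search_term("Alan Wake's American Nightmare"))
print(quote_search_term('"Please, Don\'t Touch Anything"'))
```

A title that already carries its own quotation marks ends up with exactly one pair around it.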

Acru commented 1 year ago

Removing all the quotes also allowed Sorcery! Parts 1 & 2 to match Sorcery! Parts 1 and 2. Perhaps PCGW changed things on their end so quotes aren't needed anymore?

sharkusmanch commented 1 year ago

Thanks for taking a look! I will try to set aside some time this week to dig more into this.

Yeah, I originally added that quoting to resolve https://github.com/sharkusmanch/playnite-pcgamingwiki-metadata-provider/issues/28

Acru commented 1 year ago

> Yeah, I originally added that quoting to resolve #28

I tried searching The Signal from Tölva in my quotes-removed build and it didn't find a match, though I did notice something rather odd...

Editing the search to The Signal from also finds nothing where it ought to work, but searching for The Signal Tölva or even just Tölva finds the correct match, despite the accented character.

[Btw, I also modified `NormalizeSearchString` to return `search.Replace('-', ' ').Replace('\'', ' ').Replace('’', ' ');` though there are probably better ways to do it. The reason for this is a few cases where the game name uses an apostrophe where PCGW uses a single quote, or vice versa.]
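For readers who want to see the effect of that change, here is a Python rendering of the same three replacements. The plugin code is C#; `normalize_search_string` is only a stand-in that mirrors the expression quoted above.

```python
def normalize_search_string(search):
    """Replace hyphens, ASCII apostrophes (U+0027), and right single
    quotes (U+2019) with spaces, mirroring the three Replace calls."""
    return (
        search.replace("-", " ")
        .replace("'", " ")
        .replace("\u2019", " ")
    )


ascii_title = "Shadowrun: Dragonfall - Director's Cut"
curly_title = "Shadowrun: Dragonfall - Director\u2019s Cut"
# Both spellings normalize to the same search string.
print(normalize_search_string(ascii_title) == normalize_search_string(curly_title))
```

Because both apostrophe variants map to a space, it no longer matters which one the library title or the PCGW title uses.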

Acru commented 1 year ago

More info from when I was manually matching some 70 games with editions:

Death Ray Manta SE, Fallout: New Vegas - Ultimate Edition, and Jotun: Valhalla Edition won't match unless within quotes, even though no non-ASCII characters are involved. Lone Survivor: The Director's Cut won't match with or without quotes because it includes an apostrophe, though it will match from "Lone Survivor: The Director".

Acru commented 1 year ago

Here is an odd one that I have a feeling might be a clue to the issue: searching "Tiny and Big: Grandpa" matches Tiny and Big: Grandpa's Leftovers, but the metadata comes back as Tiny & Big in Grandpa's Leftovers (with the literal codes), whereas searching Tiny & Big: Grandpa's Leftovers returns Tiny & Big in Grandpa's Leftovers and metadata without the codes.

sharkusmanch commented 7 months ago

Fixed in v1.2.4