rishooty / vrec-dat-filter

Filters romset .dats based on V's Recommended Wiki
MIT License

Game_Boy_Advance not scraping titles with color in name. #1

Closed. sahaq closed this issue 6 years ago

sahaq commented 6 years ago

Using Nintendo - Game Boy Advance Parent-Clone (20170813-204027).dat as base, via dat-o-matic P/C XML.

Running the command (Windows 10) "vrec.exe main Game_Boy_Advance "Nintendo - Game Boy Advance Parent-Clone (20170813-204027).dat"" results in an 8 KB file, which contains only the entries that are not 'colored' in the listings on the wiki page. I haven't seen this issue on the other consoles I have checked (NES/GBC/GB/SNES), so it may have something to do with the formatting on that specific page.

rishooty commented 6 years ago

Can absolutely confirm. I actually thought it had more to do with the inherent flaws of fuzzy matching, as the weird systems are usually solved by lowering --accuracy to 50 or 55ish. However, I noticed I had to do most of GBA manually yesterday and was wondering why it was the black sheep.

When you pointed this out I opened up listTemp.csv and, sure enough, it only has those entries. Your guess at the cause is also correct: Chrome's inspector confirmed the colored titles are wrapped in yet another element inside the cell, e.g. <font color="#39b54a">Golden Sun: The Lost Age</font>, as opposed to a plain <th>Guilty Gear X - Advance Edition</th>

Will get to fixing as it made yesterday a real pain in the ass, thanks for pointing this out.
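For readers unfamiliar with the --accuracy remark above, here is a minimal sketch of how a similarity threshold changes which .dat entries survive. It is not vrec's actual matcher, which isn't shown in this thread; `difflib.SequenceMatcher` is swapped in as a stand-in, and `similarity`, `filter_dat`, and the sample names are hypothetical.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Return a 0-100 similarity score (stand-in for the tool's fuzzy matcher)."""
    return 100 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def filter_dat(dat_names, wiki_titles, accuracy):
    """Keep .dat names whose best match against any wiki title meets the threshold."""
    kept = []
    for name in dat_names:
        if any(similarity(name, title) >= accuracy for title in wiki_titles):
            kept.append(name)
    return kept


# Hypothetical sample data: one scraped wiki title, three .dat entry names.
wiki_titles = ["Golden Sun: The Lost Age"]
dat_names = [
    "Golden Sun: The Lost Age",
    "Golden Sun - The Lost Age (USA, Europe)",
    "Zzz",
]

for accuracy in (100, 90, 50):
    print(accuracy, filter_dat(dat_names, wiki_titles, accuracy))
```

Lowering the threshold can only add matches, never remove them, which is why it papers over naming differences but cannot help when a title is missing from the scraped list in the first place.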

rishooty commented 6 years ago

The spider now grabs both formatted and unformatted text. Tested with Game Boy Advance on all environments with success; marking as solved.

2017-08-15 21:29:49 [scrapy.core.scraper] DEBUG: Scraped from <200 http://vsrecommendedgames.wikia.com/wiki/Game_Boy_Advance> {'games': ['Description', 'CT Special Forces 3: Bioterror', 'Demikids: Dark Version', 'Guilty Gear X - Advance Edition', 'Kuru Kuru Kururin series', 'Medabots: Metabee / Rokusho Version', '/ Gundam Seed Destiny', "Napoleon / L'Aigle de Guerre", 'Naruto - Ninja Council', 'Racing Gears Advance', 'Robopon 2: Cross Version', 'Spyro: Attack of the Rhynocs', 'Sword of Mana', 'Advance Guardian Heroes', 'Advance Wars', 'Advance Wars 2: Black Hole Rising', 'Alien Hominid', 'Aladdin', 'Astro Boy: Omega Factor', "Banjo-Kazooie: Grunty's Revenge", 'Banjo-Pilot', 'Boktai: The Sun is in Your Hand', 'Boktai 2: Solar Boy Django', 'Bomberman Max 2: Blue Advance/Red Advance', 'Bomberman Tournament', 'Boulder Dash EX', 'Breath of Fire', 'Breath of Fire II', 'Bubble Bobble: Old & New', 'Car Battler Joe', 'Castlevania: Aria of Sorrow', 'Castlevania: Circle of the Moon', 'Castlevania: Harmony of Dissonance', 'Chu Chu Rocket!', 'Crash Bandicoot: The Huge Adventure', 'Crash Bandicoot 2: N-Tranced', 'CT Special Forces 2: Back to Hell / Back in the Trenches', 'Demikids: Light Version', 'Digimon Battle Spirit 2', 'DK: King of Swing', 'Donkey Kong Country', 'Donkey Kong Country 2', 'Donkey Kong Country 3', 'Double Dragon Advance', 'Dr. Mario & Puzzle League', "Dragon Ball Z: Buu's Fury", 'Dragon Ball Z: Supersonic Warriors', 'Dragon Ball: Advanced Adventure', 'Drill Dozer', 'Egg Mania', 'F-Zero: GP Legend', 'F-Zero: Maximum Velocity', 'Final Fantasy I & II: Dawn of Souls', 'Final Fantasy IV Advance', 'Final Fantasy V Advance', 'Final Fantasy VI Advance', 'Final Fantasy Tactics Advance', 'Final Fight One', 'Fire Emblem', 'Fire Emblem: The Sacred Stones', 'Fire Pro Wrestling series', 'Game & Watch Gallery 4', 'Godzilla: Domination!', 'Golden Sun', 'Golden Sun: The Lost Age', 'Gradius Galaxies', 'Gunstar Super Heroes', 'Hamtaro: Ham-Ham HeartBreak', 'Hamtaro: Ham-Ham Games', 'Harvest Moon: Friends of Mineral Town', 'Harvest Moon: More Friends of Mineral Town', 'Iridion II', "It's Mr. Pants", 'Justice League Heroes: The Flash', 'King of Fighters EX2', 'Kingdom Hearts: Chain of Memories', 'Kirby & The Amazing Mirror', 'Kirby: Nightmare In Dream Land', 'Klonoa: Empire of Dreams', 'Klonoa 2: Dream Champ Tournament', 'Konami Krazy Racers', 'Legend of Zelda, The: A Link to the Past & Four Swords', 'Legend of Zelda, The: The Minish Cap', 'Lunar Legend', 'Mario & Luigi: Superstar Saga', 'Mario Golf: Advance Tour', 'Mario Kart: Super Circuit', 'Mario Tennis: Power Tour', 'Mario vs. Donkey Kong', 'Medabots AX: Metabee / Rokusho Version', 'Mega Man & Bass', 'Mega Man Battle Network', 'Mega Man Battle Network 2', 'Mega Man Battle Network 3 Blue', 'Mega Man Battle Network 4', 'Mega Man Battle Network 5', 'Mega Man Battle Network 6', 'Mega Man Zero', 'Mega Man Zero 2', 'Mega Man Zero 3', 'Mega Man Zero 4', 'Metal Slug Advance', 'Metroid Fusion', 'Metroid: Zero Mission', 'Mobile Suit Gundam Seed: Battle Assault', 'Monster House', 'Monster Rancher Advance 2', 'Mr. Driller 2', 'Ninja Five-O', 'One Piece', 'Payback', 'Phalanx: The Enforce Fighter A-144', 'Pinball of the Dead', 'Pokémon Ruby & Sapphire', 'Pokémon Emerald', 'Pokémon FireRed & LeafGreen', 'Pokémon Mystery Dungeon: Red Rescue Team', 'Pokémon Pinball: Ruby & Sapphire', 'Prince of Persia: The Sands of Time', 'Puyo Pop', 'Puyo Pop Fever', 'Rebelstar: Tactical Command', 'River City Ransom EX', 'Riviera: The Promised Land', 'Robopon 2: Ring Version', 'Sabre Wulf', 'Scurge: Hive', 'Shaman King: Master of Spirits 2', 'Shining Force: Resurrection of the Dark Dragon', 'Shining Soul II', 'Sigma Star Saga', 'Sonic Advance', 'Sonic Advance 2', 'Sonic Advance 3', 'Sonic Battle', 'Sonic Pinball Party', "Spider-Man: Mysterio's Menace", 'Street Fighter Alpha 3 Upper', 'Summon Night: Swordcraft Story', 'Summon Night: Swordcraft Story 2', 'Super Dodge Ball Advance', "Super Ghouls 'N Ghosts", 'Super Mario Advance', 'Super Mario Advance 2: Super Mario World', "Super Mario Advance 3: Yoshi's Island", 'Super Mario Advance 4: Super Mario Bros. 3', 'Super Monkey Ball Jr.', 'Super Puzzle Fighter II', 'Super Robot Taisen: Original Generation', 'Super Robot Taisen: Original Generation 2', 'Tactics Ogre: The Knight of Lodis', 'TMNT', 'The Tower SP', "Tony Hawk's Pro Skater 2", 'Ultimate Muscle: The Path of the Superhero', "Wade Hixton's Counter Punch", 'Wario Land 4', 'WarioWare, Inc.: Mega Microgame$!', 'WarioWare: Twisted', 'Yggdra Union: We’ll Never Fight Alone', 'Zone of the Enders: The Fist of Mars']}
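
The cause and fix discussed in this thread can be illustrated with a small standalone sketch. The spider's real selectors aren't shown here, so this is only an assumption about the general shape of the problem: reading just the text directly inside a <th> misses titles wrapped in a nested <font>, while collecting all descendant text catches both. In XPath terms this is roughly the difference between `//th/text()` and `//th//text()`. The sample markup and function names below are hypothetical, and the standard library's `xml.etree.ElementTree` stands in for Scrapy.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified stand-in for the wiki table markup.
SAMPLE = """
<table>
  <tr><th>Guilty Gear X - Advance Edition</th></tr>
  <tr><th><font color="#39b54a">Golden Sun: The Lost Age</font></th></tr>
</table>
"""


def direct_text_only(root):
    """Collect only text sitting directly inside <th>; nested tags are missed."""
    return [th.text.strip() for th in root.iter("th") if th.text and th.text.strip()]


def all_descendant_text(root):
    """Collect text from <th> and from any nested elements such as <font>."""
    titles = []
    for th in root.iter("th"):
        text = "".join(th.itertext()).strip()
        if text:
            titles.append(text)
    return titles


root = ET.fromstring(SAMPLE)
print(direct_text_only(root))     # only the uncolored title
print(all_descendant_text(root))  # both titles
```

With the first function, only the uncolored title is returned, which matches the symptom reported at the top of the thread; the second returns both.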