andrebrait / 1g1r-romset-generator

A small utility that uses No-Intro DATs to generate 1G1R ROM sets
GNU General Public License v3.0

No P/Clone XML file #10

Open mspykerez opened 4 years ago

mspykerez commented 4 years ago

Hello there, good fella! Love your tool, it's very useful, but I have a problem/question. Many of the game systems listed on datomatic.no-intro.org don't have a proper P/Clone XML file, so your tool doesn't find any 'duplicates'! As an example, I have a No-Intro Commodore - Amiga set that has many 'duplicates', such as Theme Park Mystery (Europe).zip and Theme Park Mystery (USA).zip.

How do I fix this? Perhaps you could add an option to filter by file name/tag without checking any DAT file. The closest thing I could find was https://github.com/NeoVidia/romfilter but I'd prefer something with your sorting algorithm.

My many thanks to you!

andrebrait commented 4 years ago

Well, it's technically possible. But it will still fail for anything that's not strictly the name of the ROM and it will fail if the name is localized, of course.

But yes, it could be done
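For illustration, a minimal sketch of what name-only matching could look like, assuming No-Intro style file names where everything in parentheses or square brackets is a tag. The helpers `base_title` and `group_by_title` are hypothetical and are not part of the tool.

```python
import re
from collections import defaultdict

# Any "(...)" or "[...]" tag plus the whitespace before it. Hypothetical pattern.
_TAGS = re.compile(r"\s*[\(\[][^\)\]]*[\)\]]")
# A short trailing extension such as ".zip"; assumed, not exhaustive.
_EXT = re.compile(r"\.[A-Za-z0-9]{1,4}$")


def base_title(filename):
    """Return the lowercased title with tags and extension removed."""
    without_tags = _TAGS.sub("", filename)
    return _EXT.sub("", without_tags).strip().lower()


def group_by_title(filenames):
    """Group file names whose stripped titles are exactly equal."""
    groups = defaultdict(list)
    for name in filenames:
        groups[base_title(name)].append(name)
    return dict(groups)


names = [
    "Theme Park Mystery (Europe).zip",
    "Theme Park Mystery (USA).zip",
    "Harry Potter and the Philosopher's Stone (Europe).zip",
    "Harry Potter and the Sorcerer's Stone (USA).zip",
]
groups = group_by_title(names)
```

Here the two Theme Park Mystery files land in one group, while the two Harry Potter releases stay apart because their titles differ, which is exactly the localized-name failure mentioned above.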

mspykerez commented 4 years ago

Nice! I will be waiting for it.

HVR88 commented 4 years ago

Simple name matching is pretty easy, but on systems with large libraries game names are often changed. As Andre mentioned, there is localized language use, localized conventions, and then just marketing reasons.

Examples:
Football vs. Soccer
2002 vs. 02
Harry Potter and the Philosopher's Stone vs. Harry Potter and the Sorcerer's Stone (USA only)
etc.

These differences require a managed/curated list of titles, whether in P/Clone XML format, a CSV or other database.

For Amiga collectionistas in particular, I would strongly advise not trying to do this programmatically and instead obtaining a different set, such as one documented by the WHDLoad initiative.

mspykerez commented 4 years ago

It would be nice to support other popular preservation projects as well, such as GoodTools, Redump and TOSEC.

GoodTools has its own naming convention, described here: http://emulation.gametechwiki.com/index.php/GoodTools
Redump and TOSEC have DAT files, and the TOSEC naming convention can be found here: https://www.tosecdev.org/tosec-naming-convention

andrebrait commented 4 years ago

The DAT format is well supported, I think. The naming conventions, not so much.

It's been added to the list of features for 2.0. I'm rewriting the tool quite a bit, so it will be a much simpler release to manage overall.

HVR88 commented 4 years ago

Redump DAT files don't have any parent/clone information and don't include any language info for titles that only have one language, so an English-language title from Japan doesn't get marked as English.

In summary, Redump DAT files are useless for working on files except if you want/need to verify checksums - and if the title loads/works, it doesn't matter what the checksum is.

As far as naming goes, I regularly see No-Intro style names that have been mangled, so it doesn't matter what the set is targeting, there will always be names that don't adhere to standards.

andrebrait commented 4 years ago

Ideally, the DATs would use the fields that they already contain and have the region, language, date, etc. information there and I would not have to parse anything.

The mystery is why this information is present in the names but not properly set in the DATs.

Game (USA, Europe) has regions USA and EUR, and it most certainly has the English language.
Game (Europe) (en+fr, es, de, nl) has region EUR and languages English + French, Spanish, German and Dutch.
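To make the parsing concrete, here is a rough sketch of pulling regions and languages out of such names. The `REGIONS` and `LANGUAGES` tables and the `parse_tags` helper are assumptions made up for this example and are far from complete. The tag spelling follows the examples above, not necessarily what every DAT uses.

```python
import re

# Illustrative, incomplete lookup tables (assumed, not taken from any DAT).
REGIONS = {"USA": "USA", "Europe": "EUR", "Japan": "JPN"}
LANGUAGES = {"en", "fr", "es", "de", "nl", "it", "ja", "pt"}


def parse_tags(name):
    """Return (regions, languages) found in the parenthesized tags of a name."""
    regions, languages = [], []
    for tag in re.findall(r"\(([^)]*)\)", name):
        # Entries inside a tag are separated by "," or "+".
        parts = [p.strip() for p in re.split(r"[,+]", tag) if p.strip()]
        if parts and all(p in REGIONS for p in parts):
            regions.extend(REGIONS[p] for p in parts)
        elif parts and all(p.lower() in LANGUAGES for p in parts):
            languages.extend(p.lower() for p in parts)
        # Any other tag (revision, version, etc.) is ignored in this sketch.
    return regions, languages


first = parse_tags("Game (USA, Europe)")
second = parse_tags("Game (Europe) (en+fr, es, de, nl)")
```

Note that the first name yields no languages at all. Concluding that it is English would need an extra per-region default, which this sketch leaves out.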

Yes, of course, I can parse that, but since the DAT has fields for that data, why is it not there? I never got exactly how they maintain their database.

HVR88 commented 4 years ago

Somewhat related question on filename/title matching... Is there currently any mechanism for detecting a match between a standalone game and the same game as part of a 2-in-1 (or 3 or 4-in-1, etc.)? Examples include all the "n Games in 1" for GBA. So not technically a clone and won't appear tagged with a parent, but the same games nonetheless.

Something to consider if it's not possible now. Then one would be able to hopefully eliminate the n-in-1 ROM if all the standalones are present, or keep it if they're not.
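As a sketch of that keep-or-drop rule, assuming someone maintains a curated mapping from each compilation to the standalone titles it contains (no such list exists in the DATs, as far as this thread goes). The function name and the titles below are made up for illustration.

```python
def redundant_compilations(compilations, available):
    """Return compilations whose every standalone title is already available.

    compilations: dict of compilation title -> list of standalone titles
                  (a hand-curated, hypothetical mapping).
    available:    iterable of standalone titles present in the set.
    """
    present = set(available)
    return sorted(
        title
        for title, parts in compilations.items()
        if parts and all(part in present for part in parts)
    )


# Made-up titles, only to show the behavior.
compilations = {
    "2 Games in 1 - Game A + Game B": ["Game A", "Game B"],
    "2 Games in 1 - Game A + Game C": ["Game A", "Game C"],
}
droppable = redundant_compilations(compilations, ["Game A", "Game B"])
```

Only the first compilation is reported as droppable. The second is kept because Game C is not available on its own.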

mspykerez commented 4 years ago

I do not know about that but please don't mess with games that have multiple images like part1, part2, etc.