Voljega / ExoDOSConverter

a custom game converter from the ExoDOS collection to emulation station based distribution format
153 stars 12 forks source link

Minor rework of mapping logic #79

Closed Newsdee closed 3 years ago

Newsdee commented 3 years ago

Changes

Test code


class MockDosGame:
    def __init__(self, value):
        self.genres = [value]

mdh = MetadataHandler(r'c:\temp\\', 2, False, False)
print("hello")
for key in GAME_MAPPER.keys():
    oldValue = mdh.buildGenre_old(MockDosGame(key))
    newValue = mdh.buildGenre(MockDosGame(key))
    if oldValue != newValue:
        print ("[%s]\n    diff: '%s' vs. '%s'" % (key, oldValue, newValue))

Results


# Fixed: "Gestion" is French

[Strategy]
    diff: 'Strategy-Gestion' vs. 'Strategy-Management'
[Construction and Management Simulation]
    diff: 'Strategy-Gestion' vs. 'Strategy-Management'

# Fixed:  in English, 'Racing' is more often used than 'Race' (like Course in French)

[Driving]
    diff: 'Race' vs. 'Racing'
[Racing / Driving]
    diff: 'Race' vs. 'Racing'
[Racing]
    diff: 'Race' vs. 'Racing'

# Fixed: 'IF' is usually text adventures (not sure if mapping was intentional)

[Interactive Fiction]
    diff: 'Adventure-Visual' vs. 'Adventure-Text'

# Fixed: found an unhandled case
[Managerial]
    diff: 'Unknown' vs. 'Strategy-Management'

Newsdee commented 3 years ago

I compared against Exo 4.0, refined the mapping, and we now have these differences:

+---+----------+-------------------------------------------------------------------+------------------+---------------------+
| 0 | colorado | ['Action', 'Adventure', 'Simulation']                             | Action-Adventure | Simulation          |
| 1 | corporat | ['Action', 'First Person Shooter', 'Puzzle']                      | Puzzle           | Gun-FPS             |
| 2 | thempamy | ['Adventure', 'Arcade']                                           | Adventure-Visual | Action-Adventure    |
| 3 | execsui  | ['Construction and Management Simulation', 'Interactive Fiction'] | Adventure-Visual | Strategy-Management |
| 4 | monteret | ['Action', 'First Person Shooter', 'Platform', 'Puzzle']          | Puzzle           | Gun-FPS             |
| 5 | eradicat | ['Action', 'First Person Shooter', 'Puzzle']                      | Puzzle           | Gun-FPS             |
| 6 | enigma96 | ['Arcade', 'Paddle / Pong']                                       | Misc             | Action-Adventure    |
+---+----------+-------------------------------------------------------------------+------------------+---------------------+

I think these are batter categories, but we can override if you want to revert to the previous one.

Voljega commented 3 years ago

I compared against Exo 4.0, refined the mapping, and we now have these differences:

+---+----------+-------------------------------------------------------------------+------------------+---------------------+ | 0 | colorado | ['Action', 'Adventure', 'Simulation'] | Action-Adventure | Simulation | | 1 | corporat | ['Action', 'First Person Shooter', 'Puzzle'] | Puzzle | Gun-FPS | | 2 | thempamy | ['Adventure', 'Arcade'] | Adventure-Visual | Action-Adventure | | 3 | execsui | ['Construction and Management Simulation', 'Interactive Fiction'] | Adventure-Visual | Strategy-Management | | 4 | monteret | ['Action', 'First Person Shooter', 'Platform', 'Puzzle'] | Puzzle | Gun-FPS | | 5 | eradicat | ['Action', 'First Person Shooter', 'Puzzle'] | Puzzle | Gun-FPS | | 6 | enigma96 | ['Arcade', 'Paddle / Pong'] | Misc | Action-Adventure | +---+----------+-------------------------------------------------------------------+------------------+---------------------+

The first and last one are not good (which can be acceptable though, it's only 2 over 7200) , and I actually fail to understand how the last one can be categorized as Action-Adventure ?

Newsdee commented 3 years ago

Here is the output with the latest changes. I hope the code is simpler, it's getting close than your original implementation...

+---+----------+-------------------------------------------------------------------+------------------+---------------------+
| 0 | corporat | ['Action', 'First Person Shooter', 'Puzzle']                      | Puzzle           | Gun-FPS             |
| 1 | execsui  | ['Construction and Management Simulation', 'Interactive Fiction'] | Adventure-Visual | Strategy-Management |
| 2 | monteret | ['Action', 'First Person Shooter', 'Platform', 'Puzzle']          | Puzzle           | Gun-FPS             |
| 3 | eradicat | ['Action', 'First Person Shooter', 'Puzzle']                      | Puzzle           | Gun-FPS             |
| 4 | enigma96 | ['Arcade', 'Paddle / Pong']                                       | Misc             | Action-Adventure    |
+---+----------+-------------------------------------------------------------------+------------------+---------------------+

About the last one, there are many games that are ['Action', 'Paddle / Pong'] going into Action-Adventure, so I took "Arcade" to mean the same as 'Action'.

Arguably though, maybe we may want to put them all in 'misc'? In any case, I simplified the test case file, so now it's easier to try the difference of making changes and see if there's anthing we'd like to reclassify (e.g. duke nukem, is it a shmup or platformer? in old version it's a shmup)

Newsdee commented 3 years ago

I had the idea to add an explicit multi-genre override (MULTI_GENRE_MAPPER). I think this makes it much easier to filter out a handful of cases, making the overall logic clearer (I hope!)

I still left some overall overiddes in buildGames (e.g. Puzzle) when it impacts many cases (otherwise it needs lots of entries in the mapper).

Now the diffs look like this:

+---+----------+-------------------------------------------------------------------+------------------+---------------------+
| 0 | corporat | ['Action', 'First Person Shooter', 'Puzzle']                      | Puzzle           | Gun-FPS             |
| 1 | universa | ['Action', 'Construction and Management Simulation', 'Shooter']   | ShootEmUp        | Strategy-Management |
| 2 | starsag1 | ['Board / Party Game', 'Role-Playing']                            | Puzzle           | RPG                 |
| 3 | execsui  | ['Construction and Management Simulation', 'Interactive Fiction'] | Adventure-Visual | Strategy-Management |
| 4 | monteret | ['Action', 'First Person Shooter', 'Platform', 'Puzzle']          | Puzzle           | Gun-FPS             |
| 5 | eradicat | ['Action', 'First Person Shooter', 'Puzzle']                      | Puzzle           | Gun-FPS             |
| 6 | enigma96 | ['Arcade', 'Paddle / Pong']                                       | Misc             | Action-Adventure    |
+---+----------+-------------------------------------------------------------------+------------------+---------------------+

We can remove the diffs individually by adding a multi mapper override.

Voljega commented 3 years ago

I had the idea to add an explicit multi-genre override (MULTI_GENRE_MAPPER). I think this makes it much easier to filter out a handful of cases, making the overall logic clearer (I hope!)

I still left some overall overiddes in buildGames (e.g. Puzzle) when it impacts many cases (otherwise it needs lots of entries in the mapper).

Now the diffs look like this:

+---+----------+-------------------------------------------------------------------+------------------+---------------------+ | 0 | corporat | ['Action', 'First Person Shooter', 'Puzzle'] | Puzzle | Gun-FPS | | 1 | universa | ['Action', 'Construction and Management Simulation', 'Shooter'] | ShootEmUp | Strategy-Management | | 2 | starsag1 | ['Board / Party Game', 'Role-Playing'] | Puzzle | RPG | | 3 | execsui | ['Construction and Management Simulation', 'Interactive Fiction'] | Adventure-Visual | Strategy-Management | | 4 | monteret | ['Action', 'First Person Shooter', 'Platform', 'Puzzle'] | Puzzle | Gun-FPS | | 5 | eradicat | ['Action', 'First Person Shooter', 'Puzzle'] | Puzzle | Gun-FPS | | 6 | enigma96 | ['Arcade', 'Paddle / Pong'] | Misc | Action-Adventure | +---+----------+-------------------------------------------------------------------+------------------+---------------------+

We can remove the diffs individually by adding a multi mapper override.

Actually all these diffs seem good except the last one, but that's ok it's not like the original mapped genres were all correct anyway

One last thing maybe we could externalise all the genre mapping mecanism in a GenreMapper class, genremapper.py file ?

If you want to do it, I'm ok with it, otherwise I'll do it later, tell me

Newsdee commented 3 years ago

Good idea, I'll externalize to a new file and fix that one last diff.