opengaming / osgameclones

Open Source Clones of Popular Games
https://osgameclones.com/

Populate genre, subgenre & theme meta for all games #366

Closed nikuda closed 7 years ago

nikuda commented 7 years ago

Volunteers needed.

It mostly involves going to www.giantbomb.com/games, searching for the game and then noting down the genre and theme entries from the info box on the right side.

[Screenshot: info box on a GiantBomb game page]

If GiantBomb doesn't have a record of the game, use the game's Wikipedia description to pick a fitting genre.

See the schema.yaml file for the list of genres, subgenres and themes. The list of genres should stay more or less the same; the list of subgenres is the one to expand if need be.

The meta format looks like this:

- name: Game name
  meta:
    genre: [Role-Playing]
    subgenre: [Roguelike]
    theme: [Sci-Fi, Horror]
  clones: ...
  ...
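
Since hand-entered tags are easy to get subtly wrong, a small check against the allowed values might help. The following is only a sketch: the `ALLOWED` sets and the `check_meta` helper are hypothetical, the real lists live in schema.yaml, and the entry is assumed to be already parsed from YAML into a plain dict.

```python
# Hypothetical allowed values for illustration; the real lists are in schema.yaml.
ALLOWED = {
    "genre": {"Role-Playing", "Strategy", "Sports"},
    "subgenre": {"Roguelike", "Soccer"},
    "theme": {"Sci-Fi", "Horror", "Fantasy"},
}


def check_meta(game):
    """Return a list of human-readable problems found in one game's meta block."""
    name = game.get("name", "<unnamed>")
    problems = []
    for key, values in game.get("meta", {}).items():
        if key not in ALLOWED:
            problems.append(f"{name}: unknown meta key {key!r}")
            continue
        # Every meta item is an array, even when it holds a single value.
        if not isinstance(values, list):
            problems.append(f"{name}: {key} must be a list")
            continue
        for value in values:
            if value not in ALLOWED[key]:
                problems.append(f"{name}: {value!r} is not a known {key}")
    return problems


# The example entry from above, as it would look after parsing.
game = {
    "name": "Game name",
    "meta": {
        "genre": ["Role-Playing"],
        "subgenre": ["Roguelike"],
        "theme": ["Sci-Fi", "Horror"],
    },
}
print(check_meta(game))
```

A misspelled tag or a bare string in place of a list would show up as an entry in the returned list instead of silently ending up in the data.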
piranha commented 7 years ago

Maybe with a script? It could introduce errors that are hard to notice, though...

tukkek commented 7 years ago

I'm up for doing the genre gathering, as I mentioned when opening #353. I guess I would just update games.yaml with the data?

Also, your list of genres and subgenres seems pretty arbitrary. Real-time strategy and strategy make no sense as both top-level categories. There isn't a "turn-based strategy" or "third-person shooter" anywhere. "Beat ''Em Up" and "Shoot ''Em Up" aren't really input-friendly at all. I'd like to know beforehand if you're OK with me doing a complete overhaul of these, based on what I come up with during the gathering, as opposed to the predefined, arbitrary list we have now.

tukkek commented 7 years ago

Also it's not clear to me if a game can have more than 1 genre and/or more than 1 sub-genre.

EDIT: I see now the 3 meta items seem to be arrays. Would I be correct in assuming all of these could receive multiple values?

tukkek commented 7 years ago

Finally, just to make clear: I'm volunteering to gather genre (and subgenre when needed) for every single one of the games currently on the list. It's a lot of work so let me know if there isn't anybody working on it already as far as you know because it will certainly take a lot of browsing and copy-pasting YAML lines :P Really wouldn't want to see the time put into it get wasted.

cxong commented 7 years ago

I think it's better to treat all genres equally, i.e. no subgenre hierarchy. Taxonomy is a hard problem and evolves all the time for games, so rather than try to solve that, we can just tag with whatever genres are most relevant.

For example, Mario used to be described under the very broad genre of "action game" but today we'd probably just say it's a platformer. Also, technically roguelikes are a subgenre of RPG but I think most people don't think of roguelikes when they say "RPG".

nikuda commented 7 years ago

@cxong I agree, though subgenres as I've set them up here are mostly used as a visibility thing.

For example, it's really useful with the Sports genre. We don't want all the Sports subgenres in the genre cloud, but we still want to keep that information:

[
    "Baseball",
    "Basketball",
    "Billiards",
    "Bowling",
    "Boxing",
    "Cricket",
    "Fishing",
    "Fitness",
    "Football",
    "Golf",
    "Hockey",
    "Skateboarding",
    "Snowboarding/Skiing",
    "Soccer",
    "Surfing",
    "Tennis",
    "Track & Field",
    "Wrestling"
]
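
One way to picture this split is a cloud that counts only the `genre` field while the `subgenre` field stays available for filtering. This is a rough sketch, not how the site is actually built: `genre_cloud` and `filter_by_subgenre` are hypothetical helpers, and the three sample games are made up.

```python
from collections import Counter

# Made-up sample entries, shaped like parsed games.yaml records.
games = [
    {"name": "A", "meta": {"genre": ["Sports"], "subgenre": ["Soccer"]}},
    {"name": "B", "meta": {"genre": ["Sports"], "subgenre": ["Golf"]}},
    {"name": "C", "meta": {"genre": ["Role-Playing"], "subgenre": ["Roguelike"]}},
]


def genre_cloud(games):
    """Count top-level genres only, so subgenres never appear in the cloud."""
    cloud = Counter()
    for game in games:
        cloud.update(game.get("meta", {}).get("genre", []))
    return cloud


def filter_by_subgenre(games, subgenre):
    """Subgenre data is still there for narrowing down within a genre."""
    return [
        game["name"]
        for game in games
        if subgenre in game.get("meta", {}).get("subgenre", [])
    ]


print(genre_cloud(games))
print(filter_by_subgenre(games, "Golf"))
```

Here the cloud would show a single "Sports" entry covering both games, while "Golf" and "Soccer" remain searchable.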
nikuda commented 7 years ago

> It's a lot of work so let me know if there isn't anybody working on it already as far as you know because it will certainly take a lot of browsing and copy-pasting YAML lines

Just open up a PR when you've done a certain amount. Don't wait months to do the whole thing.

> Would I be correct in assuming all of these could receive multiple values?

Yes

piranha commented 7 years ago

@tukkek maybe it makes sense to wait a bit for me to split up games.yaml into many files :-)

tukkek commented 7 years ago

@piranha good thing I was stalling then! I have been trying to set aside a good chunk of time to do it in one sitting, but it's been hard this week. Once I manage to, it should be done in a single day (unless it takes much more time than I am expecting, which is already a good bunch of hours). Can you tell me which issue I should follow to know when it's OK to go?

About genres/subgenres: I like the idea of having both, but it's not only more work to get right, it will also pollute the tag cloud. A huge "action" genre with hundreds of games, encompassing everything from platformers to beat-em-ups, shoot-em-ups, FPS, TPS, roguelites, action RPGs and so on, is not really valuable to anyone. I think having genres as single "tags" is better, and that would still allow "parent categories" in the future. Anyway, you haven't mentioned whether you're all OK with me redefining the categories as I go along. Let me know, please.

piranha commented 7 years ago

It should be ok right now! Also, there is no need to do it in one sitting - I'd imagine in this case it'll be a somewhat tedious task. :)

As for redefinition - I'm ok with that, since anything is good if it's done with good intentions and thoughtfulness. :)

tukkek commented 7 years ago

Thanks! I want to find a good chunk of time exactly because it's tedious - get it over with in one go, you know?

I just wanted to make sure about the redefinition because, with such a large amount of data, I'd hate for my effort to go to waste. As long as no one else here has anything against it, that's what I'll do, to the best of my ability. As I said before, I think it makes more sense to define a taxonomy based on real data from the games than to come up with something beforehand and make the games fit that arbitrary categorization.

As @cxong suggests I'm thinking about doing it with categories only (no subcategory) but I'll check here again before starting the work (probably next week) to see if any further discussion on the subject comes up.

tukkek commented 7 years ago

Finally got around to doing this. Now every game has one or more genres associated with it. I didn't have to actually download any clone/remake because most of that information could be found by a Google search, a look at the clone pages, or short YouTube videos of the game in question. I'm pretty confident in the current state of things, but a few errors could have slipped in since I'm not familiar with most of these games and just copy-pasted what I found from other sources.

I do have to say that I didn't put as much effort into completeness as I could have: for example, I didn't differentiate strategy and tactical games (tagging them either TBS or RTS). I didn't add "action" to all shooter/platform/arcade games either, and I didn't use "flight" as a genre (looking back I think I should have, rather than marking many flight games as simulation). In the same way, I used "simulation" instead of "sandbox" for a few titles. Anyway, I think the current result is accurate, and these cases can be improved on a game-by-game basis moving ahead where and if necessary.

The reason for deliberately opting for vagueness over factual precision is that I didn't think the list would benefit a lot from having 1 MOBA game and 1 educational game, for example. The same goes for dividing strategy and tactics into different subgroups, etc. If anyone thinks it would be better to have more tags, even if the content becomes less organized, then be my guest and change whatever you like - but I think that having tags with 10 or fewer games isn't really helpful at the moment.

In that sense you can see I also made some arbitrary decisions regarding some genres (like roguelike, horror and rhythm) because I thought they stand out enough on their own even without many examples in the list. For example: merging survival horror games, even just 2 of them, into the "action" genre would be a disservice to the list.

You'll also notice that I didn't see the need to use the subgenre tag either. It seemed unnecessary to me while doing this work and also hard to input, if I had to distinguish between "strategy" and TBS/RTS on a case-by-case basis. It's a lot of extra work without any real benefit: strategy fans can check TBS first and then RTS. I also didn't use the "theme" tags, but I left what was already there unchanged.

My main goal was to have every game tagged with only one genre, but that clearly isn't possible with so many genre-defying games out there. Here's a rundown of the categories so far (note that the % is relative to the total number of games, not of genre tags, since there can be multiple tags per game):

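| Genre | Count | Percent |
|---|---|---|
| arcade | 100 | 21% |
| RTS | 51 | 11% |
| puzzle | 51 | 11% |
| simulation | 49 | 10% |
| platform | 48 | 10% |
| RPG | 43 | 9% |
| action | 43 | 9% |
| FPS | 38 | 8% |
| adventure | 35 | 7% |
| TBS | 32 | 7% |
| shmup | 25 | 5% |
| racing | 21 | 4% |
| TPS | 8 | 2% |
| MMORPG | 6 | 1% |
| sports | 6 | 1% |
| fighting | 5 | 1% |
| roguelike | 5 | 1% |
| rhythm | 4 | 1% |
| horror | 2 | 0.42% |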
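
For anyone wanting to recompute a rundown like this, here is a minimal sketch of the counting, with the percentage taken against the number of games rather than the number of tags. The `genre_stats` helper and the four sample games are made up for illustration, and the entries are assumed to be already parsed into dicts; this is not the script used to produce the figures above.

```python
from collections import Counter

# Made-up sample entries; two of them carry more than one genre tag.
games = [
    {"name": "A", "meta": {"genre": ["arcade"]}},
    {"name": "B", "meta": {"genre": ["arcade", "puzzle"]}},
    {"name": "C", "meta": {"genre": ["RTS"]}},
    {"name": "D", "meta": {"genre": ["puzzle", "arcade"]}},
]


def genre_stats(games):
    """Return (genre, count, percent of games) rows, most common first."""
    total = len(games)
    if total == 0:
        return []
    counts = Counter()
    for game in games:
        # set() so a tag repeated on one game is only counted once
        counts.update(set(game.get("meta", {}).get("genre", [])))
    # Sort by descending count, then by name so ties have a stable order.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(genre, n, round(100 * n / total)) for genre, n in ordered]


for genre, count, percent in genre_stats(games):
    print(f"{genre} {count} {percent}%")
```

Because a game can carry several tags, the percentages in this scheme can add up to more than 100%.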

I'll be creating a pull request soon after writing this.