veekun / pokedex

more than you ever wanted to know about Pokémon
MIT License

Schema Redesign Proposal #140

Closed dominicbarnes closed 9 years ago

dominicbarnes commented 9 years ago

In response to the "Proposals" section of this wiki page, I just wanted to reach out about something I was working on.

I was actually sitting down to try to transform the data from the current relational model into a document model (using CouchDB in particular). My thought is that the semi-structured nature of a document data store is better suited to a Pokédex, and would hopefully make interacting with all the underlying data much simpler.

I've been wanting to build a solid JSON API for pokedex data like this for a long time, and have put a lot of thought into how to do it with CouchDB specifically, so I thought I would start up a discussion here.

eevee commented 9 years ago

I expect you'll have exactly the same problems. Pokémon data is pretty well-structured — the big issues are with avoiding duplication. A document model doesn't do much to help you there.

I wrote some concrete issues down here: https://gist.github.com/eevee/6a257a9d42400e2d03f9

How would you solve these?

dominicbarnes commented 9 years ago

If switching the entire DBMS is not really what you had in mind, I totally understand that. Currently, I'm going to just read from a SQLite database (built from this script) and pipe it into a CouchDB database using node.js scripts, so the tooling I have in mind is completely different from what you have now.

With that in mind, I could really use some help wrapping my mind around how this database should be queried in order to accomplish my goals. Do you have any other docs on the schema, or perhaps some "views" for these tables that illustrate the joins needed for various queries?
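
As a rough illustration of the SQLite-to-documents step described above, here is a minimal sketch in Python (rather than the node.js scripts mentioned). The tables, columns, and `_id` scheme are a toy schema invented for the example, not the real veekun schema, and `build_documents` is a hypothetical helper. It joins the rows, groups them into one document per Pokémon, and wraps the result in the `{"docs": [...]}` body that CouchDB's `_bulk_docs` endpoint accepts; no HTTP request is made here.

```python
import json
import sqlite3

# Toy schema for illustration only; the real veekun tables and columns differ.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pokemon (id INTEGER PRIMARY KEY, identifier TEXT);
    CREATE TABLE types (id INTEGER PRIMARY KEY, identifier TEXT);
    CREATE TABLE pokemon_types (pokemon_id INTEGER, type_id INTEGER, slot INTEGER);
    INSERT INTO pokemon VALUES (1, 'bulbasaur'), (25, 'pikachu');
    INSERT INTO types VALUES (4, 'poison'), (12, 'grass'), (13, 'electric');
    INSERT INTO pokemon_types VALUES (1, 12, 1), (1, 4, 2), (25, 13, 1);
""")


def build_documents(conn):
    """Join the relational rows and fold them into one document per Pokémon."""
    rows = conn.execute("""
        SELECT p.id, p.identifier, t.identifier
        FROM pokemon AS p
        JOIN pokemon_types AS pt ON pt.pokemon_id = p.id
        JOIN types AS t ON t.id = pt.type_id
        ORDER BY p.id, pt.slot
    """)
    docs = {}
    for pokemon_id, name, type_name in rows:
        # The "_id" format is an assumption; any unique string would do.
        doc = docs.setdefault(
            pokemon_id, {"_id": f"pokemon:{name}", "name": name, "types": []}
        )
        doc["types"].append(type_name)
    return list(docs.values())


docs = build_documents(conn)
# Request body in the shape expected by POST /{db}/_bulk_docs.
payload = json.dumps({"docs": docs}, indent=2)
print(payload)
```

Each join that a relational query would need at read time is done once here, at export time, which is the main appeal of the document approach for a read-heavy API.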

dominicbarnes commented 9 years ago

@eevee I'll read through that gist in great detail before I get too detailed in my explanations.

Off the top of my head, I was thinking of keeping the data in silos for each generation and spin-off. Where there are variations within a generation (e.g. Red vs. Blue vs. Yellow), the documents within those collections would break the differences down inline.

Generally speaking, document data stores make the trade-off of allowing duplication in order to keep the documents simpler and largely self-contained. I think that trade-off could pay off here, with my goal being a simpler query model.
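
To make the layout concrete, here is a small Python sketch of what one per-generation document with inline version differences might look like. The field names, the `versions` key, the placeholder strings, and the `resolve` helper are all assumptions made up for illustration; nothing here comes from an actual proof-of-concept.

```python
from copy import deepcopy

# Hypothetical document for one generation; values are placeholders.
GEN1_PIKACHU = {
    "_id": "gen1:pikachu",
    "generation": 1,
    "name": "pikachu",
    "types": ["electric"],
    "flavor_text": "<shared placeholder text>",
    # Only fields that differ from the shared values are listed per version.
    "versions": {
        "yellow": {"flavor_text": "<yellow-only placeholder text>"},
    },
}


def resolve(doc, version):
    """Return a flat view of `doc` for one version, applying inline overrides."""
    view = deepcopy({k: v for k, v in doc.items() if k != "versions"})
    view.update(deepcopy(doc.get("versions", {}).get(version, {})))
    view["version"] = version
    return view


red = resolve(GEN1_PIKACHU, "red")
yellow = resolve(GEN1_PIKACHU, "yellow")
print(red["flavor_text"])
print(yellow["flavor_text"])
```

The duplication cost shows up across silos: a `gen2:pikachu` document would repeat `name` and `types` even though nothing changed, and that repetition is what buys single-document reads.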

Since I was already working on this, I can try and whip up a proof-of-concept.

quadrupleslap commented 9 years ago

Just saying, a document-based database with an exposed JSON API already exists, over at PokéAPI. You might want to check that out, although a lot of stuff is out of date, and it has these hideous resource_uri properties everywhere.

dominicbarnes commented 9 years ago

Yeah, that is actually where I started. FWIW, @phalt is working on v2 of PokéAPI, and mentioned using the pokedex here as the basis for it, which is what directed me here.

I think I may venture off on my own for the time-being to flesh out this idea of mine. Carry on :)