veekun / pokedex

more than you ever wanted to know about Pokémon
MIT License
1.44k stars 637 forks

Rip Sun/Moon Data #198

Closed craig-clayton closed 7 years ago

craig-clayton commented 7 years ago

Is there any plan to integrate Sun and Moon data into the database anytime soon? Is there an appropriate way for people to contribute to such an effort?

EverOddish commented 7 years ago

I haven't contributed to this project before, but I'm willing to help out. I'll start things off with some info and questions.

It looks like this pk3DS tool has seen some active development lately for Gen 7. https://github.com/kwsch/pk3DS

Was this tool used in the past? Is it worth collaborating or talking to the developer?

What is considered a legitimate source for data? I haven't had much luck in getting homebrew running on my 3DS for dumping an image of Sun/Moon and decrypting it. There seem to be a handful of ROMs available for download around the internet, but I doubt this is legally legitimate and I question the safety (from viruses, etc.) of doing so.

What is the status of the 'scripts' directory in this repository? I read the description of them being one-shot scripts, but would they still be useful as examples or starting points?

I'm glad to help out in any way that I can!

eevee commented 7 years ago

I would love it if more people could help, but this is low-level finicky/tedious work and I have no idea how to convey all the necessary context. :) Maybe hop in IRC if you really want to give it a shot: irc://irc.veekun.com/veekun

The only legitimate source is the game itself, yes.

Part of our problem is that several people did all the datamining for previous games with a ton of messy ad-hoc scripts, most of which are now lost, many of which (such as the stuff in scripts) don't make much sense in isolation. Having everything blended together in one database doesn't help much, either; even once you have the data, cramming it into SQL alongside fifteen other games' data can be a pain.

I'm currently writing a repeatable set of extractors — that work is on the yaml branch, in pokedex.extract.*. The idea is to eventually rerip everything to independent YAML files, but for now I'm just trying to get Sun and Moon into YAML and then separately convert that to SQL.

I don't think pk3DS does anything we don't know about already, but I'll give it a look.

route1rodent commented 7 years ago

@eevee While there isn't a more reliable way to get the data ripped directly from the games, there may be no other choice than to add the data manually... (and we are already doing that anyway, via fixes in regular commits).

The problem with YAML is that it won't perform well on large datasets, because you cannot read it line by line to get the full "item/row" of information.
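To illustrate the row-at-a-time reading referred to here, a minimal sketch using Python's standard `csv` module. The sample data and the `iter_rows` helper are made up for illustration and are not taken from the repository.

```python
import csv
import io


def iter_rows(handle):
    """Yield one dict per CSV record, reading lazily from the file handle."""
    yield from csv.DictReader(handle)


# Hypothetical sample standing in for one of the CSV files.
sample = io.StringIO("id,identifier\n1,bulbasaur\n2,ivysaur\n")

# Only the header and the first record are consumed at this point.
first = next(iter_rows(sample))
print(first["identifier"])
```

Each record is available as soon as it has been read, so the whole file never has to be held in memory.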

To ease the process of going through all the CSV files, I've created a backend/admin project that imports the CSV files into SQLite and exports them back to CSV once you've finished editing with the admin UI.
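The general shape of such a CSV to SQLite to CSV round trip can be sketched with the Python standard library. This is not the code of the admin project; the `import_csv` and `export_csv` helpers, the `items` table, and the sample rows are all assumptions made for the example.

```python
import csv
import io
import sqlite3


def import_csv(conn, table, handle):
    """Create a table from the CSV header and load every record into it."""
    reader = csv.reader(handle)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)


def export_csv(conn, table, handle):
    """Write the table back out as CSV, header first, in insertion order."""
    cursor = conn.execute(f'SELECT * FROM "{table}" ORDER BY rowid')
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


conn = sqlite3.connect(":memory:")
# Hypothetical sample rows, not real data from the repository.
source = "id,identifier\n1,item-a\n2,item-b\n"
import_csv(conn, "items", io.StringIO(source))

# Stand-in for an edit made through an admin UI. Values are stored as text
# because the columns are untyped, so the id is compared as a string.
conn.execute('UPDATE "items" SET identifier = ? WHERE id = ?', ("item-c", "2"))

out = io.StringIO()
export_csv(conn, "items", out)
result = out.getvalue()
print(result)
```

Columns are created without types here, so every value round-trips as the text it had in the CSV.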

Initially I meant to use it as a tool to migrate from your data structure to a brand new and more efficient one (where you could do things like sort Pokémon by stat more easily), but I ditched the idea since it was consuming a lot of my time (it needs to be thoroughly rethought).
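For context on the sorting remark: when each stat is stored as its own row, sorting by one stat needs a join and a filter. A rough sketch follows. The two-table layout is a simplified assumption rather than the project's actual schema, the stat ids (1 for HP, 2 for Attack) are assumed, and `sorted_by_stat` is a hypothetical helper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Simplified, assumed layout: one row per (pokemon, stat) pair.
conn.executescript("""
CREATE TABLE pokemon (id INTEGER PRIMARY KEY, identifier TEXT);
CREATE TABLE pokemon_stats (pokemon_id INTEGER, stat_id INTEGER, base_stat INTEGER);
""")
conn.executemany(
    "INSERT INTO pokemon VALUES (?, ?)",
    [(1, "bulbasaur"), (2, "ivysaur"), (3, "venusaur")],
)
# Assumed stat ids: 1 = HP, 2 = Attack.
conn.executemany(
    "INSERT INTO pokemon_stats VALUES (?, ?, ?)",
    [(1, 1, 45), (1, 2, 49), (2, 1, 60), (2, 2, 62), (3, 1, 80), (3, 2, 82)],
)


def sorted_by_stat(conn, stat_id):
    """Return (identifier, base_stat) pairs, highest value of that stat first."""
    rows = conn.execute(
        """
        SELECT p.identifier, s.base_stat
        FROM pokemon AS p
        JOIN pokemon_stats AS s ON s.pokemon_id = p.id
        WHERE s.stat_id = ?
        ORDER BY s.base_stat DESC, p.id
        """,
        (stat_id,),
    )
    return rows.fetchall()


ranking = sorted_by_stat(conn, 2)
print(ranking)
```

A wider table with one column per stat would reduce this to a plain `ORDER BY`, which is the kind of restructuring the comment has in mind.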

Here it is, in case you find it interesting: https://github.com/metaunicorn/pokedex-admin. All the instructions are in the README. If you need more help, just tweet me: @metaunicorn.

One handicap/feature is that the exporter will add IDs to all tables and will quote all the strings/sentences that have spaces.

Observations / modifications needed for S/M in the database / CSVs:

eevee commented 7 years ago

Good news! As of the most recent commit, we now have Sun/Moon items, abilities, moves, and Pokémon. Minus some exceptions, which I'm working on.

eevee commented 7 years ago

There are some obvious omissions (encounters...), but that's always the case, and this ticket wasn't specific... so since Sun and Moon are now actually on the goddamn website, I'm calling this done.