veekun / pokedex

more than you ever wanted to know about Pokémon
MIT License

Generation 2 encounters and encounter conditions? #287

Closed Hugo-Matias closed 4 years ago

Hugo-Matias commented 4 years ago

I noticed that, except for a few gift Pokémon, Gen 2 encounters are missing. I searched online and found this source for encounter slot tables. I can't say for sure that the information was ripped straight from the ROM, but the data looks right to me, and it can always be corrected if found wrong.

I'm scraping the website for the GSC versions that are missing. It is in French, but it uses Pokémon numbers (species_id) as the names of the PNG sprites, which makes it very easy to parse pokemon_id and levels with XPath selectors. I've already forked the repo and am in the process of updating the missing values for a pull request, but I have a question about the schema of the db tables.

In Gen II, encounter conditions matter for the time of day (and for a few swarm encounters as well). In some places there is no distinction between morning, day and night, but in most cases the time matters. Since there are 3 encounter condition values for time of day, day being the default one, how should I deal with the encounter_condition_value_map? Take this situation as an example (where 'xxxxxx' represents the corresponding encounter_id):

Pokemon A - appears on - Location A - at - Night
encounter_condition_value_map:
xxxxxx,5

Pokemon B - appears on - Location B - at - Morning and Day
encounter_condition_value_map:
xxxxxx,3
xxxxxx,4
or just?
xxxxxx,3

Now what about the Pokémon that appear regardless of time of day?

Pokemon C - appears on - Location C - at - Morning, Day and Night (i.e. regular encounter)
encounter_condition_value_map:
xxxxxx,3
xxxxxx,4
xxxxxx,5
or nothing at all and just create a single 'encounters' record?

Should I create a record just when 1 or 2 conditions are applied? In my opinion this is the best approach because there is no need to create duplicate records on 'encounters' just to fill the 'encounter_condition_value_map' table but it really depends on how you use the database.

Looking at the website structure, I think that the rows that are colored grey should be represented as an 'encounter_condition_value_map' record and the blueish ones are just regular encounters that don't require conditions records.

Also, I'm not sure about the Mt. Silver 'location_areas' ids. Taking Bulbapedia as reference, I'm not 100% sure which id belongs to the Chambers. Some of the ids are for the Gen IV version of the place, but I'm having trouble deciphering the ids, and the prose table isn't much help either. This is my interpretation of the ids (the Location ID of Mt. Silver is 82):

Bulbapedia definition - Location Area ID - Identifier
Exterior - 269 - outside
1F - 270 - 1f (there is also a '1f-top' identifier, but I think it refers to the Upper Mountainside of the Gen IV version, or is it the one referring to the Chambers?)
2F - 263 - 2f
Summit - 273 - top
Chambers - ? - ?

magical commented 4 years ago

> I noticed that, except for a few gift Pokémon, Gen 2 encounters are missing. I searched online and found this source for encounter slot tables. I can't say for sure that the information was ripped straight from the ROM, but the data looks right to me, and it can always be corrected if found wrong.

I would recommend getting the data from pokecrystal. (I was going to recommend pokegold too, but apparently it's unfinished.)

> how should I deal with the encounter_condition_value_map?

It's kind of clunky, but iirc this is how it works: if all the values for a condition would lead to the same encounter, you can omit that condition. If any value would affect the encounter for a slot, then all its values must be represented. So basically, you need to include all three times of day or none of them. Yes, this can require duplicate rows in the encounters table.
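The all-or-none rule above could be checked mechanically. The following is only a rough sketch, not part of the pokedex codebase: the function name and the tuple layout are made up for illustration, it looks at the time-of-day condition only (ids 3, 4, 5 as used in this thread), and it ignores version_id for brevity.

```python
from collections import defaultdict

# Time-of-day value ids as used in this thread: 3 = morning, 4 = day, 5 = night.
TIME_VALUES = {3, 4, 5}

def missing_time_values(encounters, condition_map):
    """Find slots that use some, but not all, time-of-day values.

    encounters:    list of (encounter_id, location_area_id, encounter_slot_id)
    condition_map: list of (encounter_id, encounter_condition_value_id)
    Returns {(location_area_id, encounter_slot_id): set of missing value ids}.
    """
    values_by_encounter = defaultdict(set)
    for encounter_id, value_id in condition_map:
        values_by_encounter[encounter_id].add(value_id)

    # Collect the time values used by every encounter sharing an area and slot.
    used_by_slot = defaultdict(set)
    for encounter_id, area_id, slot_id in encounters:
        used_by_slot[(area_id, slot_id)] |= values_by_encounter[encounter_id] & TIME_VALUES

    # A slot is fine if it uses no time value at all, or all three of them.
    return {key: TIME_VALUES - used
            for key, used in used_by_slot.items()
            if used and used != TIME_VALUES}

# Hypothetical rows: a day Pidgey and a night Hoothoot, but no morning entry.
encounters = [(50545, 295, 515), (50546, 295, 515)]
condition_map = [(50545, 4), (50546, 5)]
print(missing_time_values(encounters, condition_map))  # {(295, 515): {3}}
```

A slot with no time conditions at all yields an empty result, which matches the "omit the condition entirely" half of the rule.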

I'd like to redesign the encounter schema so stuff like this is less awkward but that's a different issue.

magical commented 4 years ago

Related issue: https://github.com/veekun/pokedex/issues/201

Hugo-Matias commented 4 years ago

Ok, I think I'll use the French site tables for the initial parsing and then compare with the disassembled data for incorrect values. The tables are easier to read and to scrape, so they can serve at the very least as a placeholder for future development.

About the encounter conditions, I would like to discuss them a bit more if you don't mind. It's important that I write the information the right way and as clearly as possible to prevent future headaches and misunderstandings.

As a practical example, please refer to the encounters on slot 1 of Route 1 and 2 of this table

The slot 1, with 30% rarity, of route 1 has a Pidgey at lvl 2, there is no condition indicated. These are the csv fields I will create. I will use line breaks when justified just as a means for better readability.

encounter_slots: (I already populated this table with the methods available on the site. I can post it in a pastebin just as a preliminary check to make sure everything is OK without the need for a pull request.)
515,3,1,1,30 - id, version_group_id, encounter_method_id, slot, rarity

encounters:
50545,4,295,515,16,2,2 - id, version_id, location_area_id, encounter_slot_id (created previously), pokemon_id, min_level, max_level

For this type of encounter this is enough, right? Now, for the Hoothoot that appears only at night in the same place and slot, I would create:

encounters:
50546,4,295,515,163,2,2

encounter_condition_value_map:
50546,5 - encounter_id, encounter_condition_value_id (3-morning, 4-day, 5-night)

This is how I read the table relations, but from what I understand from your post, I should do something like this, right?

encounter_slots:
515,3,1,1,30 - (already created from the previous Pidgey)
516,3,1,1,0 - (for the times where Pokémon don't appear, in this case Morning and Day)
or
516,3,1,1, - (i.e. null instead of 0)

encounters:
50546,4,295,515,163,2,2 - (pokemon appears)
50547,4,295,516,163,2,2 - (pokemon doesn't appear, 0 rarity encounter slot)
50548,4,295,516,163,2,2 - (pokemon doesn't appear, 0 rarity encounter slot)

encounter_condition_value_map:
50546,5 - (pokemon appears, night)
50547,3 - (pokemon doesn't appear, morning)
50548,4 - (pokemon doesn't appear, day)

or perhaps...

encounters:
50546,4,295,515,163,2,2
50547,4,295,516,163,2,2 - (0 rarity encounter slot, but no duplicate)

encounter_condition_value_map:
50546,5
50547,3
50547,4 - (the single encounter with 2 condition values in a one-to-many relationship)

Is this correct? Looking at the global scale of the game, isn't this too much redundant data in the end? Couldn't we just assume that if there is no record in encounter_condition_value_map, the Pokémon doesn't appear? I'm not a database expert, so I might be seeing things a bit too simplistically, but I mean: when querying the database for SELECT encounters.rarity with a WHERE clause of encounter_condition_value_map.encounter_condition_value_id = 5, isn't a return of 0 rows more or less the same as returning a bunch of rows with a 0 or null value? I guess there is always a way to programmatically interpret a missing record as a Pokémon that doesn't appear, but the important thing here is to fit the data to the database's needs and not the application's needs.

Now, what about Route 2, also slot 1? It has a Caterpie lvl 3 with no condition, which will be treated like the Pidgey of Route 1. However, this place has 2 distinct conditions for the first slot: a Caterpie lvl 3 in the morning and a Hoothoot lvl 3 at night.

Let's work on the Caterpie, keeping in mind the previous entries already created.

encounters:
50548,4,296,515,10,3,3
50549,4,296,516,10,3,3

encounter_condition_value_map:
50548,3
50549,4
50549,5

This will say that there is a Caterpie in the morning but not at night or day. But there is a Hoothoot at night. Should they be represented for all 3 condition values as well? Meaning, the focus should be on the Pokémon and not on the time at the particular place, for instance:

encounters:
50548,4,296,515,10,3,3 - Caterpie
50549,4,296,515,163,3,3 - Hoothoot

encounter_condition_value_map:
50548,3 - Caterpie in the "Morning slot" for the Route 2 place
50549,5 - Hoothoot in the "Night slot" for the Route 2 place

Instead of...

encounters:
50548,4,296,515,10,3,3 - Caterpie
50549,4,296,516,10,3,3 - Caterpie (0 rarity)
50550,4,296,515,163,3,3 - Hoothoot
50551,4,296,516,163,3,3 - Hoothoot (0 rarity)

encounter_condition_value_map:
50548,3
50549,4
50549,5
50550,5
50551,3
50551,4

Hopefully all this makes sense and we can discuss the topic a bit further. Sorry about the long post, but I really need to know for sure what you guys need before I commit to the project.

magical commented 4 years ago

Er, no. You shouldn't ever have an encounter slot with a rarity of 0.

Encounter slots are like the columns of the table on the page you linked to. There are only a few of them, and they are filled with different pokemon in different locations under different conditions. Encounters are the cells of the table; they define which pokemon goes in which column. Conditions are... a third dimension of the table. Different pokemon might go in the slot at different times, or whatever, and the conditions tell you which pokemon appears when.

> As a practical example, please refer to the encounters on slot 1 of Route 1 and 2 of this table
> The slot 1, with 30% rarity, of route 1 has a Pidgey at lvl 2, there is no condition indicated.

There is no condition listed in the table, but that's because the author of that page combined Route 1's morning and day encounters to make the table more compact.

Route 1, slot 1 should have three entries in the encounters table:

Now take a look at Route 1, slot 2. You get Rattata no matter what time it is, so it is safe to collapse it into a single encounter.
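For Route 1, slot 1, the three entries could look roughly like the sketch below. This is a reconstruction from the rest of the thread (Pidgey in the morning and during the day, Hoothoot at night), not the original listing: the encounter ids are placeholders, the other column values are reused from the rows quoted earlier, and `pokemon_at` is a hypothetical helper.

```python
# Condition value ids as given earlier in the thread: 3 = morning, 4 = day, 5 = night.
MORNING, DAY, NIGHT = 3, 4, 5
PIDGEY, HOOTHOOT = 16, 163

# id, version_id, location_area_id, encounter_slot_id, pokemon_id, min_level, max_level
# The encounter ids (50545-50547) are placeholders, not values from the real CSV.
encounters = [
    (50545, 4, 295, 515, PIDGEY, 2, 2),
    (50546, 4, 295, 515, PIDGEY, 2, 2),
    (50547, 4, 295, 515, HOOTHOOT, 2, 2),
]

# encounter_id, encounter_condition_value_id: exactly one time value per encounter.
encounter_condition_value_map = [
    (50545, MORNING),
    (50546, DAY),
    (50547, NIGHT),
]

def pokemon_at(time_value):
    """Return the pokemon_ids that fill the slot at the given time of day."""
    ids = {enc_id for enc_id, value in encounter_condition_value_map
           if value == time_value}
    return [row[4] for row in encounters if row[0] in ids]

print(pokemon_at(MORNING))  # [16]
print(pokemon_at(NIGHT))    # [163]
```

All three rows point at the same encounter slot (515), so the slot keeps its 30% rarity and only the Pokémon filling it changes with the time of day.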

Hugo-Matias commented 4 years ago

Ah! I see, I think I've got the conditions figured out now, thanks for the support. I was on the right path with the 'Morning slot' thing in my last example, but not quite there yet. Just one last quick question.

I assume that it's OK to use a single 'encounters' record for cases where multiple conditions apply, like the morning / day Pidgey you mentioned? For example:
encounters:
50545,4,295,515,16,2,2
encounter_condition_value_map:
50545,3
50545,4

I'll work on the values now and send them for moderation as soon as I'm done fetching all the information.

magical commented 4 years ago

> I assume that it's ok to use a single 'encounters' record for cases that multiple conditions apply like the morning / day Pidgey you mentioned?

Unfortunately not. The semantics we've defined for multiple conditions is that the pokemon appears when all conditions are met. So your example would mean that pidgey appears in that slot when it is both morning and day, which is impossible.

This is necessary for when multiple kinds of conditions can apply at the same time, and compete for the same slot. For example, time of day vs swarming, like this example from heartgold:

  id   │    version     │       location        │ location_area_id │     method      │        pokemon        │ slot │ rarity │ min_level │ max_level │        conditions        
───────┼────────────────┼───────────────────────┼──────────────────┼─────────────────┼───────────────────────┼──────┼────────┼───────────┼───────────┼──────────────────────────
 17653 │ heartgold      │ kanto-route-1         │              295 │ walk            │ pidgey                │    1 │     20 │         2 │         2 │ {swarm-no,time-day}
 17652 │ heartgold      │ kanto-route-1         │              295 │ walk            │ pidgey                │    1 │     20 │         2 │         2 │ {swarm-no,time-morning}
 17654 │ heartgold      │ kanto-route-1         │              295 │ walk            │ hoothoot              │    1 │     20 │         2 │         2 │ {swarm-no,time-night}
 17655 │ heartgold      │ kanto-route-1         │              295 │ walk            │ poochyena             │    1 │     20 │         2 │         2 │ {swarm-yes}

Poochyena appears in slot 1 when there is a swarm on Route 1, all day. When there isn't a swarm, and it's night, Hoothoot appears. When there isn't a swarm, and it is morning or day, Pidgey appears.
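The "all conditions must be met" semantics can be illustrated with a subset test. This is a minimal sketch using the four HeartGold rows from the table above; the function name and the way conditions are held in sets are assumptions made for illustration, not how the pokedex library stores or queries them.

```python
# Rows copied from the HeartGold table above: (id, pokemon, conditions).
ROUTE_1_SLOT_1 = [
    (17653, "pidgey", {"swarm-no", "time-day"}),
    (17652, "pidgey", {"swarm-no", "time-morning"}),
    (17654, "hoothoot", {"swarm-no", "time-night"}),
    (17655, "poochyena", {"swarm-yes"}),
]

def pokemon_in_slot(rows, active):
    """Return the pokemon whose conditions are ALL part of the active state."""
    return [pokemon for _id, pokemon, conditions in rows if conditions <= active]

print(pokemon_in_slot(ROUTE_1_SLOT_1, {"swarm-no", "time-night"}))   # ['hoothoot']
print(pokemon_in_slot(ROUTE_1_SLOT_1, {"swarm-yes", "time-night"}))  # ['poochyena']

# A single merged row for "morning or day" can never match, because the game
# is only ever in one time of day at once:
merged = [(0, "pidgey", {"time-morning", "time-day"})]
print(pokemon_in_slot(merged, {"swarm-no", "time-morning"}))  # []
```

The last call shows why the morning and day Pidgey need separate encounter rows under these semantics.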

Hugo-Matias commented 4 years ago

Oh I see, I overlooked the situation of multiple conditions being applied to the same slot. In fact, I think it's easier to duplicate the encounters, as I won't need to double-check for duplicates and can just create them as they are in the table. Thanks once again for the tips.

Hugo-Matias commented 4 years ago

Made a pull request here: https://github.com/veekun/pokedex/pull/289 If everything is OK, we can close this issue.

magical commented 4 years ago

Merged #289. Thanks.