IFRCGo / go-frontend


[PROD] Remove hardcoded appeal ingest country mapping #1672

Closed jhenshall closed 3 years ago

jhenshall commented 4 years ago

Issue

For the appeal ingest, the mapping of regional appeals to countries is currently hardcoded. We should add the Apple codes for each country to the database and match on those instead.

Related feature

https://github.com/IFRCGo/go-api/issues/873 - outlines the issue for the API

#1671 - an example issue caused by this

cc @GregoryHorvath @batpad @tovari

jhenshall commented 3 years ago

This is not as simple as first suspected... Apple uses 2 distinct codes:

The trouble is that the divisions don't align between the two, and IFRC clusters/regions could be split across multiple GEC locations. This means mapping the GO country table to GEC codes is a many-to-many relationship, e.g. MENA region = GEC: Middle East (MEA) AND North Africa (NAF).

@geohacker @GregoryHorvath - hope that's clear... any thoughts on the best way to handle this so that two GEC codes could point to the same entity in the countries table? Perhaps the existence of the hardcoded lookup is making more sense after all!

geohacker commented 3 years ago

@jhenshall I would suggest that we focus on removing the hardcoded region to country mapping in the appeal ingestion process as described in this ticket. From what @batpad explained to me, what we are trying to do in the appeals ingestion code is to map emergencies to a 'country' in GO, and in this particular case they are most certainly actual countries with record_type = 1.

This means, we would import the 3 digit code for regions into the region table.

@batpad @GregoryHorvath what are your thoughts?

GergiH commented 3 years ago

Hmm... I'm not totally sure I fully understand, but for now the logic is:

Right now we only map GEC codes to 1 specific country (ISO to be precise). I'd say we could simply add a gec_code field to the countries and then check against that when ingesting Appeals instead of mapping it to ISO, etc. @geohacker

If we need to have some many-to-many mapping logic, we can do that too I guess, but then we'd need a clear logic (something additional to the GEC_code, since I don't think the OSC_name is efficient [there are probably some name mismatches]) we can use on the data coming from Apple to match those with the countries... @jhenshall @geohacker .

cc @batpad

jhenshall commented 3 years ago

Nice one, thanks @geohacker @GregoryHorvath! I think I get it for the region and country mapping, but it's the sub-region/cluster mapping that is confusing me.

So it would be possible to have multiple GEC_codes mapped to a single country? e.g. Kenya might have three emergency locations mapped to it - AFR for Africa regional responses, EAF for Eastern Africa sub-region responses and KE for Kenya itself.

I have a master list of GEC codes, so I can just focus on mapping the 3-digit region and sub-region codes to the country they should be related to? And let's discuss the options for implementing.

GergiH commented 3 years ago

Oh I see now @jhenshall that you'd want to map multiple GEC codes the other way round from what I thought, i.e. several codes pointing to one country. Hmm yea I think we should be able to do that with some many-to-one relation @geohacker, like having a model of just the GEC codes and then having the relation between those and the countries (?). Just my initial thought, there might be an easier solution.

geohacker commented 3 years ago

Interesting. Are the subregions and clusters in the Countries table already? If so then we can add a GEC code there and do a lookup. I defer to @batpad and @GregoryHorvath on the schema.

@jhenshall so does this mean we need to make changes to the way appeals are processed and add a second-level association to subregions and clusters? Sorry, perhaps I'm not fully grasping the issue. @GregoryHorvath does adding another level of association seem possible? Would that have an impact on the API?

cc @batpad

batpad commented 3 years ago

@geohacker @jhenshall - yes, I think we'd do roughly what @GregoryHorvath said:

Let's create a separate model for GECCode or so - with two fields - code and country, which is a ForeignKey to Country. The GEC codes can be added there, and multiple GEC codes could refer to a single country. And during the appeal_ingest process we just need to look up the GECCode and get the country associated with it.
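To illustrate the shape of that proposal, here is a minimal sketch using Python's built-in sqlite3 rather than the project's actual Django models. The table names, column names, and the `country_for_gec` helper are assumptions made for the example, and the Kenya rows reuse the hypothetical AFR / EAF / KE case raised earlier in the thread.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a country table and a GEC code table.

    Both table layouts are hypothetical; they only mirror the idea of
    'code' plus a foreign key to a country.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(
        """
        CREATE TABLE country (
            id   INTEGER PRIMARY KEY,
            iso  TEXT UNIQUE NOT NULL,
            name TEXT NOT NULL
        );
        -- Each code appears once, but many codes may share one country.
        CREATE TABLE gec_code (
            code       TEXT PRIMARY KEY,
            country_id INTEGER NOT NULL REFERENCES country(id)
        );
        """
    )
    return conn


def country_for_gec(conn, code):
    """Return (iso, name) for a GEC code, or None if the code is unknown."""
    return conn.execute(
        """
        SELECT c.iso, c.name
        FROM gec_code AS g
        JOIN country AS c ON c.id = g.country_id
        WHERE g.code = ?
        """,
        (code,),
    ).fetchone()


conn = build_db()
conn.execute("INSERT INTO country (id, iso, name) VALUES (1, 'KE', 'Kenya')")
# Illustrative rows only: three codes pointing at the same country.
conn.executemany(
    "INSERT INTO gec_code (code, country_id) VALUES (?, ?)",
    [("AFR", 1), ("EAF", 1), ("KE", 1)],
)

print(country_for_gec(conn, "EAF"))
print(country_for_gec(conn, "XXX"))
```

Making the code the primary key keeps the relation many-to-one: several codes can resolve to one country, but a single code can never resolve to two.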

If this makes sense, what we'd want here @jhenshall I think is just a CSV / spreadsheet with GEC Code and Country ISO code.
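As a rough sketch of how such a spreadsheet could be read into a lookup, assuming a two-column CSV with headers `gec_code` and `country_iso` (both header names and the `load_gec_mapping` helper are hypothetical, and the sample rows are illustrative rather than real data):

```python
import csv
import io


def load_gec_mapping(fileobj):
    """Read GEC code -> country ISO pairs from a CSV file object.

    Assumes header columns named 'gec_code' and 'country_iso'.
    Raises ValueError if one code is mapped to two different countries,
    since the proposal is many codes to one country, not the reverse.
    """
    mapping = {}
    for row in csv.DictReader(fileobj):
        code = row["gec_code"].strip().upper()
        iso = row["country_iso"].strip().upper()
        if code in mapping and mapping[code] != iso:
            raise ValueError(
                f"GEC code {code} maps to both {mapping[code]} and {iso}"
            )
        mapping[code] = iso
    return mapping


# Illustrative sample standing in for the real spreadsheet.
sample = io.StringIO("gec_code,country_iso\nAFR,KE\nEAF,KE\nKE,KE\n")
mapping = load_gec_mapping(sample)
print(mapping)
```

Rejecting conflicting rows at load time would surface mistakes in the spreadsheet before they reach the ingest step.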

Lemme know if that sounds good, @geohacker .

geohacker commented 3 years ago

@batpad this sounds good to me.

jhenshall commented 3 years ago

@geohacker - The data to be used for the new table is in the Google Drive folder: appeal_ingest_match.csv

geohacker commented 3 years ago

@jhenshall @GregoryHorvath we have added APPL codes to the database via https://github.com/IFRCGo/go-api/pull/1016. Now I'm hoping @GregoryHorvath can make the changes to the appeal ingestion script to use this data before we close here.

jhenshall commented 3 years ago

Done with March geo release!