Open simonameiler opened 4 years ago
Have you looked into the code of LitPop to spot the point where the assignment happens using a debugger? You can execute line by line and find where the problem happens.
I ran into this problem at some point; since I wanted to do the whole world, my solution was pragmatic and I just skipped the ISO_A3
codes that couldn't be found in the natural earth data. If you're going to fix the code, just tear it out and replace those terrifying iterators with geopandas.
If you're short on time then you could just log an error and return None
, which would be better than what it's doing now. The problem is that the if statement on line 937 never evaluates to true if the ISO code is not found in the data.
I am trying to fix this currently. So, what do you think should be the behaviour if the iso3 code is not found?
maybe skip the country, and log the failed iso3-codes somewhere? when I prepared global LitPop data, I wrote a script returning me a list of all failed codes / countries.
Yeah, I'd just use LOGGER.warning
if a code is skipped. In case the user has a need for a list of failed codes, it's manageable to just copy & paste that.
Is there possibly a fallback solution for countries that are not recognized? For example could they use only lit or only pop data?
Yeah, the natural earth table has a lot more IDs than just iso_a3; you could switch it over to adm0_a3; here's the rows where the two don't match: (maybe just put in a selector for which geodataframe col to use)
select admin, adm0_a3, iso_a3 from ne_10m_admin0_t where adm0_a3 <> iso_a3;
admin | adm0_a3 | iso_a3 |
---|---|---|
Dhekelia Sovereign Base Area | ESB | -99 |
Palestine | PSX | PSE |
South Sudan | SDS | SSD |
Somaliland | SOL | -99 |
France | FRA | -99 |
Western Sahara | SAH | ESH |
Norway | NOR | -99 |
Kosovo | KOS | -99 |
US Naval Base Guantanamo Bay | USG | -99 |
Northern Cyprus | CYN | -99 |
Baykonur Cosmodrome | KAB | -99 |
Cyprus No Mans Area | CNM | -99 |
Siachen Glacier | KAS | -99 |
Akrotiri Sovereign Base Area | WSB | -99 |
Ashmore and Cartier Islands | ATC | -99 |
Bajo Nuevo Bank (Petrel Is.) | BJN | -99 |
Serranilla Bank | SER | -99 |
Aland | ALD | ALA |
Clipperton Island | CLP | -99 |
Indian Ocean Territories | IOA | -99 |
Scarborough Reef | SCR | -99 |
Coral Sea Islands | CSI | -99 |
Spratly Islands | PGA | -99 |
ps: God their data is shitty, NOR without ISO? Their repo has PRs from five years ago, as far as I remember.
neither Norway nor France... that is very weird.
By the way, this sorta references the arbitrary region_ids currently mapped in the constants file, which I am afraid of touching, because it relies on multiple positional indexing. If the NE data get updated, it possibly goes belly-up.
I think part of the weirdness in the data (eg NOR and FRA) comes from the existence of unincorporated/disputed areas: France has Mayotte (and possibly others), Norway has Svalbard… Resulting in ISO oddities and other headaches:
I suspect the iso_a3
value of -99
is the result of faulty DB normalisation somewhere in NE tables… Possibly fixable by looking at other NE tables (or other sources, like Geonames).
Creating LitPop-entities yields an error for some countries. Upon calling the following lines of code I obtained the output pasted below:
It seems that the error might be related to the input rather than the function itself. In the above example, it's 'ANT' which causes the problem. Has anyone encountered similar problems using LitPop or does anyone have an idea of what the problem is? Maybe changes in the util.finance or util.coordinates cause the problem?