covidatlas / li

Next-generation serverless crawler for COVID-19 data
Apache License 2.0

Canonical key to identify locations #467

Closed jzohrab closed 3 years ago

jzohrab commented 3 years ago

Original issue https://github.com/covidatlas/coronadatascraper/issues/1051, transferred here on Wednesday Jun 10, 2020 at 23:47 GMT


I'm looking at locations.json and am trying to figure out what to use as a primary key to canonically identify locations in my consumer library.

It seems like the name field serves as a primary key in practice, but I'm worried it might change because it's a human-readable, localized name.

I was hoping I could use featureId as a key; it's particularly useful because I can easily obtain names in many languages for each location. But 355 out of 4414 locations don't have a featureId:

Locations with missing featureId

```
countryId | stateId | countyId | name
iso1:ES | iso2:ES-AN | | Andalusia, Spain
iso1:ES | iso2:ES-AR | | Aragon, Spain
iso1:ES | iso2:ES-AS | | Asturias, Spain
iso1:ES | iso2:ES-IB | | Balearic Islands, Spain
iso1:ES | iso2:ES-CN | | Canary Islands, Spain
iso1:ES | iso2:ES-CB | | Cantabria, Spain
iso1:ES | iso2:ES-CM | | Castile-La Mancha, Spain
iso1:ES | iso2:ES-CL | | Castile and León, Spain
iso1:ES | iso2:ES-CT | | Catalonia, Spain
iso1:ES | iso2:ES-CE | | Ceuta, Spain
iso1:ES | iso2:ES-VC | | Valencian Community, Spain
iso1:ES | iso2:ES-EX | | Extremadura, Spain
iso1:ES | iso2:ES-GA | | Galicia, Spain
iso1:ES | iso2:ES-MD | | Community of Madrid, Spain
iso1:ES | iso2:ES-ML | | Melilla, Spain
iso1:ES | iso2:ES-MC | | Region of Murcia, Spain
iso1:ES | iso2:ES-NC | | Navarre, Spain
iso1:ES | iso2:ES-PV | | Basque Country, Spain
iso1:ES | iso2:ES-RI | | Rioja, Spain
iso1:PA | iso2:PA-8 | | Panamá, Panama
iso1:PA | iso2:PA-1 | | Bocas del Toro, Panama
iso1:PA | iso2:PA-4 | | Chiriquí, Panama
iso1:PA | iso2:PA-2 | | Coclé, Panama
iso1:PA | iso2:PA-3 | | Colón, Panama
iso1:PA | iso2:PA-5 | | Darién, Panama
iso1:PA | iso2:PA-EM | | Comarca Emberá-Wounaan, Panama
iso1:PA | iso2:PA-KY | | Comarca Guna Yala, Panama
iso1:PA | iso2:PA-6 | | Herrera, Panama
iso1:PA | iso2:PA-7 | | Los Santos, Panama
iso1:PA | iso2:PA-NB | | Comarca Ngäbe Buglé, Panama
iso1:PA | iso2:PA-10 | | Panamá Oeste, Panama
iso1:PA | iso2:PA-9 | | Veraguas, Panama
iso1:GP | | | Guadeloupe
iso1:MQ | | | Martinique
iso1:GF | | | French Guiana
iso1:RE | | | Réunion
iso1:YT | | | Mayotte
iso1:FX | iso2:FR-GES | iso2:FR-10 | Aube, Grand Est, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-11 | Aude, Occitania, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-12 | Aveyron, Occitania, Metropolitan France
iso1:FX | iso2:FR-PAC | iso2:FR-13 | Bouches-du-Rhône, Provence-Alpes-Côte d'Azur, Metropolitan France
iso1:FX | iso2:FR-NOR | iso2:FR-14 | Calvados, Normandy, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-15 | Cantal, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-16 | Charente, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-17 | Charente-Maritime, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-CVL | iso2:FR-18 | Cher, Centre-Loire Valley, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-19 | Corrèze, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-BFC | iso2:FR-21 | Côte-d'Or, Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-BRE | iso2:FR-22 | Côtes-d'Armor, Brittany, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-23 | Creuse, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-24 | Dordogne, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-BFC | iso2:FR-25 | Doubs, Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-26 | Drôme, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-NOR | iso2:FR-27 | Eure, Normandy, Metropolitan France
iso1:FX | iso2:FR-CVL | iso2:FR-28 | Eure-et-Loir, Centre-Loire Valley, Metropolitan France
iso1:FX | iso2:FR-BRE | iso2:FR-29 | Finistère, Brittany, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-30 | Gard, Occitania, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-31 | Haute-Garonne, Occitania, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-32 | Gers, Occitania, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-33 | Gironde, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-34 | Hérault, Occitania, Metropolitan France
iso1:FX | iso2:FR-BRE | iso2:FR-35 | Ille-et-Vilaine, Brittany, Metropolitan France
iso1:FX | iso2:FR-CVL | iso2:FR-36 | Indre, Centre-Loire Valley, Metropolitan France
iso1:FX | iso2:FR-CVL | iso2:FR-37 | Indre-et-Loire, Centre-Loire Valley, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-38 | Isère, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-BFC | iso2:FR-39 | Jura, Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-40 | Landes, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-CVL | iso2:FR-41 | Loir-et-Cher, Centre-Loire Valley, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-42 | Loire, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-43 | Haute-Loire, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-PDL | iso2:FR-44 | Loire-Atlantique, Pays de la Loire, Metropolitan France
iso1:FX | iso2:FR-CVL | iso2:FR-45 | Loiret, Centre-Loire Valley, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-46 | Lot, Occitania, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-47 | Lot-et-Garonne, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-48 | Lozère, Occitania, Metropolitan France
iso1:FX | iso2:FR-PDL | iso2:FR-49 | Maine-et-Loire, Pays de la Loire, Metropolitan France
iso1:FX | iso2:FR-NOR | iso2:FR-50 | Manche, Normandy, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-51 | Marne, Grand Est, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-52 | Haute-Marne, Grand Est, Metropolitan France
iso1:FX | iso2:FR-PDL | iso2:FR-53 | Mayenne, Pays de la Loire, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-54 | Meurthe-et-Moselle, Grand Est, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-55 | Meuse, Grand Est, Metropolitan France
iso1:FX | iso2:FR-BRE | iso2:FR-56 | Morbihan, Brittany, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-57 | Moselle, Grand Est, Metropolitan France
iso1:FX | iso2:FR-BFC | iso2:FR-58 | Nièvre, Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-HDF | iso2:FR-59 | Nord, Hauts-de-France, Metropolitan France
iso1:FX | iso2:FR-HDF | iso2:FR-60 | Oise, Hauts-de-France, Metropolitan France
iso1:FX | iso2:FR-NOR | iso2:FR-61 | Orne, Normandy, Metropolitan France
iso1:FX | iso2:FR-HDF | iso2:FR-62 | Pas-de-Calais, Hauts-de-France, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-63 | Puy-de-Dôme, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-64 | Pyrénées-Atlantiques, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-65 | Hautespyrenees, Occitania, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-66 | Pyrénées-Orientales, Occitania, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-67 | Bas-Rhin, Grand Est, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-68 | Haut-Rhin, Grand Est, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-69 | Rhône, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-BFC | iso2:FR-70 | Haute-Saône, Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-BFC | iso2:FR-71 | Saône-et-Loire, Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-PDL | iso2:FR-72 | Sarthe, Pays de la Loire, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-73 | Savoy, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-74 | Upper Savoy, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-IDF | iso2:FR-75 | Paris, Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-NOR | iso2:FR-76 | Seine-Maritime, Normandy, Metropolitan France
iso1:FX | iso2:FR-IDF | iso2:FR-77 | Seine-et-Marne, Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-IDF | iso2:FR-78 | Yvelines, Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-79 | Deux-Sèvres, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-HDF | iso2:FR-80 | Somme, Hauts-de-France, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-81 | Tarn, Occitania, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-82 | Tarn-et-Garonne, Occitania, Metropolitan France
iso1:FX | iso2:FR-PAC | iso2:FR-83 | Var, Provence-Alpes-Côte d'Azur, Metropolitan France
iso1:FX | iso2:FR-PAC | iso2:FR-84 | Vaucluse, Provence-Alpes-Côte d'Azur, Metropolitan France
iso1:FX | iso2:FR-PDL | iso2:FR-85 | Vendée, Pays de la Loire, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-86 | Vienne, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-NAQ | iso2:FR-87 | Haute-Vienne, New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-88 | Vosges, Grand Est, Metropolitan France
iso1:FX | iso2:FR-BFC | iso2:FR-89 | Yonne, Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-BFC | iso2:FR-90 | Territoire-de-Belfort, Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-IDF | iso2:FR-91 | Essonne, Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-IDF | iso2:FR-92 | Hauts-de-Seine, Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-IDF | iso2:FR-93 | Seine-Saint-Denis, Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-IDF | iso2:FR-94 | Val-de-Marne, Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-IDF | iso2:FR-95 | Val-d'Oise, Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-01 | Ain, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-HDF | iso2:FR-02 | Aisne, Hauts-de-France, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-03 | Allier, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-PAC | iso2:FR-04 | Alpes-de-Haute-Provence, Provence-Alpes-Côte d'Azur, Metropolitan France
iso1:FX | iso2:FR-PAC | iso2:FR-05 | Hautes-Alpes, Provence-Alpes-Côte d'Azur, Metropolitan France
iso1:FX | iso2:FR-PAC | iso2:FR-06 | Maritime Alps, Provence-Alpes-Côte d'Azur, Metropolitan France
iso1:FX | iso2:FR-ARA | iso2:FR-07 | Ardèche, Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-GES | iso2:FR-08 | Ardennes, Grand Est, Metropolitan France
iso1:FX | iso2:FR-OCC | iso2:FR-09 | Ariège, Occitania, Metropolitan France
iso1:FX | iso2:FR-COR | iso2:FR-2A | South Corsica, Corsica, Metropolitan France
iso1:FX | iso2:FR-COR | iso2:FR-2B | Haute-Corse, Corsica, Metropolitan France
iso1:FX | iso2:FR-GES | | Grand Est, Metropolitan France
iso1:FX | iso2:FR-OCC | | Occitania, Metropolitan France
iso1:FX | iso2:FR-PAC | | Provence-Alpes-Côte d'Azur, Metropolitan France
iso1:FX | iso2:FR-NOR | | Normandy, Metropolitan France
iso1:FX | iso2:FR-ARA | | Auvergne-Rhône-Alpes, Metropolitan France
iso1:FX | iso2:FR-NAQ | | New Aquitaine, Metropolitan France
iso1:FX | iso2:FR-CVL | | Centre-Loire Valley, Metropolitan France
iso1:FX | iso2:FR-BFC | | Bourgogne-Franche-Comté, Metropolitan France
iso1:FX | iso2:FR-BRE | | Brittany, Metropolitan France
iso1:FX | iso2:FR-PDL | | Pays de la Loire, Metropolitan France
iso1:FX | iso2:FR-HDF | | Hauts-de-France, Metropolitan France
iso1:FX | iso2:FR-IDF | | Ile-de-France, Metropolitan France
iso1:FX | iso2:FR-COR | | Corsica, Metropolitan France
iso1:FX | | | Metropolitan France
iso1:US | iso2:US-MA | fips:25007+fips:25019 | Dukes County, Nantucket County, Massachusetts, United States
iso1:GB | iso2:GB-SCT | iso2:GB-SCB | Scottish Borders, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-DGY | Dumfries and Galloway, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-FIF | Fife, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-CLK+iso2:GB-FAL+iso2:GB-STG | Clackmannanshire, Falkirk, Stirling, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-ABD+iso2:GB-ABE+iso2:GB-MRY | Aberdeenshire, Aberdeen City, Moray, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-GLG+iso2:GB-EDU+iso2:GB-ERW+iso2:GB-IVC+iso2:GB-RFW+iso2:GB-WDU | Glasgow City, East Dunbartonshire, East Renfrewshire, Inverclyde, Renfrewshire, West Dunbartonshire, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-HLD+iso2:GB-AGB | Highland, Argyll and Bute, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-NLK+iso2:GB-SLK | North Lanarkshire, South Lanarkshire, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-EDH+iso2:GB-ELN+iso2:GB-MLN+iso2:GB-WLN | City of Edinburgh, East Lothian, Midlothian, West Lothian, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-ZET | Shetland Islands, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-ANS+iso2:GB-DND+iso2:GB-PKN | Angus, Dundee City, Perth and Kinross, Scotland, United Kingdom
iso1:US | iso2:US-AK | fips:02020 | Anchorage Economic Region, AK, USA
iso1:US | iso2:US-AK | fips:02261+fips:02150+fips:02122 | Gulf Coast Economic Region, AK, USA
iso1:US | iso2:US-AK | fips:02068+fips:02290+fips:02240+fips:02090 | Interior Economic Region, AK, USA
iso1:US | iso2:US-AK | fips:02170 | Mat-Su Economic Region, AK, USA
iso1:US | iso2:US-AK | fips:02188+fips:02180+fips:02185 | Northern Economic Region, AK, USA
iso1:US | iso2:US-AK | fips:02282+fips:02230+fips:02105+fips:02275+fips:02100+fips:02195+fips:02198+fips:02220+fips:02130+fips:02110 | Southeast Economic Region, AK, USA
iso1:US | iso2:US-AK | fips:02060+fips:02164+fips:02013+fips:02070+fips:02016+fips:02158+fips:02050 | Southwest Economic Region, AK, USA
iso1:RU | iso2:RU-MOW | | Moscow, Russia
iso1:RU | iso2:RU-MOS | | Moscow Oblast, Russia
iso1:RU | iso2:RU-SPE | | Saint Petersburg, Russia
iso1:RU | iso2:RU-SAM | | Samara Oblast, Russia
iso1:RU | iso2:RU-SA | | Sakha Republic, Russia
iso1:RU | iso2:RU-SVE | | Sverdlovsk Oblast, Russia
iso1:RU | iso2:RU-KGD | | Kaliningrad, Russia
iso1:RU | iso2:RU-KIR | | Kirov Oblast, Russia
iso1:RU | iso2:RU-NVS | | Novosibirsk Oblast, Russia
iso1:RU | iso2:RU-KYA | | Krasnoyarsk Krai, Russia
iso1:RU | iso2:RU-TAM | | Tambov Oblast, Russia
iso1:RU | iso2:RU-LIP | | Lipetsk Oblast, Russia
iso1:RU | iso2:RU-TVE | | Tver Oblast, Russia
iso1:RU | iso2:RU-KHA | | Khabarovsk Krai, Russia
iso1:RU | iso2:RU-TYU | | Tyumen Oblast, Russia
iso1:RU | iso2:RU-TUL | | Tula Oblast, Russia
iso1:RU | iso2:RU-PER | | Perm Krai, Russia
iso1:RU | iso2:RU-NIZ | | Nizhny Novgorod Oblast, Russia
iso1:RU | iso2:RU-KDA | | Krasnodar Krai, Russia
iso1:RU | iso2:RU-VOR | | Voronezh Oblast, Russia
iso1:RU | iso2:RU-KEM | | Kemerovo Oblast, Russia
iso1:RU | iso2:RU-KK | | Republic of Khakassia, Russia
iso1:RU | iso2:RU-MUR | | Murmansk Oblast, Russia
iso1:RU | iso2:RU-KO | | Komi Republic, Russia
iso1:RU | iso2:RU-KLU | | Kaluga Oblast, Russia
iso1:RU | iso2:RU-IVA | | Ivanovo Oblast, Russia
iso1:RU | iso2:RU-ZAB | | Zabaykalsky Krai, Russia
iso1:RU | iso2:RU-TOM | | Tomsk Oblast, Russia
iso1:RU | iso2:RU-ARK | | Arkhangelsk Oblast, Russia
iso1:RU | iso2:RU-RYA | | Ryazan Oblast, Russia
iso1:RU | iso2:RU-ULY | | Ulyanovsk Oblast, Russia
iso1:RU | iso2:RU-YAR | | Yaroslavl Oblast, Russia
iso1:RU | iso2:RU-PNZ | | Penza Oblast, Russia
iso1:RU | iso2:RU-BEL | | Belgorod Oblast, Russia
iso1:RU | iso2:RU-KHM | | Khanty-Mansiysk Autonomous Okrug – Ugra, Russia
iso1:RU | iso2:RU-LEN | | Leningrad oblast, Russia
iso1:RU | iso2:RU-ORE | | Orenburg Oblast, Russia
iso1:RU | iso2:RU-SAR | | Saratov Oblast, Russia
iso1:RU | iso2:RU-TA | | Tatarstan, Russia
iso1:RU | iso2:RU-KGN | | Kurgan Oblast, Russia
iso1:RU | iso2:RU-KB | | Kabardino-Balkaria, Russia
iso1:RU | iso2:RU-CHE | | Chelyabinsk Oblast, Russia
iso1:RU | iso2:RU-STA | | Stavropol Krai, Russia
iso1:RU | iso2:RU-BRY | | Bryansk Oblast, Russia
iso1:RU | iso2:RU-UD | | Udmurtia, Russia
iso1:RU | iso2:RU-NGR | | Novgorod Oblast, Russia
iso1:RU | iso2:RU-CR | | Republic of Crimea, Russia
iso1:RU | iso2:RU-BA | | Bashkortostan, Russia
iso1:RU | iso2:RU-CE | | Chechen Republic, Russia
iso1:RU | iso2:RU-PRI | | Primorsky Krai, Russia
iso1:RU | iso2:RU-VGG | | Volgograd Oblast, Russia
iso1:RU | iso2:RU-ORL | | Oryol Oblast, Russia
iso1:RU | iso2:RU-PSK | | Pskov Oblast, Russia
iso1:RU | iso2:RU-ROS | | Rostov Oblast, Russia
iso1:RU | iso2:RU-BU | | Buryatia, Russia
iso1:RU | iso2:RU-MO | | Republic of Mordovia, Russia
iso1:RU | iso2:RU-DA | | Republic of Dagestan, Russia
iso1:RU | iso2:RU-SAK | | Sakhalin Oblast, Russia
iso1:RU | iso2:RU-KOS | | Kostroma Oblast, Russia
iso1:RU | iso2:RU-SMO | | Smolensk Oblast, Russia
iso1:RU | iso2:RU-AD | | Republic of Adygea, Russia
iso1:GB | iso2:GB-SCT | iso2:GB-NAY+iso2:GB-EAY+iso2:GB-SAY | North Ayrshire, East Ayrshire, South Ayrshire, Scotland, United Kingdom
iso1:RU | iso2:RU-OMS | | Omsk Oblast, Russia
iso1:RU | iso2:RU-IRK | | Irkutsk Oblast, Russia
iso1:RU | iso2:RU-AMU | | Amur Oblast, Russia
iso1:US | iso2:US-RI | fips:44001 | Barrington, Bristol County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44001 | Bristol, Bristol County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Burrillville, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Central Falls, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | Charlestown, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44003 | Coventry, Kent County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Cranston, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Cumberland, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44003 | East Greenwich, Kent County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | East Providence, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | Exeter, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Foster, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Glocester, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | Hopkinton, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44005 | Jamestown, Newport County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Johnston, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Lincoln, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44005 | Little Compton, Newport County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44005 | Middletown, Newport County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | Narragansett, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | New Shoreham, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44005 | Newport, Newport County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | North Kingstown, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | North Providence, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | North Smithfield, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Pawtucket, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44005 | Portsmouth, Newport County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Providence, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | Richmond, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Scituate, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Smithfield, Providence County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | South Kingstown, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44005 | Tiverton, Newport County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44001 | Warren, Bristol County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44003 | Warwick, Kent County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44003 | West Greenwich, Kent County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44003 | West Warwick, Kent County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44009 | Westerly, Washington County, Rhode Island, United States
iso1:US | iso2:US-RI | fips:44007 | Woonsocket, Providence County, Rhode Island, United States
iso1:BR | iso2:BR-AC | | Acre, Brazil
iso1:BR | iso2:BR-AL | | Alagoas, Brazil
iso1:BR | iso2:BR-AP | | Amapá, Brazil
iso1:BR | iso2:BR-AM | | Amazonas, Brazil
iso1:BR | iso2:BR-BA | | Bahia, Brazil
iso1:BR | iso2:BR-CE | | Ceará, Brazil
iso1:BR | iso2:BR-DF | | Federal District, Brazil
iso1:BR | iso2:BR-ES | | Espírito Santo, Brazil
iso1:BR | iso2:BR-GO | | Goiás, Brazil
iso1:BR | iso2:BR-MA | | Maranhão, Brazil
iso1:BR | iso2:BR-MT | | Mato Grosso, Brazil
iso1:BR | iso2:BR-MS | | Mato Grosso do Sul, Brazil
iso1:BR | iso2:BR-MG | | Minas Gerais, Brazil
iso1:BR | iso2:BR-PR | | Paraná, Brazil
iso1:BR | iso2:BR-PB | | Paraíba, Brazil
iso1:BR | iso2:BR-PA | | Pará, Brazil
iso1:BR | iso2:BR-PE | | Pernambuco, Brazil
iso1:BR | iso2:BR-PI | | Piauí, Brazil
iso1:BR | iso2:BR-RN | | Rio Grande do Norte, Brazil
iso1:BR | iso2:BR-RS | | Rio Grande do Sul, Brazil
iso1:BR | iso2:BR-RJ | | Rio de Janeiro, Brazil
iso1:BR | iso2:BR-RO | | Rondônia, Brazil
iso1:BR | iso2:BR-RR | | Roraima, Brazil
iso1:BR | iso2:BR-SC | | Santa Catarina, Brazil
iso1:BR | iso2:BR-SE | | Sergipe, Brazil
iso1:BR | iso2:BR-SP | | São Paulo, Brazil
iso1:BR | iso2:BR-TO | | Tocantins, Brazil
iso1:RU | iso2:RU-ALT | | Altai Krai, Russia
iso1:RU | iso2:RU-VLA | | Vladimir Oblast, Russia
iso1:RU | iso2:RU-VLG | | Vologda Oblast, Russia
iso1:RU | iso2:RU-KL | | Republic of Kalmykia, Russia
iso1:RU | iso2:RU-ME | | Mari El, Russia
iso1:RU | iso2:RU-CU | | Chuvashia, Russia
iso1:IN | iso2:IN-AP | | Andhra Pradesh, India
iso1:IN | iso2:IN-AN | | Andaman and Nicobar Islands, India
iso1:IN | iso2:IN-BR | | Bihar, India
iso1:IN | iso2:IN-CH | | Chandigarh, India
iso1:IN | iso2:IN-CT | | Chhattisgarh, India
iso1:IN | iso2:IN-DL | | Delhi, India
iso1:IN | iso2:IN-GA | | Goa, India
iso1:IN | iso2:IN-GJ | | Gujarat, India
iso1:IN | iso2:IN-HR | | Haryana, India
iso1:IN | iso2:IN-HP | | Himachal Pradesh, India
iso1:IN | iso2:IN-JK | | Jammu and Kashmir, India
iso1:IN | iso2:IN-KA | | Karnataka, India
iso1:IN | iso2:IN-KL | | Kerala, India
iso1:IN | iso2:IN-LA | | Ladakh, India
iso1:IN | iso2:IN-MP | | Madhya Pradesh, India
iso1:IN | iso2:IN-MH | | Maharashtra, India
iso1:IN | iso2:IN-MN | | Manipur, India
iso1:IN | iso2:IN-MZ | | Mizoram, India
iso1:IN | iso2:IN-OR | | Odisha, India
iso1:IN | iso2:IN-PY | | Puducherry, India
iso1:IN | iso2:IN-PB | | Punjab, India
iso1:IN | iso2:IN-RJ | | Rajasthan, India
iso1:IN | iso2:IN-TN | | Tamil Nadu, India
iso1:IN | iso2:IN-TG | | Telangana, India
iso1:IN | iso2:IN-UT | | Uttarakhand, India
iso1:IN | iso2:IN-UP | | Uttar Pradesh, India
iso1:IN | iso2:IN-WB | | West Bengal, India
iso1:RU | iso2:RU-AST | | Astrakhan Oblast, Russia
iso1:RU | iso2:RU-MAG | | Magadan Oblast, Russia
iso1:RU | iso2:RU-SEV | | Sevastopol, Russia
iso1:GB | iso2:GB-SCT | iso2:GB-ORK | Orkney Islands, Scotland, United Kingdom
iso1:GB | iso2:GB-SCT | iso2:GB-ELS | Western Isles, Scotland, United Kingdom
iso1:IN | iso2:IN-AS | | Assam, India
iso1:IN | iso2:IN-JH | | Jharkhand, India
iso1:RU | iso2:RU-KRS | | Kursk Oblast, Russia
iso1:RU | iso2:RU-SE | | Republic of North Ossetia-Alania, Russia
iso1:IN | iso2:IN-AR | | Arunachal Pradesh, India
iso1:RU | iso2:RU-YAN | | Yamalo-Nenets Autonomous Okrug, Russia
iso1:RU | iso2:RU-YEV | | Jewish Autonomous Oblast, Russia
iso1:RU | iso2:RU-IN | | Ingushetia, Russia
iso1:RU | iso2:RU-KAM | | Kamchatka Krai, Russia
iso1:IN | iso2:IN-TR | | Tripura, India
iso1:RU | iso2:RU-KR | | Republic of Karelia, Russia
iso1:RU | iso2:RU-KC | | Karachay-Cherkessia, Russia
iso1:RU | iso2:RU-TY | | Tuva Republic, Russia
iso1:IN | iso2:IN-NL | | Nagaland, India
iso1:IN | iso2:IN-ML | | Meghalaya, India
iso1:RU | iso2:RU-NEN | | Nenets Autonomous Okrug, Russia
iso1:RU | iso2:RU-CHU | | Chukotka Autonomous Okrug, Russia
iso1:US | iso2:US-MI | fips:26163 | Detroit City, Wayne County, Michigan, United States
iso1:RU | iso2:RU-AL | | Altai Republic, Russia
iso1:IN | iso2:IN-DN | | Dadra and Nagar Haveli, India
```
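
For anyone who wants to reproduce a list like the one above, here is a minimal sketch. It assumes each entry in locations.json is a plain object with `name`, `countryId`, and optional `stateId`, `countyId` and `featureId` fields. The `sampleLocations` array and the `findMissingFeatureId` helper are made up for illustration, and the `featureId` values in the sample are placeholders, not real IDs.

```javascript
// Hypothetical sample shaped like entries in locations.json.
// The featureId values are placeholders, not real feature IDs.
const sampleLocations = [
  { name: 'Andalusia, Spain', countryId: 'iso1:ES', stateId: 'iso2:ES-AN' },
  { name: 'Guadeloupe', countryId: 'iso1:GP', featureId: null },
  {
    name: 'Wayne County, Michigan, United States',
    countryId: 'iso1:US',
    stateId: 'iso2:US-MI',
    countyId: 'fips:26163',
    featureId: 'placeholder-1'
  },
  { name: 'United States', countryId: 'iso1:US', featureId: 'placeholder-2' }
]

// Treat undefined, null and the empty string as "missing".
function findMissingFeatureId(locations) {
  return locations.filter((loc) => loc.featureId == null || loc.featureId === '')
}

const missing = findMissingFeatureId(sampleLocations)
console.log(`${missing.length} out of ${sampleLocations.length} locations don't have a featureId`)

// Print one "countryId | stateId | countyId | name" row per location.
for (const loc of missing) {
  console.log([loc.countryId, loc.stateId, loc.countyId, loc.name].map((v) => v || '').join(' | '))
}
```

With the real file, `sampleLocations` would be replaced by `require('./locations.json')`.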

I noticed that there is a bunch of logic in src/events/processor/find-features/index.js to generate featureId fields, so I'm assuming this could possibly be fixed to cover 100% of locations.

Finally, I tried using [countryId, stateId, countyId] triplets, but they don't uniquely identify locations either:

```js
> require('./locations.json').length
4414
> new Set(require('./locations.json').map((l) => `${l.countryId},${l.stateId},${l.countyId}`)).size
4372
```

Any thoughts?

jzohrab commented 3 years ago

(Transferred comment)

Hi @joliss - hm, the triplets should actually be correct, b/c together they should identify the location. Moreover, in Li, we're still using those triplets, so they'll likely be the best ID. Can you let me know what triplets have dups, and perhaps give me some more detail about them?

jzohrab commented 3 years ago

(Transferred comment)

Sure, here's the duplicates:

triplets.js

```js
let locations = require('./locations.json')

function getKey({ countryId, stateId, countyId }) {
  return `${countryId || ''},${stateId || ''},${countyId || ''}`
}

function summarizeLocation(loc) {
  return `[${loc.level}] ${loc.name}`
}

let locationsByTriplet = new Map()
for (let location of locations) {
  let key = getKey(location)
  if (!locationsByTriplet.has(key)) {
    locationsByTriplet.set(key, [])
  }
  locationsByTriplet.get(key).push(location)
}

let duplicateTriplets = {}
for (let [key, locs] of locationsByTriplet) {
  if (locs.length > 1) {
    duplicateTriplets[key] = locs.map(summarizeLocation)
  }
}

console.log(JSON.stringify(duplicateTriplets, null, 2))
```

Output:

{
  "iso1:US,iso2:US-AK,fips:02020": [
    "[county] Anchorage Municipality, Alaska, United States",
    "[county] Anchorage Economic Region, AK, USA"
  ],
  "iso1:US,iso2:US-AK,fips:02170": [
    "[county] Matanuska-Susitna Borough, Alaska, United States",
    "[county] Mat-Su Economic Region, AK, USA"
  ],
  "iso1:US,iso2:US-MI,fips:26163": [
    "[county] Wayne County, Michigan, United States",
    "[city] Detroit City, Wayne County, Michigan, United States"
  ],
  "iso1:US,iso2:US-RI,fips:44001": [
    "[county] Bristol County, Rhode Island, United States",
    "[city] Barrington, Bristol County, Rhode Island, United States",
    "[city] Bristol, Bristol County, Rhode Island, United States",
    "[city] Warren, Bristol County, Rhode Island, United States"
  ],
  "iso1:US,iso2:US-RI,fips:44003": [
    "[county] Kent County, Rhode Island, United States",
    "[city] Coventry, Kent County, Rhode Island, United States",
    "[city] East Greenwich, Kent County, Rhode Island, United States",
    "[city] Warwick, Kent County, Rhode Island, United States",
    "[city] West Greenwich, Kent County, Rhode Island, United States",
    "[city] West Warwick, Kent County, Rhode Island, United States"
  ],
  "iso1:US,iso2:US-RI,fips:44005": [
    "[county] Newport County, Rhode Island, United States",
    "[city] Jamestown, Newport County, Rhode Island, United States",
    "[city] Little Compton, Newport County, Rhode Island, United States",
    "[city] Middletown, Newport County, Rhode Island, United States",
    "[city] Newport, Newport County, Rhode Island, United States",
    "[city] Portsmouth, Newport County, Rhode Island, United States",
    "[city] Tiverton, Newport County, Rhode Island, United States"
  ],
  "iso1:US,iso2:US-RI,fips:44007": [
    "[county] Providence County, Rhode Island, United States",
    "[city] Burrillville, Providence County, Rhode Island, United States",
    "[city] Central Falls, Providence County, Rhode Island, United States",
    "[city] Cranston, Providence County, Rhode Island, United States",
    "[city] Cumberland, Providence County, Rhode Island, United States",
    "[city] East Providence, Providence County, Rhode Island, United States",
    "[city] Foster, Providence County, Rhode Island, United States",
    "[city] Glocester, Providence County, Rhode Island, United States",
    "[city] Johnston, Providence County, Rhode Island, United States",
    "[city] Lincoln, Providence County, Rhode Island, United States",
    "[city] North Providence, Providence County, Rhode Island, United States",
    "[city] North Smithfield, Providence County, Rhode Island, United States",
    "[city] Pawtucket, Providence County, Rhode Island, United States",
    "[city] Providence, Providence County, Rhode Island, United States",
    "[city] Scituate, Providence County, Rhode Island, United States",
    "[city] Smithfield, Providence County, Rhode Island, United States",
    "[city] Woonsocket, Providence County, Rhode Island, United States"
  ],
  "iso1:US,iso2:US-RI,fips:44009": [
    "[county] Washington County, Rhode Island, United States",
    "[city] Charlestown, Washington County, Rhode Island, United States",
    "[city] Exeter, Washington County, Rhode Island, United States",
    "[city] Hopkinton, Washington County, Rhode Island, United States",
    "[city] Narragansett, Washington County, Rhode Island, United States",
    "[city] New Shoreham, Washington County, Rhode Island, United States",
    "[city] North Kingstown, Washington County, Rhode Island, United States",
    "[city] Richmond, Washington County, Rhode Island, United States",
    "[city] South Kingstown, Washington County, Rhode Island, United States",
    "[city] Westerly, Washington County, Rhode Island, United States"
  ]
}

So maybe we need to add some kind of cityId. For example, Detroit City has a city field, but no cityId to key on:

    {
      "state": "Michigan",
      "country": "United States",
      "sources": [
        {
          "name": "Michigan Department of Health & Human Services"
        }
      ],
      "url": "https://www.michigan.gov/coronavirus/0,9753,7-406-98163-520743--,00.html",
      "aggregate": "county",
      "city": "Detroit City",
      "county": "Wayne County",
      "rating": 0.5098039215686274,
      "countryId": "iso1:US",
      "stateId": "iso2:US-MI",
      "countyId": "fips:26163",
      "name": "Detroit City, Wayne County, Michigan, United States",
      "level": "city"
    }

jzohrab commented 3 years ago

(Transferred comment)

Thanks very much, this is great.

If I recall correctly, the Alaska ones are tricky ... they changed things around with their reporting, using some strange region codes.

For the others which have city, do you know if the city values roll up to the state values, or are contained in them? If so, perhaps it would be best to ignore level: city data points. We don't have many of those, and don't have good geo data for them, iirc.

jzohrab commented 3 years ago

(Transferred comment)

For US counties, there should be exactly one county per FIPS code, so the code should identify the county uniquely. Here is the US Census reference document mapping FIPS codes to names: https://www2.census.gov/programs-surveys/popest/geographies/2018/all-geocodes-v2018.xlsx

For cities, I believe they need to be 4-tuples, including city name or city slug of some sort.

So for example "North Kingstown, Washington County, Rhode Island, United States" would be iso1:US,iso2:US-RI,fips:44009,north_kingstown, but that is just my idea. They need 4-tuples for sure.
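
A rough sketch of what such a 4-tuple key could look like, purely as an illustration of the idea above. The `slugify` and `locationKey` helpers are hypothetical and not part of the project; the sketch assumes city-level entries carry `level: 'city'` and a `city` field, as the Detroit City entry earlier in the thread does.

```javascript
// Hypothetical slug helper: strips accents, lowercases, and joins words with "_".
function slugify(name) {
  return name
    .normalize('NFKD')               // split accented letters into base letter + combining mark
    .replace(/[\u0300-\u036f]/g, '') // drop the combining marks
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, '_')     // collapse anything that is not a-z or 0-9 into "_"
    .replace(/^_+|_+$/g, '')         // trim leading and trailing "_"
}

// Hypothetical key: the usual triplet, plus a city slug for city-level entries.
function locationKey(loc) {
  const parts = [loc.countryId || '', loc.stateId || '', loc.countyId || '']
  if (loc.level === 'city' && loc.city) {
    parts.push(slugify(loc.city))
  }
  return parts.join(',')
}

console.log(locationKey({
  level: 'city',
  city: 'North Kingstown',
  countryId: 'iso1:US',
  stateId: 'iso2:US-RI',
  countyId: 'fips:44009'
}))
// iso1:US,iso2:US-RI,fips:44009,north_kingstown
```

Note that a key like this would only separate cities from their containing county. It would not help with the two Alaska pairs in the duplicate list, since both entries in each pair are county-level.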

jzohrab commented 3 years ago

(Transferred comment)

I just looked at the Excel table you linked; it seems that the unique FIPS county code is the 2nd + 3rd column combined, correct?

I was hoping there might be some way to get unique numeric city identifiers, perhaps from the "Consolidtated City Code (FIPS)"[sic] or "Place Code (FIPS)" columns, but at the very least they'd need to be used in conjunction with other columns to be unique. For example here are all lines containing "03220":

```
Summary Level,State Code (FIPS),County Code (FIPS),County Subdivision Code (FIPS),Place Code (FIPS),Consolidtated City Code (FIPS),Area Name (including legal/statistical area description)
162,01,000,00000,03220,00000,Autaugaville town
162,02,000,00000,03220,00000,Anderson city
061,17,097,03220,00000,00000,Avon township
061,26,157,03220,00000,00000,Arbela township
061,33,001,03220,00000,00000,Barnstead town
061,38,053,03220,00000,00000,Arnegard city
162,38,000,00000,03220,00000,Arnegard city
061,46,085,03220,00000,00000,Bailey township
```

Or maybe you've looked into this already and it's just impossible?
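
To make the column question concrete, here is a small sketch over the "03220" lines quoted above. It builds the 5-digit county code from the state and county columns, and a composite place key from the state and place columns. `parseRows`, `countyFips` and `placeKey` are hypothetical names, and the sketch only shows that the composite key is distinct within these eight lines; whether it is unique across the whole spreadsheet is not checked here.

```javascript
// The lines containing "03220", copied from the comment above.
const csv = `Summary Level,State Code (FIPS),County Code (FIPS),County Subdivision Code (FIPS),Place Code (FIPS),Consolidtated City Code (FIPS),Area Name (including legal/statistical area description)
162,01,000,00000,03220,00000,Autaugaville town
162,02,000,00000,03220,00000,Anderson city
061,17,097,03220,00000,00000,Avon township
061,26,157,03220,00000,00000,Arbela township
061,33,001,03220,00000,00000,Barnstead town
061,38,053,03220,00000,00000,Arnegard city
162,38,000,00000,03220,00000,Arnegard city
061,46,085,03220,00000,00000,Bailey township`

// Naive CSV parsing: good enough here because none of these fields contain commas.
function parseRows(text) {
  return text.trim().split('\n').slice(1).map((line) => {
    const [summaryLevel, state, county, subdivision, place, consolidatedCity, name] = line.split(',')
    return { summaryLevel, state, county, subdivision, place, consolidatedCity, name }
  })
}

const rows = parseRows(csv)

// 5-digit county FIPS: 2-digit state code followed by 3-digit county code.
const countyFips = (row) => `${row.state}${row.county}`

// Hypothetical composite place key: state code followed by place code.
const placeKey = (row) => `${row.state}${row.place}`

// Rows that carry a place code (the others are county subdivisions).
const places = rows.filter((row) => row.place !== '00000')

console.log(places.map((row) => `${placeKey(row)} ${row.name}`))
console.log('distinct bare place codes:', new Set(places.map((row) => row.place)).size)
console.log('distinct state+place keys:', new Set(places.map(placeKey)).size)
```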

jzohrab commented 3 years ago

(Transferred comment)

No, I haven't looked into cities and I don't know anything about them. The 5-digit FIPS code is a standard (see Wikipedia or anywhere else) and is used in many places; even the NYT and JHU data sources use the 5-digit one. For cities I have no idea, but Wikipedia lists something for them.

jzohrab commented 3 years ago

(Transferred comment)

Thanks @hyperknot for jumping in.

@joliss - unless you specifically need city-level data, see if you can roll them up, or use only the state-level data. We don't have reliable/useful city-level data, and most scrapers/sources don't report on that. Can you give that a shot and LMK how it works out?
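
A quick way to try that on the consumer side might look like the sketch below. It drops `level: 'city'` entries and re-runs the triplet check. The sample entries are taken from the duplicate list earlier in this thread; `tripletKey` and `duplicateTriplets` are hypothetical helper names.

```javascript
// Sample entries copied from the duplicate list earlier in this thread.
const sample = [
  { level: 'county', name: 'Wayne County, Michigan, United States', countryId: 'iso1:US', stateId: 'iso2:US-MI', countyId: 'fips:26163' },
  { level: 'city', name: 'Detroit City, Wayne County, Michigan, United States', countryId: 'iso1:US', stateId: 'iso2:US-MI', countyId: 'fips:26163' },
  { level: 'county', name: 'Anchorage Municipality, Alaska, United States', countryId: 'iso1:US', stateId: 'iso2:US-AK', countyId: 'fips:02020' },
  { level: 'county', name: 'Anchorage Economic Region, AK, USA', countryId: 'iso1:US', stateId: 'iso2:US-AK', countyId: 'fips:02020' }
]

function tripletKey({ countryId, stateId, countyId }) {
  return `${countryId || ''},${stateId || ''},${countyId || ''}`
}

// Returns the triplet keys that are shared by more than one location.
function duplicateTriplets(locations) {
  const counts = new Map()
  for (const loc of locations) {
    const key = tripletKey(loc)
    counts.set(key, (counts.get(key) || 0) + 1)
  }
  return [...counts].filter(([, count]) => count > 1).map(([key]) => key)
}

// Ignore city-level data points, then check again.
const withoutCities = sample.filter((loc) => loc.level !== 'city')

console.log(duplicateTriplets(sample))
console.log(duplicateTriplets(withoutCities))
```

In this sample, filtering out cities clears the Wayne County collision but leaves the Anchorage one, because both Anchorage entries are county-level.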

jzohrab commented 3 years ago

Dup of #479.