datasets / un-locode

United Nations Codes for Trade and Transport Locations (UN/LOCODE) and Country Codes
https://datahub.io/core/un-locode

Aliases contain many invalid entries #27

Open cristan opened 1 week ago

cristan commented 1 week ago

As far as I can tell, these aliases are useless:

GL,Fredrikshaab = Paamiut,Fredrikshaab = Paamiut
GL,Godhavn = Qeqertarsuaq,Godhavn = Qeqertarsuaq
GL,Gronnedal = Kangilinguit,Gronnedal = Kangilinguit

etc. See these logs:

Paamiut in GL not found!
Qeqertarsuaq in GL not found!
Kangilinguit in GL not found!
Sisimiut in GL not found!
Ilulissat in GL not found!
Qaqortoq in GL not found!
Illorqortoormiut in GL not found!
Kangerlussua in GL not found!
Manitsoq in GL not found!
Bishkek in KG not found!
Thandwe in MM not found!
Bagan Luar in MY not found!
Nizhny Novgorod in RU not found!
La Brea in TT not found!
Adak Island in US not found!
Sibbo (Sipoo) in FI not found!
Sibbo (Sipoo) in FI not found!
Vanda (Vantaa) in FI not found!
Chuuk in FM not found!
Fuglafirdi in FO not found!

This is a working alias:

BE,Antwerp = Antwerpen,Antwerp = Antwerpen

This is useful: when searching for Antwerp, you actually want Antwerpen (BEANR, which has "Antwerpen" as its name).

Compare this with this one:

GL,Fredrikshaab = Paamiut,Fredrikshaab = Paamiut

This is quite useless: the name of GLJFR is Paamiut (Fredrikshaab), and that name isn't in the aliases, so I can't tell which UN/LOCODE is meant. Or am I missing something? Many of the incorrect ones were added in the latest release.
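
A minimal sketch of the kind of check that produces log lines like the ones above, assuming an alias has the form "Old name = Current name" and that the current name must exactly match the name of some entry in the same country. The function name `find_unresolved_aliases` and the tuple layouts are hypothetical; the sample rows are the two examples quoted in this issue.

```python
def find_unresolved_aliases(alias_rows, code_list_rows):
    """Return (country, target) pairs whose target matches no entry name exactly.

    alias_rows:     iterable of (country, alias_name) tuples (assumed layout)
    code_list_rows: iterable of (country, location, name) tuples (assumed layout)
    """
    # Index the known entry names per country for exact-match lookups.
    names_by_country = {}
    for country, _location, name in code_list_rows:
        names_by_country.setdefault(country, set()).add(name)

    unresolved = []
    for country, alias in alias_rows:
        # Split "Old name = Current name" at the first " = ".
        _old, sep, target = alias.partition(" = ")
        if not sep:
            continue  # not in the expected alias form, skip it
        if target not in names_by_country.get(country, set()):
            unresolved.append((country, target))
    return unresolved


# Sample rows taken from the entries quoted in this issue.
code_list = [
    ("BE", "ANR", "Antwerpen"),
    ("GL", "JFR", "Paamiut (Fredrikshaab)"),
]
aliases = [
    ("BE", "Antwerp = Antwerpen"),
    ("GL", "Fredrikshaab = Paamiut"),
]

for country, target in find_unresolved_aliases(aliases, code_list):
    print(f"{target} in {country} not found!")
```

With these sample rows, the Antwerp alias resolves to BEANR, while the Greenland alias does not resolve because "Paamiut" is not an exact match for "Paamiut (Fredrikshaab)".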

sabas commented 3 days ago

alias.csv is created by stripping the rows with = as status in the original dataset. Perhaps the rows were kept while the name changed in one of the latest releases?
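
A rough sketch of the extraction step described above, to make the suspected failure mode concrete. This is not the repository's actual script: the column names (`Change`, `Country`, `Location`, `Name`, `NameWoDiacritics`), the choice of `Change` as the column holding the = marker, and the function name `extract_alias_rows` are all assumptions, and the sample input is illustrative only.

```python
import csv
import io

# Illustrative input only: column names and layout are assumptions,
# with rows modelled on the entries quoted in this issue.
SAMPLE = """Change,Country,Location,Name,NameWoDiacritics
,BE,ANR,Antwerpen,Antwerpen
=,BE,,Antwerp = Antwerpen,Antwerp = Antwerpen
,GL,JFR,Paamiut (Fredrikshaab),Paamiut (Fredrikshaab)
=,GL,,Fredrikshaab = Paamiut,Fredrikshaab = Paamiut
"""


def extract_alias_rows(source):
    """Collect (country, name, name_without_diacritics) for rows marked '='."""
    reader = csv.DictReader(source)
    return [
        (row["Country"], row["Name"], row["NameWoDiacritics"])
        for row in reader
        if row["Change"] == "="  # keep only the alias rows
    ]


alias_rows = extract_alias_rows(io.StringIO(SAMPLE))
for row in alias_rows:
    print(",".join(row))
```

A filter like this copies each = row verbatim, so if the main entry were renamed while the = row stayed the same, the extracted alias would keep pointing at the old name.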

cristan commented 3 days ago

No, that's not it. Many aliases were changed in 2024-1, so they are as new as they can be, and many of these are wrong. In fact, it already goes wrong at the second new entry:

GL,Fredrikshaab = Paamiut,Fredrikshaab = Paamiut

However, GLJFR has the name Paamiut (Fredrikshaab), and that's not in the aliases.