hroptatyr / dateutils

nifty command line date and time utilities; fast date calculations and conversion in the shell
http://www.fresse.org/dateutils/

tzmaps branch is missing a couple of new popular airports #167

Open ysangkok opened 2 weeks ago

ysangkok commented 2 weeks ago

By using the following SPARQL query on http://query.wikidata.org/

SELECT DISTINCT ?item ?itemLabel ?iata
WHERE
{
  ?item wdt:P3872 ?patronage.
  ?item wdt:P238 ?iata.
  FILTER (?patronage > 500000).
  FILTER NOT EXISTS {?item wdt:P576 ?demolishedDate}.
  FILTER NOT EXISTS {?item wdt:P1366 ?replacedBy}.
  FILTER NOT EXISTS {?item wdt:P31 wd:Q15893266}. # filter out instances of 'former entity'
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}

and saving to CSV as wikidata.csv, and then running this script, I find a couple of airports that aren't in here:

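import csv

# Load iata.tzmap (tab-separated): first column -> second column.
tzmap = {}
with open("iata.tzmap", newline="") as f:
    for row in csv.reader(f, delimiter="\t"):
        # Every row must have a non-empty key and value.
        assert row[0]
        assert row[1]
        tzmap[row[0]] = row[1]

# Report each airport from the Wikidata export whose IATA code is not in the map.
with open("wikidata.csv", newline="") as f:
    for row in csv.DictReader(f):
        iata = row["iata"]
        airport_name = row["itemLabel"]
        if iata not in tzmap:
            print(f"{airport_name=} with {iata=} is missing in iata.tzmap")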

emits

airport_name='Gagarin International Airport' with iata='GSV' is missing in iata.tzmap
airport_name='Nursultan Nazarbayev International Airport' with iata='NQZ' is missing in iata.tzmap
airport_name='Dazhou Jinya Airport' with iata='DZH' is missing in iata.tzmap
airport_name='Rize–Artvin Airport' with iata='RZV' is missing in iata.tzmap
airport_name='Bicol International Airport' with iata='DRP' is missing in iata.tzmap
airport_name='Chengdu Tianfu International Airport' with iata='TFU' is missing in iata.tzmap

Gagarin opened in 2019, Bicol and Chengdu Tianfu in 2021, and Dazhou Jinya and Rize–Artvin in 2022. Their recent openings might explain why they are missing.

Wikipedia notes that Nursultan Nazarbayev International Airport changed its IATA code:

On 8 June 2020 the airport officially changed its three-character IATA airport code from TSE to NQZ.
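
A renamed code like this could be told apart from a genuinely new airport by checking whether the former code is still in the map. A minimal sketch, assuming a hand-maintained RENAMED table and the helper names load_tzmap and suggest_zone (all hypothetical, not part of dateutils), and the same tab-separated layout the script above reads. The sample rows are illustrative, not copied from the real iata.tzmap:

```python
import csv
import io

# Hypothetical rename table (new code -> former code); only the
# TSE -> NQZ change comes from this thread.
RENAMED = {"NQZ": "TSE"}

def load_tzmap(f):
    """Read tab-separated 'code<TAB>zone' rows into a dict."""
    return {row[0]: row[1] for row in csv.reader(f, delimiter="\t") if len(row) >= 2}

def suggest_zone(code, tzmap):
    """Return the zone for code, falling back to its former code if renamed."""
    if code in tzmap:
        return tzmap[code]
    old = RENAMED.get(code)
    return tzmap.get(old) if old else None

# Illustrative sample data only (assumption, not the real file contents).
sample = io.StringIO("TSE\tAsia/Almaty\nFRA\tEurope/Berlin\n")
tzmap = load_tzmap(sample)
print(suggest_zone("NQZ", tzmap))  # zone inherited from the former code TSE
print(suggest_zone("GSV", tzmap))  # None: no entry and no known rename
```

With something like this, NQZ would show up as "renamed, zone already known" instead of "missing", while the five newly opened airports would still be reported.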

hroptatyr commented 2 weeks ago

Hi Janus, thanks for your research. I've got a much bigger list of changes sourced from IATA and geonames.

In your query I think you can skip the wdt:P576 and wdt:P1366 filters. The list is supposed to contain historic entries if they are currently unused.
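
Skipping those two filters amounts to dropping two FILTER NOT EXISTS lines from the query. A rough sketch that builds both variants from one template, so the two result sets can be compared. The function build_query and its include_historic flag are hypothetical names, and keeping the 'former entity' (P31) filter in both variants is an assumption, since the comment above only mentions P576 and P1366:

```python
# The two filters suggested for removal (dissolved / replaced entities).
HISTORIC_FILTERS = [
    "FILTER NOT EXISTS {?item wdt:P576 ?demolishedDate}.",
    "FILTER NOT EXISTS {?item wdt:P1366 ?replacedBy}.",
]

def build_query(include_historic=True):
    """Return the SPARQL text; historic entries are kept when include_historic is True."""
    lines = [
        "SELECT DISTINCT ?item ?itemLabel ?iata",
        "WHERE",
        "{",
        "  ?item wdt:P3872 ?patronage.",
        "  ?item wdt:P238 ?iata.",
        "  FILTER (?patronage > 500000).",
    ]
    if not include_historic:
        # Original behaviour: exclude dissolved and replaced airports.
        lines += ["  " + f for f in HISTORIC_FILTERS]
    lines += [
        # Assumption: the 'former entity' filter stays in both variants.
        "  FILTER NOT EXISTS {?item wdt:P31 wd:Q15893266}.",
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }',
        "}",
    ]
    return "\n".join(lines)

print(build_query())
```

Calling build_query(include_historic=False) reproduces the filters of the query from the first comment.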

I'm trying to merge the changes within the next week, and maybe even publish them back to Wikidata. There are a lot of changes because this isn't my main area and I neglected updating the list.