Open reedy opened 1 year ago
I doubt the fix to https://github.com/jpatokal/openflights/issues/1116 is to blame....
But I'm guessing something to do with differing plane ids locally..?
mysql> select * from planes;
+-------------------------+------+-------+------+--------+
| name | abbr | speed | plid | public |
+-------------------------+------+-------+------+--------+
| Airbus A320 | NULL | NULL | 1 | NULL |
| Boeing 777-300ER | NULL | NULL | 2 | NULL |
| Boeing 737-MAX9 | NULL | NULL | 3 | NULL |
| Boeing 737-800 | NULL | NULL | 4 | NULL |
| Embraer ERJ175 | NULL | NULL | 5 | NULL |
| Airbus A319 | NULL | NULL | 6 | NULL |
| Airbus A321NEO | NULL | NULL | 7 | NULL |
| Embraer E175 | NULL | NULL | 8 | NULL |
| Airbus A321 | NULL | NULL | 9 | NULL |
| Airbus A350-900 | NULL | NULL | 10 | NULL |
| Airbus A330-300 | NULL | NULL | 11 | NULL |
| Embraer E190 | NULL | NULL | 12 | NULL |
| Airbus A320NEO | NULL | NULL | 13 | NULL |
| Embraer 190 | NULL | NULL | 14 | NULL |
| Boeing 777-200 | NULL | NULL | 15 | NULL |
| Boeing 787-8 | NULL | NULL | 16 | NULL |
| BAE Jetstream J41 | NULL | NULL | 17 | NULL |
| Airbus A380 | NULL | NULL | 18 | NULL |
| Embraer E190SR | NULL | NULL | 19 | NULL |
| Bombardier CRJ-900 | NULL | NULL | 20 | NULL |
| Boeing 737-700 | NULL | NULL | 21 | NULL |
| Boeing 747-400 | NULL | NULL | 22 | NULL |
| Embraer RJ145 | NULL | NULL | 23 | NULL |
| Airbus A340-600 | NULL | NULL | 24 | NULL |
| Boeing 777-200ER | NULL | NULL | 25 | NULL |
| Bombardier CRJ-1000 | NULL | NULL | 26 | NULL |
| Airbus A350-1000 | NULL | NULL | 27 | NULL |
| Boeing 767-300 | NULL | NULL | 28 | NULL |
| Airbus A330-200 | NULL | NULL | 29 | NULL |
| Boeing 757-200 | NULL | NULL | 30 | NULL |
| Boeing 787-9 | NULL | NULL | 31 | NULL |
| Boeing 777-200LR | NULL | NULL | 32 | NULL |
| Airbus A318 | NULL | NULL | 33 | NULL |
| McDonnell Douglas MD-83 | NULL | NULL | 34 | NULL |
| Boeing 767-300ER | NULL | NULL | 35 | NULL |
| Embraer E170 | NULL | NULL | 36 | NULL |
| Canadair CRJ-700 | NULL | NULL | 37 | NULL |
| Canadair CRJ-900 | NULL | NULL | 38 | NULL |
| McDonnell Douglas MD83 | NULL | NULL | 39 | NULL |
| Boeing 787-800 | NULL | NULL | 40 | NULL |
| Bombardier Q400 | NULL | NULL | 41 | NULL |
| Boeing 737-400 | NULL | NULL | 42 | NULL |
| Boeing 737-300 | NULL | NULL | 43 | NULL |
| Airbus A321T | NULL | NULL | 44 | NULL |
| Airbus A320-200 | NULL | NULL | 45 | NULL |
| Avro RJ85 | NULL | NULL | 46 | NULL |
| Dornier J328 | NULL | NULL | 47 | NULL |
| Fokker 100 | NULL | NULL | 48 | NULL |
| Airbus A321-200 | NULL | NULL | 49 | NULL |
| ATR 72 | NULL | NULL | 50 | NULL |
| Saab 2000 | NULL | NULL | 51 | NULL |
| McDonnell Douglas MD80 | NULL | NULL | 52 | NULL |
| Canadair CRJ-200 | NULL | NULL | 53 | NULL |
| Embraer ERJ-145 | NULL | NULL | 54 | NULL |
| Boeing 767-200 | NULL | NULL | 55 | NULL |
| McDonnell Douglas MD82 | NULL | NULL | 56 | NULL |
| Fokker 70 | NULL | NULL | 57 | NULL |
| BAe 146-200 | NULL | NULL | 58 | NULL |
| Boeing 767 | NULL | NULL | 59 | NULL |
+-------------------------+------+-------+------+--------+
59 rows in set (0.00 sec)
If I truncate the table, then reimport... They all come in empty...
Caused by https://github.com/jpatokal/openflights/commit/2d825118307d17e34b68dd99f3cceea1b605101b as the plane id (plid
) is actually being exported?
But I'm guessing something to do with differing plane ids locally..?
mysql> select * from planes; +-------------------------+------+-------+------+--------+ | name | abbr | speed | plid | public | +-------------------------+------+-------+------+--------+ | Airbus A320 | NULL | NULL | 1 | NULL | snip | Boeing 767 | NULL | NULL | 59 | NULL | +-------------------------+------+-------+------+--------+ 59 rows in set (0.00 sec)
I notice that planes.dat
does not contain a plid
column. With the live prod database having been updated multiple times, I don't think we can easily replicate the IDs ourselves.
Caused by 2d82511 as the plane id (
plid
) is actually being exported?
@reedy, I've looked into this today and I don't think this is a bug.
The export/backup functionality will export the database IDs for multiple entities:
Data | CSV column name |
---|---|
Origin Airport | From_OID |
Destination Airport | To_OID |
Airline | Airline_OID |
Plane Type | Plane_OID |
When present in the upload payload, this allows the import code to easily match entities even if they were renamed or otherwise updated between the backup creation and upload. Importantly, during parsing, these take priority over the string columns (e.g. airline
, plane
) [^1].
airports.dat
and airlines.dat
both contain an ID column which load-data.sql
uses when seeding the database with data. This makes the IDs consistent between the real openflights.org and dev setups.
However, planes.dat
does not have an ID column. As a result, your local instance with either fail to match planes or match them incorrectly.
The solution is to wipe the Plane_OID
column (best to wipe all the OID columns) in your CSV before doing an import on the local instance. I've tested this with my dev setup.
The data files have not been updated in a while (7 years for airlines.dat
, 4 years for both planes.dat
and airports.dat
) so some entries are marked as "No matches, will be added as new" (e.g. Wizz Air UK).
[^1]: For example, https://github.com/jpatokal/openflights/blob/2f86a25d09c0721a2b37aa464bd7c474244fdeab/php/import.php#L497-L505 uses the Airline ID if present and otherwise falls back to string matching.
See my comment on 992 - https://github.com/jpatokal/openflights/pull/992#issuecomment-1602756677
Users are allowed to enter their own plane designations so I’m curious as to how this might play out here as well. They may or may not confirm to existing ids, even if there were an Id column.
It doesn't change the situation much here. I only mentioned the OIDs matching as an explanation for why this only happened to planes. You should wipe all the OIDs when moving backups across database instances.
For a local install, you have to clear that column (and the other OID columns as well). The import process will automatically create new entities for you if it can't match them based on the name.
I note after deleting the databases, the updated imports etc.. Then importing my latest backup from openflights.org, the planes are wrong or mostly missing
You need to delete the OIDs from the CSV when importing across instances to get the new instance to create whatever is not present in the database - try it. This only affects people moving data across instances.
Aha, thanks. I wonder if this should almost be a difference between "backup" and "export"...
Probably should be documented somewhere (on the site) too
Aha, thanks. I wonder if this should almost be a difference between "backup" and "export"...
I think exporting them is correct because the names can (and do, even if infrequently) change while the IDs are stable. This helps if your backup is not recent. I think this is WAD. Let's be honest, either you're using openflights.org or you're running your own instance. I think this is only an issue for us. :)
Probably should be documented somewhere (on the site) too
I'm not sure about the site itself since I don't believe this applies to users of openflights.org. How about the repo docs? You're probably going to read these if you're running your own instance.
On my dev install... I deleted all my flights (using the UI) and then imported a newer backup from openflights.org
It seems to have resulted in some weird plane type attributions, even though the exported CSV looks good
Should this be showing IDs or types?