The Public Health Environmental Surveillance Open Data Model (PHES-ODM, or ODM). A data model, dictionary and support tools for environmental surveillance.
Creative Commons Attribution Share Alike 4.0 International
ISO Code mistakes in the zones table of version 2.2; Repeating partIDs #267
1) There is a mistake in the zones GitHub CSVs. The isoCode field should be 2 characters wide and contain only the country code, not zone codes. In some instances zones are nested (to some extent), and a zone code is then used in the isoCode column.
For example, for Belgium we have:
| isoCode | isoZone | zoneName |
| --- | --- | --- |
| BE | BE-BRU | Brussels Hoofdstedelijk Gewest |
| BE | BE-VLG | Vlaams Gewest |
| BE | BE-WAL | Waals Gewest |
| BE-VLG | BE-VAN | Antwerpen |
| BE-VLG | BE-VBR | Vlaams-Brabant |
| BE-VLG | BE-VLI | Limburg |
| BE-VLG | BE-VOV | Oost-Vlaanderen |
| BE-VLG | BE-VWV | West-Vlaanderen |
| BE-WAL | BE-WBR | Brabant wallon |
| BE-WAL | BE-WHT | Hainaut |
| BE-WAL | BE-WLG | Liège |
| BE-WAL | BE-WLX | Luxembourg |
| BE-WAL | BE-WNA | Namur |
BE-VLG and BE-WAL are larger areas that are subdivided into several zones. Brussels is its own zone. Because of this, however, the countries and zones tables cannot be properly linked. The error is:
"Column 'countries.isoCode' is not the same length or scale as referencing column 'zones.isoCode' in foreign key 'FK_zones.isoCode'. Columns participating in a foreign key relationship must be defined with the same length and scale."
The error is justified: to hold values such as BE-VLG, zones.isoCode has to be wider than the 2-character countries.isoCode, and there are no BE-VLG or BE-WAL records in the isoCode column of the countries table for those rows to reference.
The fix is to truncate the isoCode entries in the zones CSV that cite a zone code down to the 2-character country code.
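As a rough illustration of that truncation, here is a minimal Python sketch. It assumes the zones CSV uses the column names shown in the table above (isoCode, isoZone, zoneName). The inline sample data and the function name `truncate_iso_codes` are hypothetical and are not taken from the ODM repository or its tooling.

```python
import csv
import io

# Hypothetical excerpt of the zones CSV, with rows copied from the table above.
ZONES_CSV = """isoCode,isoZone,zoneName
BE,BE-BRU,Brussels Hoofdstedelijk Gewest
BE,BE-VLG,Vlaams Gewest
BE-VLG,BE-VAN,Antwerpen
BE-WAL,BE-WNA,Namur
"""


def truncate_iso_codes(csv_text: str) -> str:
    """Return the CSV text with every isoCode cut to its first two characters."""
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Keep only the 2-character country code; isoZone is left untouched.
        row["isoCode"] = row["isoCode"].strip()[:2]
        writer.writerow(row)
    return out.getvalue()


fixed = truncate_iso_codes(ZONES_CSV)
print(fixed)
```

Note that truncating drops the parent-region link (for example, that BE-VAN sits under BE-VLG). The sub-region is still recoverable from the isoZone value itself, but whether that nesting should be recorded elsewhere is outside the scope of this sketch.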
2) Another issue is that there are duplicate partIDs in the parts table. Databases reject this because partID is the table's primary key.
The fix is to rename one entry in each duplicate pair, taking care not to create new duplicates in the process.
The current culprits are:
| partID | partLabel |
| --- | --- |
| conc | Sample concentrate |
| conc | Concentration measure |
| frna | F-Specific RNA bacteriophages |
| fRNA | F+ RNA coliphage |
| n211i | n211i omicron-variant gene target |
| n211i | Omicron Variant n211i mutation |
| n679k | n679k omicron-variant gene target |
| n679k | Omicron Variant n679k mutation |
| n856k | n856k omicron-variant gene target |
| n856k | Omicron Variant n856k mutation |
| n969k | n969k omicron-variant gene target |
| n969k | Omicron Variant n969k mutation |
| p100l | p100l delta-variant gene target |
| p100l | Delta Variant p100l mutation |
| p2046l | p2046l delta-variant gene target |
| p2046l | Delta Variant p2046l mutation |
| p2287s | p2287s delta-variant gene target |
| p2287s | Delta Variant p2287s mutation |
| p3395h | p3395h omicron-variant gene target |
| p3395h | Omicron Variant p3395h mutation |
| p681r | p681r delta-variant gene target |
| p681r | Delta Variant p681r |
| phone | Contact phone |
| phone | Country national phone prefix |
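To check that a rename pass has not left (or created) duplicates, something like the following Python sketch could be run against the parts table. Comparing IDs case-insensitively is an assumption on my part, motivated by the frna / fRNA pair, which only collides when the database compares keys without regard to case. The function name `find_duplicate_part_ids` and the last sample row are hypothetical.

```python
from collections import defaultdict

# A few (partID, partLabel) rows copied from the table above.
PARTS = [
    ("conc", "Sample concentrate"),
    ("conc", "Concentration measure"),
    ("frna", "F-Specific RNA bacteriophages"),
    ("fRNA", "F+ RNA coliphage"),
    ("phone", "Contact phone"),
    ("phone", "Country national phone prefix"),
    # Hypothetical non-duplicate row, included only to show it is not flagged.
    ("exampleUniqueID", "Hypothetical unique part"),
]


def find_duplicate_part_ids(parts):
    """Group rows by case-folded partID and return the IDs used more than once."""
    groups = defaultdict(list)
    for part_id, label in parts:
        # Assumption: keys are compared case-insensitively, as some
        # database collations do.
        groups[part_id.casefold()].append((part_id, label))
    return {key: rows for key, rows in groups.items() if len(rows) > 1}


duplicates = find_duplicate_part_ids(PARTS)
for key, rows in duplicates.items():
    print(key, "->", [label for _, label in rows])
```

Running the same check again after renaming is a cheap way to confirm that the result comes back empty.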
I'll turn this issue into a PR and try to merge this patch fix by EOD, and launch a v2.2.1 release.
This issue was raised by @sorinsion via email.