OHDSI / CommonDataModel

Definition and DDLs for the OMOP Common Data Model (CDM)
https://ohdsi.github.io/CommonDataModel
896 stars 451 forks source link

Location Table Updates #91

Closed clairblacketer closed 6 years ago

clairblacketer commented 7 years ago

Location Table Updates


Proposal

Relevant table: Location

Field Required Type Description
location_id Yes integer A unique identifier for each geographic location.
country No varchar(3) The ISO country code
address_1 No varchar(50) The address field 1, typically used for the street address, as it appears in the source data.
address_2 No varchar(50) The address field 2, typically used for additional detail such as buildings, suites, floors, as it appears in the source data.
city No varchar(50) The city field as it appears in the source data.
state No varchar(2) The state field as it appears in the source data.
zip No varchar(9) The zip or postal code.
county No varchar(20) The county.
region No varchar(50) The region field as it appears in the source data.
location_source_value No varchar(50) The verbatim information that is used to uniquely identify the location as it appears in the source data.
latitude No float The geocoded latitude
longitude No float The geocoded longitude

Conventions

JaehyeongCho commented 6 years ago

I think inputting the location information in the observation table would be an alternative for 'Add a way to capture region'

JaehyeongCho commented 6 years ago

I would like to know when the extended location model presented in this issue will be reflected in the OMOP-CDM version.

Will it be reflected in the next OMOP-CDM update?

gowthamrao commented 6 years ago

I second that. Let's get the least controversial fields latitude and longitude in to the location table, at least

cgreich commented 6 years ago

@gowthamrao: What's controversial about the others? I wouldn't call the field "zip" but postalcode (zip is an utterly American thing), and I might challenge whether we can merge the region or county fields, but that's not very controversial. If you think it's ready bring it on. @clairblacketer holds the meeting agenda strings.

gowthamrao commented 6 years ago

Exactly. You just mentioned all the controversies. Region, zip, etc. Lat and long are non controversial. We need to propose that ASAP and try to incorporate into CDM.

cgreich commented 6 years ago

Bring it on.

clairblacketer commented 6 years ago

@gowthamrao @cgreich I'll add this to the agenda for next Tuesday and I will invite Robert Miller as well from the GIS workgroup.

cgreich commented 6 years ago

Well, all of those guys say they just want country (which I don't understand why, since the value will always be the same in a database, with rarest exceptions). Only Robert has GIS, and the Koreans. Want to invite those?

clairblacketer commented 6 years ago

Sure - I invited Robert Miller already and I will extend the invite to the Koreans as well.

rtmill commented 6 years ago

We didn't have time during the CDM call today to discuss this proposal so I figured I'd put my two cents down while the idea is still fresh. To preface, this ignores the other discussion of how to enable the location table to be internationally friendly (in addition to adding a field to represent altitude as Melanie Philofsky has suggested) as I assume it's best to distinguish the topics.

In short, my revised proposal is nearly the exact format that @gowthamrao put forward in his proposal (happy 1 year anniversary Gowtham's post!). The only difference would be to increase the size limit of the country field to 100 characters or so, just in case. TODO: Data mine Gowtham's forum post history, pass off ideas as my own

Is there anyone who disagrees with adding latitude and longitude columns? It appears to be low hanging fruit that we could push forward if we get hung up on the other modifications to the location table.

Differences from current proposal

1 ) Use free text instead of ISO codes to represent countries

2 ) Do not maintain a reference to a region within the location table

Field Required Type Description
location_id Yes integer A unique identifier for each geographic location.
address_1 No varchar(50) The address field 1, typically used for the street address, as it appears in the source data.
address_2 No varchar(50) The address field 2, typically used for additional detail such as buildings, suites, floors, as it appears in the source data.
city No varchar(50) The city field as it appears in the source data.
state No varchar(2) The state field as it appears in the source data.
zip No varchar(9) The zip or postal code.
county No varchar(20) The county.
country No varchar(100) The country as it appears in the source data
location_source_value No varchar(50) The verbatim information that is used to uniquely identify the location as it appears in the source data.
latitude No float The geocoded latitude
longitude No float The geocoded longitude
gowthamrao commented 6 years ago

@rtmill good post.

Altitude - can't it be derived at analytics time? Are there any use case to have altitude only on location table without lat and long? If lat and long are prerequisites, then can't altitude be derived from it thru a lookup?

MelaniePhilofsky commented 6 years ago

We, University of Colorado Denver, are interested in altitude data as a criteria to include patients in a cohort.

clairblacketer commented 6 years ago

added in v6.0