OHDSI / CommonDataModel

Definition and DDLs for the OMOP Common Data Model (CDM)
https://ohdsi.github.io/CommonDataModel

Make Location Table International #365

Closed PRijnbeek closed 2 years ago

PRijnbeek commented 3 years ago

In the EHDEN project we have many data sources that have location data, but it is always a bit embarrassing to have to explain that this table is only for the Americans :) There is no country field, state has only 2 positions, and zip has only 9.

Could we take this into consideration in the next update? Or is this already on the roadmap?
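To make the width limits concrete, here is a minimal sketch that checks values against the two limits quoted above (2 positions for state, 9 for zip). The helper name `overflowing_fields` and the sample rows are hypothetical, made up for illustration, and not taken from any data source or from the CDM tooling.

```python
# Column widths as quoted in this thread: state has 2 positions, zip has 9.
FIELD_WIDTHS = {"state": 2, "zip": 9}


def overflowing_fields(row, widths=FIELD_WIDTHS):
    """Return the names of fields whose values exceed the column width.

    Hypothetical helper for illustration; missing or None values count
    as empty strings.
    """
    return [
        name
        for name, width in widths.items()
        if len(row.get(name) or "") > width
    ]


# Illustrative rows (invented for this sketch).
us_row = {"state": "MA", "zip": "02139"}
nl_row = {"state": "Zuid-Holland", "zip": "3015 GD"}

print(overflowing_fields(us_row))  # []
print(overflowing_fields(nl_row))  # ['state']
```

A province name written out in full does not fit in a 2-character state column, and a row like this still has nowhere to record the country.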

cgreich commented 3 years ago

Not at all. This is where you live: 42032261.

ericaVoss commented 3 years ago

[image]

gklebanov commented 3 years ago

https://athena.ohdsi.org/search-terms/terms/42032261

OSM is international https://www.openstreetmap.org/relation/1411101

We do have a country field, but zip codes and states would be good to review.

clairblacketer commented 3 years ago

@PRijnbeek did Christian answer your question or is there more to be done here?

PRijnbeek commented 3 years ago

No. The current location table has these fields:

[image: location table fields]

It is unclear to me what @cgreich and @gklebanov are referring to. As far as I can see, the location table has no country field, contrary to what @gklebanov suggests. It has state varchar(2) and county. It is also unclear where this OSM is used. What am I, and many others, missing?

gklebanov commented 3 years ago

@PRijnbeek oh, no, apologies for misleading you. I just realized the "country" field was never actually added to the location table. I vividly remember that discussion: "someone" argued that CDM data does not span multiple countries, so there is no need for a "country" field. I still believe that, while this is probably true in most cases, it is not true in all of them, and a "country" field would be good to have, even if only for consistency.

As for states / provinces, I assume that means using the ISO 3166-1 (countries) and ISO 3166-2 (subdivisions) standards. https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes

I agree with Peter that we should at least start with a session reviewing the existing design / decisions so that there is no place left for assumptions and we better document it, including the use outside of the US.

@clairblacketer - would it be a good topic for our next CDM 6.x discussion?

PRijnbeek commented 3 years ago

I agree that this is true for 99% of the data sources, but we already had one example where patients are sent to a hospital across the border and that fact is recorded. I am not pushing for this because I see clear use cases at the moment; rather, a table that is not focused only on the US would be a good way to show the global focus of OHDSI. It would also avoid having the same discussion in every call where we explain the location table and cannot answer why people cannot fill in their region and country.

gklebanov commented 3 years ago

Yes, even BioBank UK, which we just did, would be an interesting example of data to look at: UK vs. England, Scotland, Wales, all countries. https://en.wikipedia.org/wiki/ISO_3166-2:GB#:~:text=ISO%203166%2D2%3AGB%20is,states)%20of%20all%20countries%20coded
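As a rough illustration of how an ISO 3166-2 code carries both levels, the sketch below splits a code such as `GB-SCT` into its ISO 3166-1 alpha-2 country part and its subdivision part. The function name `split_subdivision` is hypothetical, and the pattern only checks the shape of the code, not whether it is actually assigned in the standard.

```python
import re

# Shape of an ISO 3166-2 code: an ISO 3166-1 alpha-2 country code, a hyphen,
# then one to three alphanumeric characters for the subdivision.
ISO_3166_2_SHAPE = re.compile(r"([A-Z]{2})-([A-Z0-9]{1,3})")


def split_subdivision(code):
    """Split an ISO 3166-2 style code into (country, subdivision).

    Hypothetical helper: validates the format only, not that the code exists.
    """
    match = ISO_3166_2_SHAPE.fullmatch(code.strip().upper())
    if match is None:
        raise ValueError(f"not an ISO 3166-2 style code: {code!r}")
    return match.group(1), match.group(2)


print(split_subdivision("GB-SCT"))  # ('GB', 'SCT')
print(split_subdivision("us-ca"))   # ('US', 'CA')
```

Under this reading, Scotland is a subdivision of country `GB` rather than a separate country value, so a region field and a country field would be filled from one code.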

clairblacketer commented 3 years ago

@gklebanov and @PRijnbeek yes let's definitely add this as a CDM v6.1 update. Here's what it looks like we need:

Also related to https://github.com/OHDSI/CommonDataModel/issues/252

cgreich commented 3 years ago

Friends:

The whole V6/datetime debacle has convinced me that we need to be REALLY strict on our modelling approach. Like Supreme Courts: they do not take up cases unless somebody files them. Same for us. We need a use case. Calling Scotland a country doesn't make it one. It is still (!) part of the UK, and you can send a letter from within the UK to any Scot without specifying another country - the post code is sufficient. Same is true for Catalonia, Transnistria, Nagorno-Karabakh. Until these places become countries they are still regions.

So, here is my hard-assed approach. Please convince me of the opposite:

So, I would claim we stop the business of storing address information (which we never have anyway). Leaves us with the zip code in the US. Or some regional information, like in the UK, or DA Germany. We would need to add these to the Geography domain. But that is a Vocabulary job. Please submit an issue.

The Location table should have just two main fields: location_id and location_concept_id. And now that I have said that, @clairblacketer: I realize location_concept_id is missing from 6.0!!! How did that happen?
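A minimal sketch of what such a two-field table could look like, using an in-memory SQLite database purely for illustration. This is not the CDM DDL: the column types, constraints, and the helper name `build_minimal_location` are assumptions, and 42032261 is simply the concept ID quoted earlier in this thread.

```python
import sqlite3


def build_minimal_location(conn):
    """Create a two-column location table (illustrative, not the CDM DDL)."""
    conn.execute(
        """
        CREATE TABLE location (
            location_id         INTEGER PRIMARY KEY,
            location_concept_id INTEGER NOT NULL
        )
        """
    )


conn = sqlite3.connect(":memory:")
build_minimal_location(conn)

# Store one location that points at the Geography concept mentioned above.
conn.execute(
    "INSERT INTO location (location_id, location_concept_id) VALUES (?, ?)",
    (1, 42032261),
)

rows = conn.execute(
    "SELECT location_id, location_concept_id FROM location"
).fetchall()
print(rows)  # [(1, 42032261)]
```

In this shape, everything about the place (country, region, postal area) would have to be resolved through the vocabulary rather than through free-text columns.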

PRijnbeek commented 3 years ago

Thanks Christian, but you actually prove my point. It is simply not there now in any CDM version. Your proposal of using the concept_id above makes no sense since there is no field to store it in.

The current version of this table is not international, period.

I also think this is a completely different discussion than the datetime proposal.

I agree the Scotland, UK example does not help here.

cgreich commented 3 years ago

Correct. I realized after giving the entire Sermon that the obvious solution is not there today. :( But we should push it in. We can have any patient from any place in any database.

The datetime proposal I only mentioned because of the egg we are scraping off our faces. Should have never done this without an analytical use case.

Jake-Gillberg commented 3 years ago

"So, I would claim we stop the business of storing address information (which we never have anyway)"

Given this comment and the discussion in the CDM workgroup meeting on 2020-03-16, I'd like to mention that the GIS workgroup is looking at ways to leverage address information (which most health systems do have) in network studies, either by geocoding it and combining it with external geographic attribute / exposure data, or by rolling it up to regional aggregates.