OHDSI / ETL-CMS

Workproducts to ETL CMS datasets into OMOP Common Data Model
Apache License 2.0

location_id is null for all care_site records #61

Closed larchiu closed 2 years ago

larchiu commented 4 years ago

Hello,

I downloaded the SynPUF dataset and noticed something off with the care_site table records, so I'd like to verify (ftp://ftp.ohdsi.org/synpuf/care_site.csv.gz).

Is it expected that location_id is null for all care_site records?

Is there a way to associate each of the care_site records to specific location_id(s)?

Thanks, Laurence

ChristopheLambert commented 3 years ago

While the original data provided identifiers for both the provider and the provider institution, there does not appear to be any location information for either in the source data. Hence location_id is set to null in care_site to signify that the location is not known.
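To confirm this on a downloaded copy, a minimal sketch along these lines counts care_site rows with an empty location_id. It assumes the CSV has a header row containing a `location_id` column; the actual layout of care_site.csv.gz may differ, and `count_missing_location` and the sample rows are hypothetical.

```python
import csv
import io


def count_missing_location(rows):
    """Return (total_rows, rows_with_empty_location_id) for dict-like rows."""
    total = 0
    missing = 0
    for row in rows:
        total += 1
        # Treat an absent, empty, or whitespace-only value as null.
        if not (row.get("location_id") or "").strip():
            missing += 1
    return total, missing


# Tiny in-memory stand-in for the real file (assumed column names).
sample = io.StringIO(
    "care_site_id,location_id,care_site_source_value\n"
    "1,,A100\n"
    "2,,B200\n"
)
total, missing = count_missing_location(csv.DictReader(sample))
print(f"{missing} of {total} care_site rows have no location_id")
```

For the real file, the same function could be fed `csv.DictReader(gzip.open(path, "rt", newline=""))`, adjusting the delimiter or supplying field names if the file has no header.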

Patients, on the other hand, do have location information down to the state and FIPS county code. One could take the most frequent residence state among patients visiting a given care_site as a reasonable guess of its location, assuming the identifiers are consistent rather than randomized or blinded by the synthetic transformation process that created these pseudo-patients. For instance, the Carrier Claims records have a "Provider Institution Tax Number" field, but there are far more distinct values than there are healthcare institutions in the US, suggesting they are random numbers.
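The most-frequent-state heuristic could be sketched roughly as follows. This is only an illustration: it assumes visits have already been reduced to (person_id, care_site_id) pairs (for example from visit_occurrence) and that each person's state has been looked up through person.location_id. The function name `infer_care_site_state` and the sample data are hypothetical.

```python
from collections import Counter, defaultdict


def infer_care_site_state(visits, person_state):
    """Guess each care_site's state as the most common patient state.

    visits: iterable of (person_id, care_site_id) pairs.
    person_state: dict mapping person_id to a state code.
    Returns {care_site_id: (state, share_of_visits_from_that_state)}.
    """
    counts = defaultdict(Counter)
    for person_id, care_site_id in visits:
        state = person_state.get(person_id)
        # Skip visits with no care_site or no known patient state.
        if care_site_id is not None and state:
            counts[care_site_id][state] += 1

    guesses = {}
    for care_site_id, state_counts in counts.items():
        state, n = state_counts.most_common(1)[0]
        guesses[care_site_id] = (state, n / sum(state_counts.values()))
    return guesses


# Made-up example data, not taken from SynPUF.
visits = [(1, 10), (2, 10), (3, 10), (4, 20)]
person_state = {1: "PA", 2: "PA", 3: "NJ", 4: "OH"}
guesses = infer_care_site_state(visits, person_state)
print(guesses)
```

The returned share gives a rough confidence signal: if the top state accounts for only a small fraction of visits across most care sites, that would be consistent with the identifiers having been randomized, in which case the guess should not be trusted.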

ChristopheLambert commented 2 years ago

Closing this issue, as it appears to be a limitation of the source data.

larchiu commented 2 years ago

Thanks @ChristopheLambert !