Closed leenadR closed 5 years ago
Is cost in the Dev branch format? This CDM definition is approved, but waiting 2018 ohdsi symposium for release as v6.0
No, I only use the release versions (tagged 5.2, 5.0, ...)
See line 598: https://github.com/OHDSI/CommonDataModel/blob/v5.2.0/PostgreSQL/OMOP%20CDM%20ddl%20-%20PostgreSQL.sql
Even in the dev-branch, the columns would not fit ... CREATE TABLE cost ( cost_id INTEGER NOT NULL , person_id INTEGER NOT NULL, cost_event_id INTEGER NOT NULL , cost_domain_id VARCHAR(20) NOT NULL ,
@leenadR where did you download this? Just want to make sure I know which copy you are working with. I'm assuming it is the new one.
I think we have seen this problem before, the COST_ID is the last column in this export of SYNPUF. :(
Tagging @leeevans
@leenadR this is an issue with the file format of the Synpuf 100k export file. It's been corrected in a replacement Synpuf 5% sample file that is now available for download.
You can use the 'CMS SynPUF 5% Sample' link at the bottom of the below web page to download the new file:
Many thanks, the 5% sample works just fine. => issue can be closed
Many thanks for providing the 100k download vie Google drive. However, I experienced some problems when importing the data.
drug_era
Importing drug_era fails: ERROR: invalid input syntax for integer: "2008-02-27"
Looking at the data, the columns don't match with the table definition in CDM (I checked also versions before and after 5.2): The third column should be the concept_id, not a date.
gunzip -c drug_era.gz | head 4107,528323,2008-02-27,2008-02-27,1,\000,215049 2060,722031,2008-03-11,2008-03-21,1,\000,105476 4938,710062,2008-02-14,2008-03-15,1,\000,258700
CREATE TABLE drug_era ( drug_era_id INTEGER NOT NULL , person_id INTEGER NOT NULL , drug_concept_id INTEGER NOT NULL , drug_era_start_date DATE NOT NULL , drug_era_end_date DATE NOT NULL , drug_exposure_count INTEGER NULL , gap_days INTEGER NULL ) ;
observation_period
Similarly, the data for observation_period still has the observation_period_start_datetime and observation_period_stop_datetime columns, which were deprecated before 5.2.
cost
Last, the cost data does not fit the column definition by the CDM. The import fails, because "Procedure" is not an INTEGER.
gunzip -c cost.gz | head 189455742,Procedure,0,44818668,\000,\000,10,10,0,\000,0,0,0,\000,\000,\000,\000,\000,\000,0,\000,543966571 189455742,Procedure,0,44818668,\000,\000,20,20,0,\000,0,0,0,\000,\000,\000,\000,\000,\000,0,\000,543966572 189455743,Procedure,0,44818668,\000,\000,10,10,0,\000,0,0,0,\000,\000,\000,\000,\000,\000,0,\000,543966573
CREATE TABLE cost ( cost_id INTEGER NOT NULL , cost_event_id INTEGER NOT NULL , cost_domain_id VARCHAR(20) NOT NULL , cost_type_concept_id INTEGER NOT NULL , currency_concept_id INTEGER NULL , total_charge NUMERIC NULL , total_cost NUMERIC NULL , total_paid NUMERIC NULL , paid_by_payer NUMERIC NULL , paid_by_patient NUMERIC NULL , paid_patient_copay NUMERIC NULL , paid_patient_coinsurance NUMERIC NULL , paid_patient_deductible NUMERIC NULL , paid_by_primary NUMERIC NULL , paid_ingredient_cost NUMERIC NULL , paid_dispensing_fee NUMERIC NULL , payer_plan_period_id INTEGER NULL , amount_allowed NUMERIC NULL , revenue_code_concept_id INTEGER NULL , reveue_code_source_value VARCHAR(50) NULL , drg_concept_id INTEGER NULL, drg_source_value VARCHAR(3) NULL ) ;