Closed cukarthik closed 6 years ago
Thanks for providing this. Right now the code and documentation we provide is postgreSQL-centric. We did't have SQL Server or Oracle to test those systems.
In order to properly support SQL Server and perhaps Oracle, we would need revised documentation that breaks out the documentation and SQL code into database-specific tracks. Please feel free to do this and initiate a pull request.
Note, there are other repositories that we swiped from to include the SQL code we did, making small modifications to make everything work with OMOP V5.0.0. In the long run it would probably be better to not duplicate across repositories the code to create the database, load the data and vocabulary files, create indices and create constraints. We compromised on this to have most things you needed to get up and running with the synpuf data, but it is probably going to bite us in the long run.
I was dismayed to find that the change from 5.0.0 to 5.0.1 of the OMOP CDM not only changed some fields but deleted and added new tables -- to me that is not a bug-fix release, as it breaks compatibility. In any event, going forward we need a way to maintain compatibility with not only OMOP CDM versions but also database systems (and versions). For instance, unlike with version 9.2, which we used, PostgreSQL 9.5 provides support for bulk-loading gzipped .csv files, removing the inconvenience of having to uncompress these huge files.
Thanks @ChristopheLambert. That makes sense. I figured I'd share whatever I've done so far since we are using SQL Server. Btw, do you have the ACHILLES JSON files for this data set?
Hi @cukarthik
We do have the ACHILLES JSON files -- are they something helpful to post, or would they be site configuration specific?
Hi @ChristopheLambert , It would be helpful for us. We are setting up the synpuf 1 percent database for some students. If you have the JSON files, could we download them somewhere? Thanks.
We are re-creating to ensure it matches that of the uploaded data. Are you wanting the JSON files for the full dataset or just the first 1/20th subset or both?
Just the first 1/20th. Eventually we might want the full set.
Thanks
Sent from my phone
On Aug 27, 2016, at 12:44 AM, Christophe Lambert notifications@github.com<mailto:notifications@github.com> wrote:
We are re-creating to ensure it matches that of the uploaded data. Are you wanting the JSON files for the full dataset or just the first 1/20th subset or both?
— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHubhttps://github.com/OHDSI/ETL-CMS/issues/29#issuecomment-242869546, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AMakuU2wW3qjRwEGfCxM-c5f1uKzXGr5ks5qj2xlgaJpZM4JsWs_.
I uploaded a zip file containing the JSON files for AchillesWeb for the first 1/20th of the data here: ftp://ftp.ohdsi.org/synpuf/AchillesExport_1.zip
Please let me know if these work for you.
Hi @cukarthik, It would be helpful if you can share "create_CDMv5_drug_era_non_stockpile.sql" and "create_CDMv5_condition_era.sql" sql scripts for SQL server. Thanks.
Hi @rsurdikar, Unfortunately, we do not have those scripts since we did not include them in our database.
Thanks for the response.
Here's a script to load the synpuf data into SQL Server if you'd like to add it to the project.
TRUNCATE TABLE [care_site]; BULK INSERT [care_site] FROM 'C:\synpuf_1\care_site_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\care_site_1.bad', TABLOCK );
TRUNCATE TABLE [condition_occurrence]; BULK INSERT [condition_occurrence] FROM 'C:\synpuf_1\condition_occurrence_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\condition_occurrence_1.bad', TABLOCK );
TRUNCATE TABLE [death]; BULK INSERT [death] FROM 'C:\synpuf_1\death_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\death_1.bad', TABLOCK );
TRUNCATE TABLE [device_exposure]; BULK INSERT [device_exposure] FROM 'C:\synpuf_1\device_exposure_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\device_exposure_1.bad', TABLOCK );
TRUNCATE TABLE [drug_exposure]; BULK INSERT [drug_exposure] FROM 'C:\synpuf_1\drug_exposure_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\drug_exposure_1.bad', TABLOCK );
TRUNCATE TABLE [location]; BULK INSERT [location] FROM 'C:\synpuf_1\location_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\location_1.bad', TABLOCK );
TRUNCATE TABLE [measurement]; BULK INSERT [measurement] FROM 'C:\synpuf_1\measurement_occurrence_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\measurement_occurrence_1.bad', TABLOCK );
TRUNCATE TABLE [observation]; BULK INSERT [observation] FROM 'C:\synpuf_1\observation_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\observation_1.bad', TABLOCK );
TRUNCATE TABLE [observation_period]; BULK INSERT [observation_period] FROM 'C:\synpuf_1\observation_period_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\observation_period_1.bad', TABLOCK );
TRUNCATE TABLE [payer_plan_period]; BULK INSERT [payer_plan_period] FROM 'C:\synpuf_1\payer_plan_period_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\payer_plan_period_1.bad', TABLOCK );
TRUNCATE TABLE [person]; BULK INSERT [person] FROM 'C:\synpuf_1\person_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\person_1.bad', TABLOCK );
TRUNCATE TABLE [procedure_occurrence]; BULK INSERT [procedure_occurrence] FROM 'C:\synpuf_1\procedure_occurrence_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\procedure_occurrence_1.bad', TABLOCK );
TRUNCATE TABLE [provider]; BULK INSERT [provider] FROM 'C:\synpuf_1\provider_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\provider_1.bad', TABLOCK );
TRUNCATE TABLE [specimen]; BULK INSERT [specimen] FROM 'C:\synpuf_1\specimen_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\specimen_1.bad', TABLOCK );
TRUNCATE TABLE [visit_occurrence]; BULK INSERT [visit_occurrence] FROM 'C:\synpuf_1\visit_occurrence_1.csv' WITH ( FIRSTROW = 2, FIELDTERMINATOR = ',', ROWTERMINATOR = '0x0a', ERRORFILE = 'C:\synpuf_1\visit_occurrence_1.bad', TABLOCK );