OHDSI / ETL-CMS

Workproducts to ETL CMS datasets into OMOP Common Data Model
Apache License 2.0

Python dependency #17

Closed vojtechhuser closed 2 years ago

vojtechhuser commented 8 years ago

Inside the CMS VRDC, we are not able to run Python. We initiated a project to port this ETL to pure SQL that can run in SAS (SAS SQL flavor :-( ).

aguynamedryan commented 8 years ago

Sorry to hear that. If you'd like your SAS SQL implementation to live in this repository, please feel free to submit a pull request.

markdanese commented 8 years ago

Also, don't hesitate to let us know if you have any questions about the ETL spec or the test cases.

fabkury commented 8 years ago

Here is the repository with what we have so far (not much, I admit): https://github.com/fabkury/cms_vrdc_etl

In short, this is ETL code built to run inside the CMS VRDC. It is not a port of your ETL, but your work served as a good orienting example. We will be presenting the rationale and challenges of this in-VRDC ETL in a poster at the OHDSI Symposium next month. We referenced your Python work in the poster.

dckc commented 8 years ago

Can you run Java in the CMS VRDC? If so, consider Jython.

How about .NET? If so, IronPython might work.

On the other hand, SQL might be a more natural way to represent the ETL. I haven't studied the details.

dckc commented 8 years ago

Now that I have looked into the details a bit, using SQL does seem to be a good approach. In fact, our SamTheEagle project mostly uses SQL. We load the CSV files into Oracle and then use SQL to transform the data.
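A minimal sketch of that load-then-transform pattern, assuming SQLite as a stand-in for Oracle. This is not the SamTheEagle code (which is unpublished); the table names, column names, sample rows, and the `load_and_transform` helper are all hypothetical and only illustrate the shape of the approach:

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a CSV extract (made-up rows and columns).
SAMPLE_CSV = """bene_id,birth_dt,sex_cd
A1,19400215,1
B2,19520730,2
"""


def load_and_transform(csv_text):
    """Load CSV rows into a staging table, then transform them with SQL."""
    conn = sqlite3.connect(":memory:")  # assumption: SQLite instead of Oracle

    # Step 1: load the raw CSV values, untouched, into a staging table.
    conn.execute(
        "CREATE TABLE staging_beneficiary (bene_id TEXT, birth_dt TEXT, sex_cd TEXT)"
    )
    rows = csv.DictReader(io.StringIO(csv_text))
    conn.executemany(
        "INSERT INTO staging_beneficiary VALUES (:bene_id, :birth_dt, :sex_cd)",
        rows,
    )

    # Step 2: do the transformation in SQL rather than in Python.
    conn.execute(
        """
        CREATE TABLE person AS
        SELECT bene_id AS person_source_value,
               CAST(substr(birth_dt, 1, 4) AS INTEGER) AS year_of_birth,
               CASE sex_cd WHEN '1' THEN 'M' WHEN '2' THEN 'F' ELSE 'U' END
                   AS gender_source_value
        FROM staging_beneficiary
        """
    )
    return conn.execute(
        "SELECT person_source_value, year_of_birth, gender_source_value "
        "FROM person ORDER BY person_source_value"
    ).fetchall()


if __name__ == "__main__":
    for row in load_and_transform(SAMPLE_CSV):
        print(row)
```

Because the transformation lives entirely in the SQL statement, only the thin loading step depends on Python; in an environment without Python, that step would have to be replaced by whatever loader the environment offers.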

Our target is i2b2 rather than OMOP, but I think it would be interesting to factor out the common bits.

We haven't managed to publish the SQL code yet. IOU.

ChristopheLambert commented 2 years ago

Closing this request to port the repository to another language.