OHDSI / WhiteRabbit

WhiteRabbit is a small application that can be used to analyse the structure and contents of a database as preparation for designing an ETL. It comes with RabbitInAHat, an application for interactive design of an ETL to the OMOP Common Data Model with the help of the scan report generated by WhiteRabbit.
http://ohdsi.github.io/WhiteRabbit
Apache License 2.0

Using WhiteRabbit with Spark, Databricks (Delta Live tables) or Snowflake #394

Open solmazeradat opened 7 months ago

solmazeradat commented 7 months ago

Hi,

Hope you are well.

We are looking at building a pipeline where the data volume is on the order of terabits. We want to ensure that both the source data and the CDM data are compatible with big data analytical tools as well as the OHDSI analytical toolkit.

Since the scan report from the WhiteRabbit tool is integral to the mapping process, I wanted to check whether WhiteRabbit supports any of the following:

- Spark
- Databricks (Delta Live Tables)
- Snowflake

Many thanks, Solmaz

janblom commented 7 months ago

Hi,

Of the three data platforms you mention, support for Snowflake is currently being developed (some testing is being done as well). For the other two platforms, WhiteRabbit currently has no support, and none is planned. Keep an eye on this issue; I will update it when a (test) release is available here.

Given the data volume you mention, you might also want to verify if this issue could be an obstacle for you. Work on this is planned to happen soon (after the Snowflake support).

If all goes roughly as planned, I hope to have a release with both issues solved before the end of 2023.

Best regards, Jan

solmazeradat commented 7 months ago

Hi @janblom,

Thanks very much for the response. I will keep an eye out for the proposed releases with Snowflake support.

Solmaz

janblom commented 4 months ago

Release candidate 1 for WhiteRabbit 1.0.0, with Snowflake support, has just been made available: https://github.com/OHDSI/WhiteRabbit/releases/tag/v1.0.0-RC1

If you have an opportunity to test, any feedback will be highly appreciated.

Jan