Closed: cyongg closed this issue 6 days ago
Databricks is not a supported DB for hosting the WebAPI database. Only postgres is supported.
I'm not hosting the WebAPI database on Databricks; I'm hosting my OMOP CDM on Databricks. Our WebAPI database is on Postgres, so what should I do to make WebAPI work?
To enable Databricks, you can compile the .war with the Databricks profile per: https://github.com/OHDSI/WebAPI/issues/2262#issuecomment-1917673045
You modified the postgres-sql profile to change the datasource URL to Databricks, but that URL tells WebAPI where to find the WebAPI database itself. You should leave this alone and follow the setup instructions, specifying the URL of your own Postgres instance in your environment.
To point to a Databricks CDM, follow the instructions for setting up a source, using the dialect 'databricks' and the JDBC URL for your Databricks server. As an admin user, you can use the Configuration nav to add and remove sources.
I thought datasource.url was supposed to be the JDBC URL of Databricks. Isn't the datasource supposed to refer to where the CDM is located?
No. CDMs are referenced in the source table (please refer to the documentation indicated above). One WebAPI instance can reference multiple CDMs, which you can add and remove over time, so it doesn't make sense to store them in a configuration file. It's application data.
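As an illustration of sources being application data, here is a sketch (assuming WebAPI's tables live in the webapi schema, as elsewhere in this thread) of how you could inspect the sources WebAPI already knows about:

```sql
-- List the CDM sources currently registered in the WebAPI application database
-- (assumes the 'webapi' schema used throughout this thread)
SELECT source_id, source_name, source_key, source_dialect
FROM webapi.source
ORDER BY source_id;
```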
Does that mean I'm only required to use the SQL below for the source and source_daimon tables?
INSERT INTO webapi.source (source_id, source_name, source_key, source_connection, source_dialect)
SELECT nextval('webapi.source_sequence'), 'databricks', 'databricks', 'jdbc:spark://<workspace>:443/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/5269192184046240/0404-024850-qo6kj1mb;UseNativeQuery=1;AuthMech=3;UID=token;PWD=<token>', 'spark';
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority)
SELECT nextval('webapi.source_daimon_sequence'), source_id, 0, 'silver', 0
FROM webapi.source
WHERE source_key = 'databricks'
;
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority)
SELECT nextval('webapi.source_daimon_sequence'), source_id, 1, 'silver', 1
FROM webapi.source
WHERE source_key = 'databricks'
;
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority)
SELECT nextval('webapi.source_daimon_sequence'), source_id, 2, 'resultss', 1
FROM webapi.source
WHERE source_key = 'databricks'
;
INSERT INTO webapi.source_daimon (source_daimon_id, source_id, daimon_type, table_qualifier, priority)
SELECT nextval('webapi.source_daimon_sequence'), source_id, 5, 'temp', 0
FROM webapi.source
WHERE source_key = 'databricks'
;
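For reference, the daimon_type values used above map to WebAPI daimon roles: 0 = CDM, 1 = Vocabulary, 2 = Results, 5 = Temp. A quick sanity check after running the inserts could look like this (a sketch, assuming the same webapi schema and source_key as above):

```sql
-- Confirm all four daimons were registered for the 'databricks' source
-- daimon_type: 0 = CDM, 1 = Vocabulary, 2 = Results, 5 = Temp
SELECT sd.daimon_type, sd.table_qualifier, sd.priority
FROM webapi.source_daimon sd
JOIN webapi.source s ON s.source_id = sd.source_id
WHERE s.source_key = 'databricks'
ORDER BY sd.daimon_type;
```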
Hmm, but why is the configuration like this? It can't seem to identify my vocab version.
In Databricks, my CDM catalog name is omop540 and the schema name is silver, so how do I connect to it?
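One possibility (an assumption, not confirmed in this thread) is to qualify the daimon schemas with the catalog name, so that the SQL WebAPI generates resolves against the omop540 catalog rather than the cluster's default catalog:

```sql
-- Sketch: point the CDM (0) and Vocabulary (1) daimons at catalog 'omop540', schema 'silver'
-- (alternatively, ConnCatalog=omop540 can be set in the JDBC URL itself)
UPDATE webapi.source_daimon
SET table_qualifier = 'omop540.silver'
WHERE daimon_type IN (0, 1)
  AND source_id = (SELECT source_id FROM webapi.source WHERE source_key = 'databricks');
```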
@chanjunkai11 - your setup appears correct. Is your vocabulary table populated in your CDM? That is where the vocabulary version is obtained and displayed in ATLAS.
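ATLAS reads the version string from the CDM's vocabulary table, specifically the row with vocabulary_id = 'None'. A quick check against the Databricks CDM (assuming the catalog omop540 and schema silver mentioned earlier in the thread):

```sql
-- The OMOP vocabulary release string is stored on the 'None' vocabulary row
SELECT vocabulary_version
FROM omop540.silver.vocabulary
WHERE vocabulary_id = 'None';
```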
Since this issue is a bit old, I'm hoping you were able to figure this out, but please re-open if you are still facing difficulties.
Goal: trying to connect OHDSI Atlas to an OMOP CDM in Databricks, running on Tomcat.
Caused by: org.flywaydb.core.internal.dbsupport.FlywaySqlException: Unable to obtain Jdbc connection from DataSource
SQL State : 08S01 Error Code : 500593 Message : [Databricks]JDBCDriver Communication link failure. Failed to connect to server. Reason: HTTP Response code: 401, Error message: Unknown.
Process to recreate problem
Step 1: this is the settings.xml
Step 2: We have included the Databricks dependency in the pom.xml.
Step 3: We have copied the DatabricksJDBC42.jar file into the classpath (C:\Program Files\Apache Software Foundation\Tomcat 8.5\webapps\WebAPI\WEB-INF\lib).
JDBC URL (from the Databricks cluster):
jdbc:databricks://{workspace id}:{workspace port}/default;transportMode=http;ssl=1;httpPath=sql/protocolv1/o/5269192184046240/0404-024850-qo6kj1mb;AuthMech=3;UseNativeQuery=1;ConnCatalog={default catalog};UID={UID};PWD={token}