cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev

Connection to Databricks using Cube.py #8436

Open GitMedic opened 2 months ago

GitMedic commented 2 months ago

I am using the code below to test multitenancy in Cube Cloud. I have two catalogs, and I pass the catalog name as customer_group_code.

My cube.py settings are:

```python
import os
from cube import config

@config('driver_factory')
def driver_factory(ctx: dict):
    try:
        customer_group_code = ctx['securityContext']['customer_group_code']
    except KeyError:
        raise ValueError('No customer_group_code found in Security Context!')

    # The catalog name is taken directly from the security context
    catalog = customer_group_code
    databricks_token = os.getenv('CUBEJS_DB_DATABRICKS_TOKEN')
    jdbc_url = os.getenv('CUBEJS_DB_DATABRICKS_URL')

    if not databricks_token or not jdbc_url:
        raise ValueError('Databricks token or URL not found in environment variables!')

    return {
        "type": "databricks-jdbc",
        "url": jdbc_url,
        "catalog": catalog
    }
```

Please help me with this issue.

GitMedic commented 2 months ago

Also, for a dynamic table name, can I pass it like this?

`cubes:

igorlukanin commented 2 months ago

Hi @GitMedic 👋

What is not working for you, exactly? I don't see a description of that or an error message in the issue text.

Also, I see that you have to define context_to_orchestrator_id in order for driver_factory to work correctly: https://cube.dev/docs/reference/configuration/config#context_to_orchestrator_id
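
To illustrate why this matters, here is a small, self-contained sketch. It assumes a simplified model in which one driver configuration is created and cached per orchestrator ID. `DriverCache`, `shared_id`, and `per_tenant_id` are hypothetical names used only for illustration and are not part of Cube's API or a description of its real internals.

```python
from typing import Callable

Context = dict


class DriverCache:
    """Toy model: one driver config is created and cached per orchestrator ID.

    Hypothetical stand-in for illustration only, not Cube's real internals.
    """

    def __init__(
        self,
        orchestrator_id: Callable[[Context], str],
        driver_factory: Callable[[Context], dict],
    ) -> None:
        self._orchestrator_id = orchestrator_id
        self._driver_factory = driver_factory
        self._drivers: dict[str, dict] = {}

    def driver_for(self, ctx: Context) -> dict:
        key = self._orchestrator_id(ctx)
        if key not in self._drivers:
            # The factory only runs the first time a given key is seen.
            self._drivers[key] = self._driver_factory(ctx)
        return self._drivers[key]


def driver_factory(ctx: Context) -> dict:
    return {
        "type": "databricks-jdbc",
        "catalog": ctx["securityContext"]["customer_group_code"],
    }


def shared_id(ctx: Context) -> str:
    # Same ID for every tenant: all tenants end up sharing one driver.
    return "CUBE_APP"


def per_tenant_id(ctx: Context) -> str:
    # Distinct ID per tenant: each tenant gets its own driver.
    return f"CUBE_APP_{ctx['securityContext']['customer_group_code']}"


tenant_a = {"securityContext": {"customer_group_code": "catalog_a"}}
tenant_b = {"securityContext": {"customer_group_code": "catalog_b"}}

shared = DriverCache(shared_id, driver_factory)
shared.driver_for(tenant_a)
stale = shared.driver_for(tenant_b)  # reuses the driver built for tenant A

isolated = DriverCache(per_tenant_id, driver_factory)
isolated.driver_for(tenant_a)
fresh = isolated.driver_for(tenant_b)  # built from tenant B's context

print(stale["catalog"])  # catalog_a
print(fresh["catalog"])  # catalog_b
```

In this toy model, the second tenant only gets its own catalog when the ID function returns a different value per tenant.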

GitMedic commented 2 months ago

Hello, this is my code for cube.py:

```python
import os
from cube import config

@config('scheduled_refresh_contexts')
def scheduled_refresh_contexts() -> list[dict]:
    return [
        {
            'securityContext': {
                'tenant_id': "test_1",
                'bucket': 'demo'
            }
        }
    ]

@config('pre_aggregations_schema')
def pre_aggregations_schema(ctx: dict) -> str:
    try:
        return ctx['securityContext']['tenant_id']
    except KeyError:
        raise ValueError('APP_ID_ERROR:No tenant_id found in Security Context!')

@config('context_to_app_id')
def context_to_app_id(ctx: dict) -> str:
    try:
        return f"CUBE_APP_{ctx['securityContext']['tenant_id']}"
    except KeyError:
        raise ValueError('APP_ID_ERROR:No tenant_id found in Security Context!')

@config('context_to_orchestrator_id')
def context_to_orchestrator_id(ctx: dict) -> str:
    try:
        return f"CUBE_APP_{ctx['securityContext']['tenant_id']}"
    except KeyError:
        raise ValueError('ORCHESTRATOR_ID_ERROR:No tenant_id found in Security Context!')

@config('driver_factory')
def driver_factory(ctx: dict):
    try:
        tenant_id = ctx['securityContext']['tenant_id']
    except KeyError:
        raise ValueError('No tenant_id found in Security Context!')

    catalog = tenant_id
    databricks_token = os.getenv('CUBEJS_DB_DATABRICKS_TOKEN')
    jdbc_url = os.getenv('CUBEJS_DB_DATABRICKS_URL')

    if not databricks_token or not jdbc_url:
        raise ValueError('Databricks token or URL not found in environment variables!')

    return {
        "type": "databricks-jdbc",
        "database": catalog,
        "catalog": catalog
    }

class CubeConfig:
    def __init__(self, security_context: dict):
        self.security_context = security_context
        self.validate_security_context()

    def validate_security_context(self):
        if 'tenant_id' not in self.security_context:
            raise ValueError('No Customer Group Code found in Security Context!')

# Example security context for testing
if __name__ == "__main__":
    security_context = {'tenant_id': 'silo_dev_mk'}
    cube_config = CubeConfig(security_context)
    print(f"App ID: {context_to_app_id({'securityContext': security_context})}")
    print(f"Scheduled Refresh Contexts: {scheduled_refresh_contexts()}")
```

And here is the code for my data model:

```yaml
cubes:
  - name: report1
    sql_table: "test1.dev.report1"
    joins: []

    dimensions:
      - name: id
        sql: "{CUBE}.`#`"
        type: number
        title: "#"
        primary_key: true
        shown: true
```

So the problem is this: the security context is passed via the token, but I am not able to switch to a different catalog in Databricks when using the driver_factory setting. Can you help me achieve this? When I remove the catalog name "test1" from sql_table, I get the error: table or view not found "dev.report1".
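
To make the intended behavior concrete, here is a minimal sketch of deriving a fully qualified table name from the tenant in the security context instead of hard-coding "test1". The helper `tenant_table` is hypothetical and not part of Cube. How its result would be wired into the data model (for example through templating) is not shown here and is an assumption left open.

```python
import re

# Conservative pattern for catalog, schema, and table names (an assumption;
# adjust it if your identifiers legitimately contain other characters).
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def tenant_table(ctx: dict, schema: str, table: str) -> str:
    """Build '<tenant_id>.<schema>.<table>' from the security context."""
    try:
        tenant_id = ctx["securityContext"]["tenant_id"]
    except KeyError:
        raise ValueError("No tenant_id found in Security Context!")

    for part in (tenant_id, schema, table):
        # Reject anything that is not a plain identifier, since the value
        # ends up inside a SQL table reference.
        if not isinstance(part, str) or not _IDENTIFIER.fullmatch(part):
            raise ValueError(f"Unsafe identifier: {part!r}")

    return f"{tenant_id}.{schema}.{table}"


example = tenant_table({"securityContext": {"tenant_id": "test1"}}, "dev", "report1")
print(example)  # test1.dev.report1
```

Validating the tenant value before building the name matters because it comes from the token and is interpolated into SQL.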