cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev
Other
17.93k stars 1.78k forks source link

Schema files as ES modules #307

Open hassankhan opened 4 years ago

hassankhan commented 4 years ago

Is your feature request related to a problem? Please describe. Currently schema files seem to have slightly different rules for globals and package names (e.g. cube() and import Funnel from 'Funnel').

Describe the solution you'd like It would be awesome if I could import modules using Node's existing rules, so a schema might look like this instead:

import { cube, Funnels } from '@cubejs-backend/schema-compiler';

cube({
  extends: Funnels.eventFunnel
  ...
});

Hopefully by using the ES6 module-loading mechanism, CubeJS might get to do less work with respect to setting up/parsing of schemas.

Describe alternatives you've considered Not sure there are any, other than to stick with what exists.

Additional context This might also make it easier for TypeScript support in schema files, since there would be no need to parse imports.

paveltiunov commented 4 years ago

@hassankhan ES6-style imports are already supported: https://cube.dev/docs/schema-execution-environment#import-export. There's also separate issue for supporting Typescript in schema files: #69.

hassankhan commented 4 years ago

Hi @paveltiunov, I didn't quite mean ES6-style imports, I meant that schema files seem to not be actual ES6/Node modules. As you mentioned, Typescript support in schema files is an open issue. From a maintainer's perspective (currently), it seems that support for any language would need to be added manually, which seems avoidable if schema files were true Node modules.

Just to illustrate what a schema might look like, I took an example from the docs:

+import { createCube } from '@cubejs-backend/schema-compiler';
+import Products from './Products';
+import Users from './Users';
+
-cube(`Orders`, {
+export default createCube(`Orders`, {
  sql: `select * from orders`,

  joins: {
    Users: {
      type: `belongsTo`,
-     sql: `${Orders}.user_id = ${Users}.id`
+     sql: (self) => `${self}.user_id = ${Users}.id`
    },

    Products: {
      type: `belongsTo`,
-     sql: `${Orders}.product_id = ${Products}.id`
+     sql: (self) => `${self}.product_id = ${Products}.id`
    }
  },

  measures: {
    count: {
      type: `count`,
-     drillMembers: [id, status, Products.name, Users.email]
+     drillMembers: ['id', 'status', `${Products}.name`, `${Users}.email`]
    }
  },

  dimensions: {
    id: {
      type: `number`,
      sql: `id`,
      primaryKey: true,
      shown: true
    },

    status: {
      type: `string`,
      sql: `status`
    }
  }
});

Another example:

+import { createCube } from '@cubejs-backend/schema-compiler';
+
-cube(`Orders`, {
+export default createCube(`Orders`, {
  sql: (self, context) => `SELECT * FROM public.orders WHERE ${context.id.filter('user_id')}`,

  measures: {
    count: {
      type: `count`
    }
  }
});

These would be some fairly significant changes I'd imagine (haven't delved too deep into the schema compilation packages yet 😅). However, this upside of this is that there are no globals and that other source parsing utilities (such as ESLint etc.) won't need special configuration to work.

paveltiunov commented 4 years ago

@hassankhan Gotcha. Yeah. You can consider cube.js schema as separate JavaScript dialect itself because it requires transpiling. It provides some domain specific sugar and we plan to add more over time. However we're open to support multiple cube.js schema dialects such as pure ES6 or TypeScript. If you're looking for ES6 support specifically as your final destination we can use this issue to track progress on it.

phpnode commented 4 years ago

@paveltiunov I just discovered cube.js, and it looks like exactly what i'm looking for, but honestly this design decision around schema definitions is a bit surprising, and it took me some time to figure out where cube() was coming from and how field references etc work. If cube used ordinary esmodules and didn't require a transpilation step you'd get TS support for free and make it easier to work with the rest of the ecosystem e.g. eslint

sandyklark commented 4 years ago

Couldn't agree more with @hassankhan and @phpnode. Great library but the @cubejs-backend/schema-compiler feels unnecessarily overcomplicated.

paveltiunov commented 4 years ago

@phpnode @sandyklark Thanks for your feedback guys! Really appreciate it! As any code base that has a long history many not obvious decisions were led by requirements at development time. For schema compiler it’s mostly ability to change schema at run time, execution of untrusted code and analytics friendly language for non JS users. These features are still major to cube.js which we’re committed to support.

We’re open to work hands on with someone who wants to take over this ES6 and TS dialect implementations. We believe there’s a place for multiple cube.js schema dialects.

@sandyklark would love to hear what exactly feels unnecessarily over complicated in schema compiler. Such feedback helps us to improve cube.js internals.

sandyklark commented 4 years ago

I was perhaps slightly unclear in my previous comment - I don't actually have an issue with the way in which @cubejs-backend/schema-compiler is written but rather the process of compiling the schema in a separate process in general. I do think the @babel/parser and vm stuff you've done is pretty cool though. :+1:

But I do think it's rather complex and probably slightly heavy to run (still need to profile this). It seems it would be so much easier and more beneficial (free typescript support as @phpnode mentioned) to just use actual ES modules.

Also, you mentioned that you wanted it to be an analytics friendly language for non JS users, however the schema dialect is so close to Javascript that I'm not sure there is that much benefit to them. If a separate dialect is a hard requirement, it might be nicer to use a custom file extension, for example schema.cubejs, to avoid any confusion for users.

GuillaumeDesforges commented 3 years ago

I came across this issue while trying to import another package into my schema file.

import * as pgescape from "pg-escape"

cube("MyData", {
   ...

and got

Error: Compile errors:
Syntax Error: 'ImportNamespaceSpecifier' import not supported

and got pretty confused, since I was writing a JS file.

Would it be possible to consider allowing developers to not use the CubeJS transpilation, and use a standard ES6 API?

I agree with @sandyklark points.

moltar commented 3 years ago

Wholeheartedly agree with most comments here!

I was super excited by Cube.js for about a year now, and finally tried it out today. I was excited by the fact that I can tightly integrate it into a project by referencing tables and columns via constants defined elsewhere. I was also excited by the fact that the project was moving to TS and would provide type safety.

I had this mirage of being able to define a cube, in a type safe way, using constants from the rest of the project and never had any breakages again between database and reports :)

Then, like, @GuillaumeDesforges I tried to import some files and got errors and then my mirage was dissolved by this issue 😭

I like the approach @hassankhan provided using a factory function.

benswinburne commented 2 years ago

In the absence of this, is there a current means by which to reference a Cube within a function outside the scope of a cube() i.e. if

const setupMeasures = () => {
  // Here I can define sql which references CUBE i.e.
  //           sql: (CUBE) => `${Events.event} = '${event}'`,
  // but I cannot get access to the Events join here
  // If it were exported by the Events cube I could simply
  // import it at the top of my file and use it in this function.
}

cube(`Creatives`, {
  sql: `SELECT * FROM public.creatives`,

  joins: {
    Events: {
      relationship: `hasMany`,
      sql: `${Events}.creative_id = ${CUBE}.id`,
    },
  },

  ...setupMeasures(),