cube-js / cube

📊 Cube — Universal semantic layer platform for AI, BI, spreadsheets, and embedded analytics
https://cube.dev
Other
17.97k stars 1.78k forks source link

How to join same table twice ? #3692

Open wenerme opened 3 years ago

wenerme commented 3 years ago

Problem

Want to join same table twice, but the key must be the cube name

Related Cube.js schema

// how to avoid this ?
cube(`CalleeUsers`, {
  extends: Users,
});

cube(`PhoneCallEvents`, {
  sql: `SELECT * FROM public.phone_call_events`,
  joins: {
    // want to use the key as CallerUser, but CallerUser cube not exists
    Users: {
     toCude: Users, // maybe support specify the target cube in here ?
      relationship: `belongsTo`,
      sql: `${PhoneCallEvents}.caller_object_id = ${Users}.id`,
    },
    // above cube make this work
    CalleeUsers: {
      relationship: `belongsTo`,
      sql: `${PhoneCallEvents}.callee_object_id = ${CalleeUsers}.id`,
    },
  },
}
)

New to cubejs, don't know if there is a way to handle join same table twice.

ivan-vdovin commented 2 years ago

Hi, @wenerme ! Actually, what you are looking for looks like one of our examples. Try to add CalleeUsers and CallerUser with your conditions instead of ones from example. Take a look there and let me know if it helps.

wenerme commented 2 years ago

Hi, @ivan-vdovin

Join table twice I mean

create table phone_call_events (
  caller_object_id bigint,
  callee_object_id bigint,
  created_by_user_id bigint -- maybe join more users
);

Both of caller_object_id, callee_object_id are reference the user, no polymorphic and condition here, both id maybe the same.

ivan-vdovin commented 2 years ago

@wenerme You should create two cubes from the same one to be able to do that. Here is an example:

Screenshot 2021-12-06 at 14 13 56 Screenshot 2021-12-06 at 14 11 52

Let me know if that helped.

wenerme commented 2 years ago

@ivan-vdovin Thanks, this is what I do

cube(`PhoneCallEventCalleeUsers`, {
  title: '被叫员工',
  extends: Users,
  preAggregations: {},
});
cube(`PhoneCallEventPhoneCallEvent`, {
  title: '主叫员工',
  extends: Users,
  preAggregations: {},
});
cube(`PhoneCallEvents`, {
  sql: `SELECT * FROM phone_call_events`,
  preAggregations: {
  },
  joins: {
    PhoneCallEventPhoneCallEvent: {
      relationship: `belongsTo`,
      sql: `${PhoneCallEvents}.caller_object_id = ${PhoneCallEventPhoneCallEvent}.id`,
    },
    PhoneCallEventCalleeUsers: {
      relationship: `belongsTo`,
      sql: `${PhoneCallEvents}.callee_object_id = ${PhoneCallEventCalleeUsers}.id`,
    },
  },
});

But looking for a better solution, because one table may join same table four or five times, hope support something like this

cube(`PhoneCallEvents`, {
  sql: `SELECT * FROM public.phone_call_events`,
  joins: {
    CreatedByUser: {
      toCude: Users, // tell cubejs the target, do not infer from join name
      relationship: `belongsTo`,
      sql: `${PhoneCallEvents}.caller_object_id = ${Users}.id`,// or detect Users from here
    },
  },
}
)
igorlukanin commented 2 years ago

@wenerme It's a really interesting and clever idea to tell Cube the joining target with an option, thank you! However, I'm not sure if this will allow simplifying the code because you'll still need several join definitions, won't you?

Also, were you able to achieve what you'd like to? Is the complexity of the data schema code your only concern now?

wenerme commented 2 years ago

@igorlukanin Yes, the workaround is working, but

cube(`Users`, {});

cube(`CreatedByUsers`, {
  extends: Users,
});
cube(`UpdatedByUsers`, {
  extends: Users,
});
cube(`DeletedByUsers`, {
  extends: Users,
});
cube(`OwningUsers`, {
  extends: Users,
});

cube(`PhoneCallActivityCalleeUsers`, {
  extends: Users,
});
cube(`PhoneCallActivityCallerUsers`, {
  extends: Users,
});

cube(`PhoneCallActivity`, {
  sql: `SELECT * FROM accounts`,
  joins: {
    CreatedByUsers: {
      relationship: `belongsTo`,
      sql: `${Account}.created_by_id = ${CreatedByUsers}.id`,
    },
    UpdatedByUsers: {
      relationship: `belongsTo`,
      sql: `${Account}.updated_by_id = ${UpdatedByUsers}.id`,
    },
    DeletedByUsers: {
      relationship: `belongsTo`,
      sql: `${Account}.deleted_by_id = ${DeletedByUsers}.id`,
    },
    OwningUsers: {
      relationship: `belongsTo`,
      sql: `${Account}.owning_user_id = ${OwningUsers}.id`,
    },

    PhoneCallActivityCalleeUsers: {
      relationship: `belongsTo`,
      sql: `${Account}.callee_user_id = ${PhoneCallActivityCalleeUsers}.id`,
    },
    PhoneCallActivityCallerUsers: {
      relationship: `belongsTo`,
      sql: `${Account}.caller_user_id = ${PhoneCallActivityCallerUsers}.id`,
    },
  },
});

vs.

cube(`Users`, {});

cube(`PhoneCallActivity`, {
  sql: `SELECT * FROM accounts`,
  joins: {
    CreatedByUsers: {
      toCube: Users,
      relationship: `belongsTo`,
      sql: `${Account}.created_by_id = ${Users}.id`,
    },
    UpdatedByUsers: {
      toCube: Users,
      relationship: `belongsTo`,
      sql: `${Account}.updated_by_id = ${Users}.id`,
    },
    DeletedByUsers: {
      toCube: Users,
      relationship: `belongsTo`,
      sql: `${Account}.deleted_by_id = ${Users}.id`,
    },
    OwningUsers: {
      toCube: Users,
      relationship: `belongsTo`,
      sql: `${Account}.owning_user_id = ${Users}.id`,
    },

    PhoneCallActivityCalleeUsers: {
      toCube: Users,
      relationship: `belongsTo`,
      sql: `${Account}.callee_user_id = ${Users}.id`,
    },
    PhoneCallActivityCallerUsers: {
      toCube: Users,
      relationship: `belongsTo`,
      sql: `${Account}.caller_user_id = ${Users}.id`,
    },
  },
});

This is only for one entity, but there is more, different entity different relationship, cube like PhoneCallActivityCallerUsers, PhoneCallActivityCalleeUsers will not be reused.

I'm not sure how cube rollup is stored, but using more cube may generate duplicate cache, waste storage.

ivan-vdovin commented 2 years ago

@wenerme Thank you for a detailed reply! Actually, pre-aggregations don't depend on the number of cubes but on preaggregation params. As for the approach from your example, it really may help to reduce the amount of code. I've created a ticket in our backlog so the dev team will take a look and think of adding some kind of syntactic sugar for that.

github-actions[bot] commented 2 years ago

If you are interested in working on this issue, please leave a comment below and we will be happy to assign the issue to you. If this is the first time you are contributing a Pull Request to Cube.js, please check our contribution guidelines. You can also post any questions while contributing in the #contributors channel in the Cube.js Slack.