cube-js / cube

📊 Cube — The Semantic Layer for Building Data Applications
https://cube.dev

Schema Compiler - Column names in Hebrew: support schema generation for non ASCII table names #5174

Open davidfrisch opened 2 years ago

davidfrisch commented 2 years ago

Describe the bug When I generate my cube with the schema compiler, column names written in Hebrew letters are stripped out and a dimension with an empty key is generated.

To Reproduce Steps to reproduce the behavior:

  1. Generate a schema with the schema compiler in localhost:4000/playground with Hebrew column names.
  2. You'll notice that the generated dimension keys that are supposed to contain Hebrew letters are empty.
cube(`TableNameWithHebrewColumnName`, {
  sql: `SELECT * FROM "TableName"."TableNameWithHebrewColumnName"`,

  preAggregations: {
    // Pre-Aggregations definitions go here
    // Learn more here: https://cube.dev/docs/caching/pre-aggregations/getting-started  
  },

  joins: {

  },

  dimensions: {
    formNumber: {
      sql: `${CUBE}."Form_Number"`,
      type: `number`
    },

    : {
      sql: `${CUBE}.`,
      type: `string`,
    },
  },

  dataSource: `default`
});

Expected behavior A dimension whose key and SQL reference keep the Hebrew column name.

expected:
  "גיל ": {
    sql: `${CUBE}.גיל `,
    type: `string`,
  }

Version: @cubejs-backend/schema-compiler : 0.30.45

paveltiunov commented 2 years ago

Hey @davidfrisch ! Thanks for posting it! Cube Schema doesn't allow non-ASCII characters in member names, so the schema generator tries to convert them to ASCII characters. What do you think would be the best way to convert those to ASCII characters?
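
One possible answer to the question above, as a minimal sketch only: escape each non-ASCII character as its Unicode code point so the result stays ASCII and is never empty. Both `toAsciiMemberName` and `stripNonAscii` are hypothetical helpers written for this illustration; they are not part of `@cubejs-backend/schema-compiler`, and the stripping function only approximates the behavior reported in this issue.

```javascript
// Hypothetical helper, NOT part of @cubejs-backend/schema-compiler.
// Converts a column name to an ASCII-only member name and never returns ''
// for a name that contains at least one non-space character.
function toAsciiMemberName(columnName) {
  const trimmed = columnName.trim();
  // Keep ASCII letters, digits and underscores; escape anything else as _uXXXX.
  const escaped = Array.from(trimmed)
    .map((ch) =>
      /[A-Za-z0-9_]/.test(ch)
        ? ch
        : `_u${ch.codePointAt(0).toString(16).padStart(4, '0')}`
    )
    .join('');
  // Avoid a leading digit, which is not a valid start for a JS-style identifier.
  return /^[0-9]/.test(escaped) ? `_${escaped}` : escaped;
}

// Naive ASCII-only stripping, shown for contrast (an assumption about what
// goes wrong, not the generator's actual code): a Hebrew-only name becomes ''.
function stripNonAscii(columnName) {
  return columnName.replace(/[^A-Za-z0-9_]/g, '');
}
```

The escaped names are not pleasant to read, so a real fix would probably pair something like this with a human-readable `title` on the dimension. Unlike the current generator, this sketch also does not camel-case ASCII names such as `Form_Number`.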

github-actions[bot] commented 2 years ago

If you are interested in working on this issue, please leave a comment below and we will be happy to assign the issue to you. If this is the first time you are contributing a Pull Request to Cube.js, please check our contribution guidelines. You can also post any questions while contributing in the #contributors channel in the Cube.js Slack.

davidfrisch commented 2 years ago

Thank you for your reply.

I believe that restricting member names to ASCII characters shuts out languages written in other scripts, such as Hebrew, Russian, or Japanese.

Perhaps moving to Unicode would be a solution. What do you think about it?

I was considering adding a parameter in the cube.js file that would specify, for each DB source, whether its names use Unicode.

return new POSTGRESQL({
  server: "localhost",
  database: "my_db",
  user: "postgres",
  password: "postgres",
  dbIsUnicode: true, // proposed new option
  port: 5432,
})
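
For illustration, here is a rough sketch of what the Unicode route suggested above could look like on the generator side, assuming member names were allowed to contain any Unicode letter. `isUnicodeMemberName`, `quoteColumn`, and `buildDimension` are hypothetical names invented for this example; Cube does not expose them, and the current schema compiler would still reject the resulting key.

```javascript
// Hypothetical sketch, NOT Cube's actual API.
// A member name is accepted if it starts with a Unicode letter or underscore
// and continues with Unicode letters, digits, or underscores.
const UNICODE_MEMBER_NAME = /^[\p{L}_][\p{L}\p{N}_]*$/u;

function isUnicodeMemberName(name) {
  return UNICODE_MEMBER_NAME.test(name);
}

// Quote a column name as a SQL identifier, doubling any embedded double quotes.
function quoteColumn(columnName) {
  return `"${columnName.replace(/"/g, '""')}"`;
}

// Build a dimension entry that keeps the original (possibly non-ASCII) name.
function buildDimension(columnName) {
  const key = columnName.trim(); // the key drops surrounding spaces
  if (!isUnicodeMemberName(key)) {
    throw new Error(`Cannot derive a member name from column: ${columnName}`);
  }
  return {
    key,
    // The SQL reference keeps the exact column name, including trailing spaces.
    sql: `\${CUBE}.${quoteColumn(columnName)}`,
    type: 'string',
  };
}
```

Quoting the column in the SQL reference matters here because the column name in the report ends with a space, which an unquoted identifier would lose.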