Closed fabrice-etanchaud closed 3 years ago
Hi, @fabrice-etanchaud , You can run the following sql in dremio to get the schema lists:
select distinct "TABLE_SCHEMA" from "INFORMATION_SCHEMA"."TABLES"
I think any one can work, you can try using the built-in one: "sys" or "INFORMATION_SCHEMA"
And, I think this required connection field is not needed at all, since Metabase will still show out all schemas no matter what you entered here.
I think I can remove this "database" field tomorrow
Hello @Baoqi , thank you for providing us with a new dremio driver ! Please, could you tell me what to put in the "database" required connection field ? I tried with several space names, with no luck.
Looking forward, Fabrice
I just tried to input a wrong "database" field ( just some random text), but my Metabase environment can work correctly (even with the wrong value).
So, this issue may due to other problems , maybe you can put your dremio's version & metabase's version here, and also the full error messages, so that I can try to guess where maybe wrong
What I noticed is that the database field is used only at connection time. Afterwards, you can select from any tables across different databases. Notice that in Dremio terminology, database is Data lakes. So just define a dummy one, use it to connect Metabase to Dremio, and then you will have access to mostly everything. Assuming a single user setup.
Hi all ! This is great ! In my use case, there are a dozen relational datasources with dozen of schemas and hundreds of tables/views. So the catalog is breaking, I cannot update column types and dependencies... Given that dremio is a federation tool, possibly connected to a huge number of sources, do you think it could be possible to restrict metabase access to objects under the connection "database" ?
Hi, all,
Based on previous discussion, I make a new release today. https://github.com/Baoqi/metabase-dremio-driver/releases/tag/1.0.1
The changes are:
If we specify one schema for "schema", then Dremio will only try to discover that specified schema, it will be much faster in my test.
Hope this can work in your environment, and the change from "dbname" to "schema" won't break existing Dremio connections.
Thanks.
Thank you @Baoqi , yes this is much faster !
One thing : it seems the filter is an exact match, doesn't it ? Don't you think it would be valuable to have a LIKE 'schema%' instead, in order to catalog all the vds/pds in subfolders ?
Thank you again !
Thank you @Baoqi , yes this is much faster !
One thing : it seems the filter is an exact match, doesn't it ? Don't you think it would be valuable to have a LIKE 'schema%' instead, in order to catalog all the vds/pds in subfolders ?
Thank you again !
Hi, @fabrice-etanchaud , You're Welcome!
Yes, that's a good suggestion. I have 2 small questions about this change:
select distinct "TABLE_SCHEMA" from "INFORMATION_SCHEMA"."TABLES" where "TABLE_SCHEMA" in ('schema1', 'schema2') or "TABLE_SCHEMA" like 'schema3%'
we can execute this dremio SQL, and get a list of all matched schema names. By this, we can select any schema as we like, but this may be too hard to use for users, so this may not be needed?
Thanks
I really appreciate your responsiveness !
You are right, given that dremio is a federation tool, there must be a way to configure the list of schemas of interest, in ordre to calm metabase. For the moment, I would be more than happy to have recursively all datasets under a given dremio path ! IMHO Metabase would be plugged in the business marts layer, so one can suppose all related data to be found recursively under a specific dremio path (facts, dimensions and the like). And nothing prevent us from creating multiple isolated metabase connections on different paths. What do you think of it ?
@Baoqi , did you have time to have a look at the dbt-dremio project ? Would you find it valuable ?
I really appreciate your responsiveness !
You are right, given that dremio is a federation tool, there must be a way to configure the list of schemas of interest, in ordre to calm metabase. For the moment, I would be more than happy to have recursively all datasets under a given dremio path ! IMHO Metabase would be plugged in the business marts layer, so one can suppose all related data to be found recursively under a specific dremio path (facts, dimensions and the like). And nothing prevent us from creating multiple isolated metabase connections on different paths. What do you think of it ?
@Baoqi , did you have time to have a look at the dbt-dremio project ? Would you find it valuable ?
Hi, @fabrice-etanchaud , I just make a new small release 1.0.2. https://github.com/Baoqi/metabase-dremio-driver/releases/tag/1.0.2 , which add a support to make "schema" field works as prefix. I did some basic tests on my local dev env (which only have small number of tables), it worked fine.
Yes, I'm very interested in dbt related projects, (although I haven't got time to carefully research dbt project yet), I think dbt-dremio should be very helpful, I will keep an eye on your project. Thanks.
Just to tell you that it works like a charm !
Now nothing can stop us using metabase query builder on a mixture of nosql, rdbms and json, csv or excel files !
Thank you, have a good week end !
Just to tell you that it works like a charm !
Now nothing can stop us using metabase query builder on a mixture of nosql, rdbms and json, csv or excel files !
Thank you, have a good week end !
Thanks. Glad it works.
Now, since it works, let's close this issue.
Hello @Baoqi , thank you for providing us with a new dremio driver ! Please, could you tell me what to put in the "database" required connection field ? I tried with several space names, with no luck.
Looking forward, Fabrice