m-kovalsky / fabric_cat_tools

Supercharge your Microsoft Fabric development with the fabric_cat_tools library
MIT License

ERROR: while trying to run get_lakehouse_tables() #13

Closed cvlmonica closed 1 month ago

cvlmonica commented 1 month ago

I am trying to list the tables in a Microsoft Fabric lakehouse, along with the information relevant to Direct Lake guardrails, using get_lakehouse_tables(). I installed the package, mounted a lakehouse named test_lakehouse to capture the metadata of all other lakehouses, and ran the code below:

import fabric_cat_tools as fct
fct.get_lakehouse_tables(extended=True, count_rows=True)

The above code is running without any problem, and I get to see the output as well.

However, it fails when I add the workspace and lakehouse parameters to point at a lakehouse in another workspace. The code then looks like this:

import fabric_cat_tools as fct
fct.get_lakehouse_tables(
    workspace='Semantic Model Analyser',
    lakehouse='Semantic_lakehouse',
    extended=True,
    count_rows=True,
)

When I run this, I get an error saying that the tables in the other lakehouse are not present in the mounted lakehouse, test_lakehouse.

How can I fetch this information for a lakehouse in another workspace? Any guidance would be appreciated.

m-kovalsky commented 1 month ago

At this time there is no way to fetch the extended information or row counts from a lakehouse in a different workspace. This function supports lakehouses in other workspaces only as long as extended=False and count_rows=False. To get the full output, simply run the notebook from the workspace which contains the lakehouse.
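The supported cross-workspace call described above can be sketched as follows. This is a minimal illustration, not part of fabric_cat_tools: list_remote_lakehouse_tables is a hypothetical wrapper, and the function to call is passed in so the sketch can be read and run outside a Fabric notebook. Only the workspace, lakehouse, extended, and count_rows parameters already shown in this thread are used.

```python
def list_remote_lakehouse_tables(get_tables, workspace, lakehouse):
    """Call a get_lakehouse_tables-style function for a lakehouse in another workspace.

    Hypothetical wrapper (an assumption for illustration). It always turns off
    extended and count_rows, since the thread says those only work from the
    workspace that contains the lakehouse.
    """
    return get_tables(
        workspace=workspace,
        lakehouse=lakehouse,
        extended=False,
        count_rows=False,
    )


# In a Fabric notebook with the library installed, usage would look like:
# import fabric_cat_tools as fct
# list_remote_lakehouse_tables(
#     fct.get_lakehouse_tables, 'Semantic Model Analyser', 'Semantic_lakehouse'
# )
```

For the extended columns and row counts, the notebook still has to be run from the workspace that contains the lakehouse.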

cvlmonica commented 1 month ago

Thank you!

cvlmonica commented 1 month ago

I have another question. The guardrail information covers only lakehouse tables (sizes, Parquet files, and so on). But Direct Lake semantic models can also be built on top of a warehouse. In that case, how do I check the warehouse tables, their sizes, and so on? Any guidance on this?

m-kovalsky commented 1 month ago

Direct Lake semantic models can only be based on lakehouse delta tables, not warehouse tables.

cvlmonica commented 1 month ago

But the documentation says they can be based on both: https://learn.microsoft.com/en-us/fabric/get-started/direct-lake-overview#prerequisites

m-kovalsky commented 1 month ago

Lakehouses and warehouses have a default semantic model, which should not be used. Only within lakehouses do you have the ability to create a new semantic model.

cvlmonica commented 1 month ago

But we can create a semantic model on top of the warehouse.

m-kovalsky commented 1 month ago

I'm not aware of a way to programmatically identify warehouse tables so I'm not sure how to validate these. I'd recommend using lakehouses to create Direct Lake semantic models as that is the standard method and offers more support.

cvlmonica commented 1 month ago

OK, sure! Thank you so much for the support!

laweidner commented 1 month ago

When using extended=True, it seems that this will only work if the notebook is in the same workspace as the lakehouse and the lakehouse parameter also matches the notebook's default lakehouse.

I.e., if extended=True, there is no benefit to using the lakehouse and workspace parameters.

It works fine if extended=False, regardless of the notebook-to-lakehouse association.
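The condition described in this comment can be written down as a small check. This is a sketch under stated assumptions, not library behavior: can_use_extended is a hypothetical helper, "Notebook Workspace" is a placeholder name for the workspace the notebook runs in, and it is assumed that an omitted lakehouse or workspace parameter (None) resolves to the notebook's own default lakehouse and workspace.

```python
from typing import Optional


def can_use_extended(
    lakehouse: Optional[str],
    workspace: Optional[str],
    default_lakehouse: str,
    notebook_workspace: str,
) -> bool:
    """Return True only when the target is the notebook's default lakehouse
    in the notebook's own workspace.

    Hypothetical helper (an assumption for illustration). None is treated as
    "use the notebook's default", which is also an assumption.
    """
    targets_default_lakehouse = lakehouse is None or lakehouse == default_lakehouse
    targets_notebook_workspace = workspace is None or workspace == notebook_workspace
    return targets_default_lakehouse and targets_notebook_workspace


# The failing call from this issue: a different lakehouse in a different workspace.
use_extended = can_use_extended(
    lakehouse="Semantic_lakehouse",
    workspace="Semantic Model Analyser",
    default_lakehouse="test_lakehouse",
    notebook_workspace="Notebook Workspace",  # placeholder, not named in the thread
)
print(use_extended)  # False
```

The result could be passed as both extended and count_rows, so the call degrades to the basic table list instead of raising an error.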

m-kovalsky commented 1 month ago

See my earlier comment in this thread. https://github.com/m-kovalsky/fabric_cat_tools/issues/13#issuecomment-2115019445