Closed cvlmonica closed 6 months ago
At this time there is no way to fetch the extended information (extended=True or count_rows=True) from a lakehouse in a different workspace. The function does support lakehouses in other workspaces as long as extended=False and count_rows=False. To get the extended information, simply run the notebook from the workspace which contains the lakehouse.
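The rule in the reply above can be sketched as a small helper that switches off extended and count_rows whenever the target workspace differs from the notebook's own. This is only a sketch: build_table_query, fetch_tables and the current_workspace argument are hypothetical names, not part of fabric_cat_tools. The only library call used is get_lakehouse_tables, with the parameters already shown in this thread.

```python
def build_table_query(workspace=None, lakehouse=None, current_workspace=None,
                      extended=True, count_rows=True):
    """Build keyword arguments for fct.get_lakehouse_tables (hypothetical helper).

    When the target workspace is not the notebook's own workspace,
    extended and count_rows are forced to False, as the reply describes.
    """
    cross_workspace = workspace is not None and workspace != current_workspace
    kwargs = {
        "extended": extended and not cross_workspace,
        "count_rows": count_rows and not cross_workspace,
    }
    if workspace is not None:
        kwargs["workspace"] = workspace
    if lakehouse is not None:
        kwargs["lakehouse"] = lakehouse
    return kwargs


def fetch_tables(kwargs):
    """Run the real call; only works inside a Fabric notebook with the package installed."""
    import fabric_cat_tools as fct
    return fct.get_lakehouse_tables(**kwargs)


# "My Workspace" is a placeholder for the workspace the notebook runs in.
kwargs = build_table_query(
    workspace="Semantic Model Analyser",
    lakehouse="Semantic_lakehouse",
    current_workspace="My Workspace",
)
print(kwargs)
# In a Fabric notebook: tables = fetch_tables(kwargs)
```

Keeping the argument building separate from the call means the rule can be checked outside Fabric, while fetch_tables is the only part that needs the notebook environment.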
Thank you!
I have another question. The guardrails check only covers lakehouse tables (table sizes, parquet files and so on). But Direct Lake semantic models can also be built on top of a warehouse. In that case, how do I check the warehouse tables and their sizes? Any guidance on this?
Direct Lake semantic models can only be based on lakehouse delta tables, not warehouse tables.
But in the documentation it says it can be from both! https://learn.microsoft.com/en-us/fabric/get-started/direct-lake-overview#prerequisites
Lakehouses and warehouses have a default semantic model, which should not be used. Only within lakehouses do you have the ability to create a new semantic model.
But we can create a semantic model on top of the warehouse.
I'm not aware of a way to programmatically identify warehouse tables so I'm not sure how to validate these. I'd recommend using lakehouses to create Direct Lake semantic models as that is the standard method and offers more support.
ok sure! thank you so much for the support!
When using extended = True, it seems that this will ONLY work if the notebook is in the same workspace and the lakehouse parameter also matches the default lakehouse for the notebook.
i.e., if extended = True, there is no benefit to using the lakehouse and workspace parameters...
It works fine if extended = False, regardless of which lakehouse is attached to the notebook.
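One way to live with this behaviour is to try the extended call first and fall back to the basic listing if it fails. The sketch below assumes the failure surfaces as a Python exception; the thread does not say which exception type is raised, so it catches Exception broadly. get_tables_with_fallback and fake_get_tables are hypothetical names, and the stub only imitates the behaviour described above so the example runs outside Fabric.

```python
def get_tables_with_fallback(get_tables, **kwargs):
    """Call get_tables with the given options; retry without extended info on failure.

    Returns (result, used_extended). get_tables is any callable that accepts
    the same keyword arguments as fct.get_lakehouse_tables.
    """
    wants_extended = bool(kwargs.get("extended") or kwargs.get("count_rows"))
    if wants_extended:
        try:
            return get_tables(**kwargs), True
        except Exception:
            # Assumption: the extended call raises when the lakehouse is not
            # the notebook's default lakehouse. Fall through to the basic call.
            pass
    basic = dict(kwargs, extended=False, count_rows=False)
    return get_tables(**basic), False


def fake_get_tables(lakehouse=None, extended=False, count_rows=False):
    """Stand-in for the real function: extended mode only works on 'test_lakehouse'."""
    if (extended or count_rows) and lakehouse != "test_lakehouse":
        raise ValueError("table not found in the mounted lakehouse")
    return {"lakehouse": lakehouse, "extended": extended}


result, used_extended = get_tables_with_fallback(
    fake_get_tables, lakehouse="Semantic_lakehouse", extended=True, count_rows=True
)
print(result, used_extended)
```

In a Fabric notebook, fct.get_lakehouse_tables would be passed in place of the stub.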
See my earlier comment in this thread. https://github.com/m-kovalsky/fabric_cat_tools/issues/13#issuecomment-2115019445
I am trying to list the tables in Microsoft Fabric lakehouses, along with the information relevant to the Direct Lake guardrails, using get_lakehouse_tables(). I have mounted a lakehouse named test_lakehouse in the notebook to capture the metadata of all the other lakehouses, installed the package, and run the code below:
import fabric_cat_tools as fct
fct.get_lakehouse_tables(extended = True, count_rows = True)
The above code is running without any problem, and I get to see the output as well.
But the problem starts when I add the lakehouse and workspace parameters to point at a lakehouse in another workspace. The code then looks like this:
import fabric_cat_tools as fct
fct.get_lakehouse_tables(workspace = 'Semantic Model Analyser', lakehouse = 'Semantic_lakehouse', extended = True, count_rows = True)
When I run this I get an error saying that the tables from the other lakehouse are not present in the mounted lakehouse, test_lakehouse.
How can I fetch this information for a lakehouse in another workspace? Please advise.