Closed: hmayer1980 closed this 6 months ago
Having worked on this for a while, I increasingly think that the "Vertipaq Analyzer" results should be self-contained within the Power BI engine and should not involve Spark at all. I wanted to get the memory footprint of all models on our capacity, but the models are not fully functional. DAX Studio does not fail if a model column references a column that does not exist in the Lakehouse / Warehouse. It outputs the memory used by the model, that is, what is actually loaded. I, on the other hand, now just find errors in the models, because those columns were never used but are still in the model, and the Spark queries fail because of that. Validation is good, but not in the Vertipaq Analyzer function; there I want to get the memory footprint as it is.
This was actually on my to-do list already. A new parameter called 'read_stats_from_data' will be added to the vertipaq_analyzer function. It will default to False. Setting it to True will use Spark to query the lakehouse (for Direct Lake models) or use DAX (for non-Direct Lake models) to obtain values for Column Cardinality and Missing Rows.
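To illustrate the opt-in behaviour described above, here is a minimal, self-contained sketch. It is not the library's implementation: `collect_column_stats`, its input format, and the `query_cardinality` callback are hypothetical names made up for this example. Only the flag name `read_stats_from_data` and its default of False come from the comment above.

```python
from typing import Callable, Dict, List, Optional


def collect_column_stats(
    columns: List[Dict],
    read_stats_from_data: bool = False,
    query_cardinality: Optional[Callable[[str], int]] = None,
) -> List[Dict]:
    """Hypothetical stand-in showing how an opt-in flag can gate expensive queries.

    `columns` holds engine-side metadata (name and in-memory size) that is
    assumed to be available without touching the underlying data.
    """
    results = []
    for col in columns:
        row = {
            "column": col["name"],
            "size_bytes": col["size_bytes"],
            # Stays None unless the caller explicitly asks for data statistics.
            "cardinality": None,
        }
        if read_stats_from_data:
            if query_cardinality is None:
                raise ValueError(
                    "read_stats_from_data=True requires a query_cardinality callback"
                )
            # Only this branch touches the data (Spark or DAX in the real tool).
            row["cardinality"] = query_cardinality(col["name"])
        results.append(row)
    return results
```

With the default, the callback is never invoked, so a column that exists in the model but is missing from the lakehouse cannot make the memory report fail. The data queries, and any errors they raise, only happen when the flag is set to True.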
Added to 0.3.2
Is your feature request related to a problem? Please describe. When you run Vertipaq Analyzer, DAX Studio supports a setting called "Read statistics from data" in the options, which can be used to prevent reading actual data from DirectQuery or Direct Lake models. As noted in the function signature, it is necessary not to query the model, otherwise all columns would be loaded into memory. But querying with Spark can also introduce a very long runtime and cost when those queries are run on the lakehouse. I have to wait 20+ minutes, while I am actually only interested in the current memory usage of the model (and not so much in columns that are not loaded).
Describe the solution you'd like Please introduce a new parameter or a generic configuration option to disable the querying of Vertipaq Analyzer column statistics via Spark.
Describe alternatives you've considered An alternative would be to simply not query the data at all, as DAX Studio allows.
Additional context See problem description