sql-bi / VertiPaq-Analyzer

MIT License
76 stars 13 forks source link

Question: Column Hierarchies filtered to segment_number = 0 #108

Open m-kovalsky opened 10 months ago

m-kovalsky commented 10 months ago

Why does the logic for column hierarchies filter to segment_number = 0? (see from line 830 in the code below)

https://github.com/sql-bi/VertiPaq-Analyzer/blob/a9df144290a47f56867457bde40c80bf9f5bfcef/src/Dax.Model.Extractor/DmvExtractor.cs#L839

marcosqlbi commented 10 months ago

The sad reality is that I don't remember. I guess that segment 1 (which is the only one that appears) is not significant most of the time; this way, we didn't have to aggregate rows with a minimal impact on accuracy. But it's my guess reviewing the code after so many years.

m-kovalsky commented 10 months ago

Thanks for the reply, Marco. Yes, I noticed that segment 0 contains a relatively high value and segment 1 contains a small value. Would it be more accurate to do a sum of these values? I wouldn't think it would be too much of a burden on performance.

marcosqlbi commented 10 months ago

Does it really matter? :) I'm not even sure that the value displayed is meaningful. If you can ask Akshai, he might know whether it is worth the effort (in the long term, we might also have to change the implementation by using the new INFO functions in DAX...)

m-kovalsky commented 10 months ago

Not sure how much it matters. Just was looking at the code and thought it was odd that it had that filter. Hmm, interesting about the new info functions. It would be great to have dax.guide updated with syntax examples for these functions. For instance, how to select specific columns and place filters on them (and of course the basic 'SELECT *' code below).

EVALUATE INFO.TABLES()

marcosqlbi commented 10 months ago

The problem is always the time! Are you willing to contribute? Send me examples by email, I'll include them!

m-kovalsky commented 10 months ago

At this point, it wouldn't be possible to replace DMVs with INFO functions for vertipaq analyzer as the INFO functions is missing the DISCOVER_STORAGE_TABLE_COLUMN_SEGMENTS DMV with necessary info (at least I haven't seen it). Not sure if this is part of a future plan. Also, I'm not sure what the advantage is for using the INFO functions as they extract the same data as the DMVs. Perhaps one benefit would be using COLUMNSTATISTICS() as it has useful info with no parsing needed.

marcosqlbi commented 10 months ago

I guess that the reason is permissions. It's likely you can invoke the INFO functions more easily (lower permissions) than DMVs going forward. But I don't know much more, Microsoft just published these functions without much additional info. We look forward to getting more news. I agree that we'll not change anything (also for compatibility reasons) for a while...

m-kovalsky commented 10 months ago

Ah good point on permissions. That makes sense. I think just that DISCOVER DMV and something showing table row counts would be all that's needed for vertipaq analyzer to use these new functions.