sql-bi / VertiPaq-Analyzer

MIT License
75 stars 13 forks source link

Very slight difference in numbers between 1.93 & 2.0 #6

Closed dbszepesi closed 5 years ago

dbszepesi commented 5 years ago

I do see a difference in some of the cardinality breakdowns between the two but they appear VERY small:

(Images removed by OP since issue closed)

marcosqlbi commented 5 years ago

Are you testing Analysis Services or Power BI? If you use Analysis Services, do you have partitions? In both cases, what is the result if you run a query in DAX Studio such as: EVALUATE ROW ("x", DISTINCTCOUNT ( ForecastByProduct_Comm[Mapping] ) ) (and similar for other columns that are different) Thanks!

Marco

dbszepesi commented 5 years ago

Sorry, should have specified. Power BI.

So Dax Studio (Running distinctcount) matches the 2.0 results.

Mapping, for instance, shows 368 in 1.93 but 369 in 2.0 and in Dax Studio.

EventName is 79 in 1.93, 80 in both 2.0 and Dax Studio.

marcosqlbi commented 5 years ago

Is this a table on the one/side of a dimension?

dbszepesi commented 5 years ago

Its a fact table - many side.

marcosqlbi commented 5 years ago

Interesting. The difference is the following: in 1.93 we use a DMV to get the size of the dictionary. In 2.00 if possible we use DISTINCTCOUNT over the column. A difference could be possible just after a refresh, with DMV returning a larger number that is fixed as soon as you reload the database in memory. Another difference could be that the dictionary is larger because of fragmentation (incremental update and partitions). Finally, a difference of 1 could be caused by the additional blank row on the on-side of a relatioship. None of these events could be applied here. So this is strange. Sorry for the question, but did you double check you ran the VPA 1.93 on the same database? If you made a refresh in the meantime data could have changed. If you confirm this is what you get from the same database, I don't have other explanations - unless a BLANK row is not stored in the dictionary - which could be possible and I never checked that. Let me know your findings, thanks!

dbszepesi commented 5 years ago

This model was created and last refreshed on another machine. I downloaded it to this machine to use the multiple different tools, etc. It has not been refreshed on this machine since copying it over.

Question - I am messing about with the DMV's in Dax Studio and am finding what looks like the different value (i.e. the 1.93 value).

For instance, in

select * from $SYSTEM.TMSCHEMA_COLUMN_STORAGES

Statistics_DistinctStates column for mapping shows 368, which is the 1.93 version. So the DMV's in Dax Studio show the 1.93 value but running distinctcount shows 369, which is the 2.0 value.\

Can you give me a more exact DMV call to test in Dax Studio? I really am fumbling about in the dark.

image

marcosqlbi commented 5 years ago

We use $SYSTEM.DISCOVER_STORAGE_TABLES for compatibility with 1103 models, but I guess the number are the same. Can you try this?

EVALUATE CALCULATETABLE ( ROW ("x", DISTINCTCOUNT ( ForecastByProduct_Comm[Mapping] ) ), NOT ISBLANK ( ForecastByProduct_Comm[Mapping] )

dbszepesi commented 5 years ago

That gave me a result of 368.

marcosqlbi commented 5 years ago

Ok so now I have an explanation! The dictionary never counts the blank value. Before 1.93 only uses the dictionary, it always ignore the blank value. If a column contains a blank value, there is a difference of 1 in the column. I would close the case as "by design" :-) Thank you very much for your help!