microsoft / Analysis-Services

Git repo for Analysis Services samples and community projects
MIT License
608 stars 417 forks source link

BPARules : Filter column and measure does not work properly #224

Open morelgeorge opened 1 year ago

morelgeorge commented 1 year ago

Hello, from my point of view following rules does not work properly:

In case the syntax of CALCULATE or CALCULATETABLE is simple, then it works fine. But let's imagine I have this piece of DAX expression: CALCULATE ( SUMX ( Sales, Sales[Net Price] * Sales[Quantity] ), FILTER( Sales, Sales[Currency Code] = "USD" ) )

or

CALCULATE ( SUMX ( Sales, Sales[Net Price] * Sales[Quantity] ), FILTER ( Sales, [NetSalesPerOrderDate] > 100 ) )

Then both rules are not violated, but I think they should be. Because of that I suggest to update rule definitions.

Filter column values with proper syntax

RegEx.IsMatch(Regex.Replace(Expression, "\s", String.Empty),"(?i)CALCULATE\(.*(?i),(?i)FILTER(?i)\([']?.+[']?,[']?.+[']?\[.*\].*") or RegEx.IsMatch(Regex.Replace(Expression, "\s", String.Empty),"(?i)CALCULATETABLE\(.*(?i),(?i)FILTER(?i)\([']?.+[']?,[']?.+[']?\[.*\].*")

Filter measure values by columns, not tables

RegEx.IsMatch(Regex.Replace(Expression, "\s", String.Empty),"(?i)CALCULATE\(.*(?i),(?i)FILTER(?i)\([']?.+[']?,\[.*\].*") or RegEx.IsMatch(Regex.Replace(Expression, "\s", String.Empty),"(?i)CALCULATETABLE\(.*(?i),(?i)FILTER(?i)\([']?.+[']?,\[.*\].*")

The idea is the same for both rules. At first I replace all the whitespaces from the DAX expression. Then I updated Regex to search. I know that the rule is violated when the FILTER is applied at Filter parameter of CALCULATE or CALCULATETABLE functions. The Filter parameter is always the second parameter for both functions. So I know that preceding character is a comma. And that is exactly what this regex string is doing.

I did some testing on my test dataset and it has been working fine. But I will appreciate any feedback from your side.

Regards, Jiri

otykier commented 1 year ago

Hi Jiri. Without a syntax tree representing the DAX expression, I think it's going to be very difficult to create a rule like this that works in the general case. For example, your rule does not take into account that the FILTER table could have been defined in a variable, or the first argument of FILTER could be another table function, etc. etc. Using Regex alone, you would basically need an infinitely large expression to account for all situations.

However, someone told me that this is the kind of thing DAX Optimizer intends to solve :)

morelgeorge commented 1 year ago

Hello, thank you for your reply. I agree that my update does not take into account with all the cases you mentioned. I believe there is always a space to violate rules like this. :-) But still I think the rule definition is slightly better than it is in current version of BPARules. What do you think?

Thank you for DAX optimizer hint. I signed to waitlist and can't wait to check!