imfrancisd opened this issue 9 years ago
The values for the "Accuracy" property should not be numbers, but words like "Low", "Medium", and "High".
Since the accuracy of a diagnostic will be very subjective (made up by the rule writer, voted on by the community, etc.), using numbers will make the property seem objective when it is not. That could lead to pointless arguments, such as "that diagnostic should have an accuracy of 0.75612, not 0.75611!"
Also, using a small set of values will prevent the "race" between the users who want to work around the tool and the tool makers who want to prevent the users from working around the tool. For example, users who want to work around the tool may only display diagnostics with an accuracy higher than, say, 0.75. Tool makers who do not want their diagnostics ignored will increase the accuracy of their diagnostics to 0.76. And the race begins. However, with a small set of values, the race ends before it can even begin.
The main idea of this suggestion is just to give users and rule makers the ability to distinguish between diagnostics that are likely to be correct and diagnostics that can sometimes be wrong.
This way, there is less of a need to suppress or compromise the rules.
I don't like this idea personally. Accuracy works for something like OCR, where there is a very high amount of randomness in the data and, as a result, a very large margin for error. In the case of the rules in this tool, though, the margin for error is not that large. Rules either have bugs that need to be fixed, or rules are identified as unnecessary and become candidates for removal. Trying to add accuracy to these rules is only going to over-complicate things IMHO.
@KirkMunro
Rules either have bugs that need to be fixed or rules are identified as unnecessary and become candidates for removal. Trying to add accuracy to these rules is only going to over-complicate things IMHO.
There's a big difference between the rules and the diagnostics, and this suggestion is about the diagnostics.
Here are a few examples that demonstrate where I'm coming from.
I think we both agree that the way the "ProvideVerboseMessage" rule is written right now is bad. It's a bad rule because it just encourages everyone to write verbose messages for all advanced functions.
However, the diagnostic for this rule is very good. There are no false positives. If you see a diagnostic for this rule, you know that there is an advanced function that does not have a verbose message.
However, the rule is still bad.
The "UseShouldProcessForStateChangingFunctions" is a good rule because it gives users of functions that change system state the chance to do -WhatIf.
However, the diagnostic for this rule is really bad. It generates many false positives because it only looks at the name of the function (the verb in particular) in order to determine that the function changes state. Since the name doesn't determine whether or not the function changes system state, the diagnostic is wrong a lot of times.
However, the rule is still good.
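To make that false positive concrete: assuming "Set" is one of the verbs the rule treats as state-changing, a hypothetical function like the following would be flagged even though it changes nothing.

    # Hypothetical example: the verb "Set" trips the rule, but this function
    # only builds and returns a string; no system state is touched.
    function Set-OperationLabel {
        param([string]$Name)
        "Operation: $Name"
    }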
I completely agree that buggy rules should be fixed and that unnecessary rules need to be removed.
This suggestion is only for the diagnostics of rules that are deemed worthwhile.
I think false positives can never be eliminated from this tool because this fully automated tool deals with the subjective topic of Best Practices for a language that was not designed for static analysis.
If this tool only dealt with things like indentation and aliases, then yes, we could get rid of false positives. However, that would also make the tool completely useless, because those things are already handled by the ISE and by scripts/tools that automatically format scripts.
The interesting problems, like state changing functions, or comparison vs. filtering operations, or even functions accidentally returning unintended objects, are real problems that cannot have diagnostics that are always 100% correct because those things require a little bit of mind reading. Some of those problems require more mind reading than others and will have diagnostics that are less accurate than others.
Years from now, after the rules have been written and this tool is used by many people, the only thinking human in the system will be the one who has to deal with the diagnostics.
An "Accuracy" property on the diagnostics, or something similar, will give that person another piece of information that he/she can use to automatically prioritize work. That person can do something like:
$analyzerResults | Where-Object { $_.Accuracy -eq 'High' }
and have some confidence that he/she will not be chasing dead ends.
More importantly, an "Accuracy" property on the diagnostics can give that person a reason for disagreeing with a diagnostic without fear of being labeled as a rebellious cowboy.
For example, the scripter can say something like:
I'm not following this diagnostic because it doesn't make sense in this case. You can even see that the diagnostic has a low accuracy. I follow it when I can, but for this case, it's just wrong.
Without such a property, how can anyone disagree with a tool created by Microsoft in an open source project that is based on Best Practices created by PowerShell experts?
Even if a scripter gathers up the nerve to tell their non-technical peers and bosses (both with and without pointy hairs) that he/she will not follow the diagnostics from this tool, how can the scripter convincingly explain why he/she is right and the tool is wrong?
If the accuracy of the diagnostic is right there in the output, then the scripter can just point at the screen and say, "this is one of those times where the diagnostic is wrong." Without that, he/she just risks being branded as a rebellious, irresponsible cowboy.
The values for the "Accuracy" property should not be numbers, but words like "Low", "Medium", and "High".
I like this tweak and I kind of like the notion overall. Not a must-have for me, but I can see the value.
Suggestion
Add an "Accuracy" property to DiagnosticRecord that will state the probability of the diagnostic being correct.
Benefits for the Users
Users can filter and sort diagnostics based on their accuracy.
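For example (a sketch that assumes Invoke-ScriptAnalyzer emits the proposed property):

    # Keep only the diagnostics most likely to be correct, best first.
    Invoke-ScriptAnalyzer -Path .\MyScript.ps1 |
        Where-Object { $_.Accuracy -ge 0.75 } |
        Sort-Object Accuracy -Descending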
Benefits for the Rule Writers
Rule writers can introduce new rules sooner by giving them a low accuracy at first, and then increasing the accuracy as the rule is refined.
For example, the "UseShouldProcessForStateChangingFunctions" rule right now is based only on the function name, so it could be assigned an accuracy of, let's say, 0.25. If the rule is improved with better heuristics, then its accuracy can be increased. If, by default, Invoke-ScriptAnalyzer doesn't show diagnostics with an accuracy of less than 0.50, then users won't have to suppress "UseShouldProcessForStateChangingFunctions" while the rule is still in its initial stages.
This allows rule writers to get feedback on their rules while the rules are still being refined without adding noise to Invoke-ScriptAnalyzer results.
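A sketch of how that could work with nothing more than filtering on the proposed property (the 0.50 default is hypothetical):

    $all = Invoke-ScriptAnalyzer -Path .\MyScript.ps1

    # What most users would see if the default hid low-accuracy diagnostics:
    $all | Where-Object { $_.Accuracy -ge 0.50 }

    # What rule writers and early adopters could opt into for feedback:
    $all | Where-Object { $_.Accuracy -lt 0.50 }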
Benefits for the Rules
A single rule can emit diagnostics with different accuracies.
For example, the "PossibleIncorrectComparisonWithNull" rule will be more useful if it can output diagnostics with different accuracies based on context.
A line such as, for example:

    if ($a -ne $null) { ... }

should output a diagnostic for this rule with a high accuracy, because "$a -ne $null" in a condition like this is very likely to be a compare operation instead of a filter operation.
However, a line such as, for example:

    $c = $b -ne $null

should output a diagnostic for this rule with a low accuracy, because "$b -ne $null" can either be a compare operation or a filter operation (if $b is an array, it filters out the $null elements), and it will be very difficult to assert that the user violated the rule.
Now the rule doesn't have to hardcode a compromise about when to generate diagnostics, because it can generate all of them, each with the appropriate accuracy level.
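One way a rule could pick the accuracy level from context (a sketch of the idea, not the real rule's implementation; the coarse AST walk below would also match comparisons inside an if body, which a real rule would have to exclude):

    # Sketch: rate a "$x -ne $null" comparison by where it appears in the AST.
    function Get-NullComparisonAccuracy {
        param([System.Management.Automation.Language.BinaryExpressionAst]$Comparison)

        # Walk up the tree; inside an 'if' statement the intent is almost
        # certainly a comparison, so the diagnostic gets high accuracy.
        $node = $Comparison.Parent
        while ($node) {
            if ($node -is [System.Management.Automation.Language.IfStatementAst]) {
                return 'High'
            }
            $node = $node.Parent
        }

        # Anywhere else (e.g. an assignment) it may be a deliberate array
        # filter, so the diagnostic gets low accuracy.
        return 'Low'
    }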