PoshCode / PowerShellPracticeAndStyle

The Unofficial PowerShell Best Practices and Style Guide
https://poshcode.gitbooks.io/powershell-practice-and-style
Other
2.24k stars 288 forks source link

Consider revising PERF-02 #148

Open jdtrouble opened 3 years ago

jdtrouble commented 3 years ago

I have some suggestions for improving "PERF-02 Consider trade-offs between performance and readability". I hope it's OK to group multiple "issues" here rather than spamming the Issues section.

The first issue, the page directs a user to drop into using a .NET framework to do something (basically read a file) when a PowerShell cmdlet already exists to do the same thing. The page actually called out the appropriate cmdlet a moment before. This is in contradiction to other advice that, generally, you should avoid using .NET or external tool when PowerShell can effectively do the same thing.

Second, the page uses the .NET code to do something that PowerShell handles natively with the Pipeline. I use Powershell a lot in my professional career, and I can say that 80% of time I'm handling collections of objects. Which computers in this list have x.yy version of Audio Driver? Which AD users are locked and/or password expired? The pipeline is king, the king is dead, all hail the king.

Get-Content -Path file.txt | ForEach-Object -Process { <#...#> } Of the options in the article, this is absolutely the best way to approach the input.

Heck, even "one object" collections circumstantially benefit from going through the pipeline. Consider the scenario, I want widget.exe to not run. For the purpose of this example I expect at most one instance of wdiget.exe to run. And maybe it's not running presently, I don't care. But I certainly want to stop, if it is running. Stop-Process -Name widget This will generate an error if widget.exe is not running. I don't want an error message in this case. If I'm stopping on errors, this will break the script.

Get-Process -Name widget | Stop-Process Even if there is never 2 or more instances of widget.exe, this is the best way to handle the use case (and IMO, it looks great). If the process is not running, then nothing gets passed to Stop-Process. Rather than generating an error, Stop-Process just sits there and twiddles its thumbs.

Less important but worth mentioning, the page uses what Wikipedia would call "Weasel words." See: https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style/Words_to_watch#Unsupported_attributions Examples, "many folks in the community", "Most folks", "Some would argue". Who are these people, and why should I care? I was sysadmin at a factory and I love the assembly line approach to my control scripts. I often script in this format:

param (
    # Bunch of variables I might want to customize in the future
)

# in production, functions actually reside in a module psm1 file I wrote, which I import here
function Verb-Stuff {
    param ( 
        [parameter(Mandatory,ValueFromPipeline)][type]$Object
        # Other parameters ...
    )
    Process {
        try {
            # Do stuff ...
        } 
        catch {
            # error handling
        }
    }
}

Get-Stuff -Property $Parameter | Where-Object { <#filter stuff#> } | Verb-Stuff | Out-Stuff 

I find the above to be very aesthetically pleasing, and easy to follow. I understand that not everyone will agree, but honestly I don't care. Aesthetics are nice when talking about tab lengths, max line lengths, NamingConventions, etc. When you are writing code in production, "aesthetics" are not a good reason to change the function of the code. To do so risks unplanned problems.

Jaykul commented 3 years ago

I agree about the weasel words. This book was originally written as a collaboration between a lot of people in the PowerShell community, and when there are weasel words it's literally a concession that there wasn't unanimous agreement between the authors, or among those polled. It's been a number of years now, and we should revise most of this -- I think we've built significantly more consensus in some areas, and in some areas, it's out of date with current practice.

However, for the main portion of your suggestion, I really don't understand what you're getting at. The purpose of this chapter and this section in particular is that "while everyone agrees that aesthetics are important - they help make scripts more readable, more maintainable, and so on - performance can also be important." In particular, the final example is there to demonstrate that even if you are focused on performance above all, you should still consider the readability and maintainability too (for which Don used the word "aesthetics").

When you reach this point in the Best Practices we have not discussed a lot about style, but that's because there's a whole Style Guide section dedicated to that. However, this chapter is specifically talking about the trade-off between performance and aesthetics. That's why it starts with the "for example" and moves to a "alternate approach" that uses the pipeline effectively, and finally to "a lower-level .NET Framework approach."

You may not care about performance at all. That's fine for now. Let's be clear though: a lot of people do, and must. In my opinion, the speed of execution is at least equally important as the maintainability of the code. Maintenance is done rarely -- and by relatively few people, but performance affects every usage of your code by every user.

The truth about performance of PowerShell code is that using language features (like the foreach(){} keyword) is faster than calling commands, and therefore, calling classes and methods directly instead of via functions or cmdlets is faster. In this example, for instance, there can be no doubt that the .net native approach is faster than the native PowerShell syntax:

image

In fact, you'll find that as long as the file will fit in memory without problems, the first approach using the language foreach without a pipeline is always faster than the pipeline using ForEach-Object, but never faster than using a StreamReader.