chocolatey / choco

Chocolatey - the package manager for Windows
https://chocolatey.org

installers that 'SetEnvironmentVariable'(s) from a service mess up corresponding service variable context #1668

Open mwallner opened 5 years ago

mwallner commented 5 years ago

what did I see?

after installing a package (Boxstarter 2.12.0) from a jenkins job, all other jobs that followed failed because powershell.exe was "broken" in that service session.

why am I seeing this?

as part of its install behavior, Boxstarter modified $env:PSModulePath - for both 'User' and 'Machine' level. (see Boxstarter setup.ps1) this is totally fine when run from a user session, but according to microsoft support, environment variable changes will not take effect until system reboot. (which is weird, because in my case a service restart seemed to do the trick as well).

WHAT?

As a result of changing $env:PSModulePath, whenever a jenkins job executed powershell.exe it quickly failed because "basic features" - such as "Split-String", "Join-Path" etc. - were missing. you'll see messages like The term 'Split-String' is not recognized as the name of a cmdlet, function, script file, or operable program...

And that's where I got completely confused: if environment variables changed from a service process are only re-read later, the change should at least not influence other processes that are started after that initial process finished. In my case $env:PSModulePath was neither the old value nor the new value it should have been - but rather something else.

backing log data

I've added a bunch of debug-printlns right before and after installing that upgrade + before starting my failing jobs, here are the results:

before upgrading Boxstarter:

--- printDebugVar ---
PSModulePath -    User: C:\Users\builduser\Documents\WindowsPowerShell\Modules;C:\Users\builduser\AppData\Roaming\Boxstarter
PSModulePath - Machine: C:\Program Files (x86)\PowerShell Community Extensions\Pscx3\;C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules
PSModulePath -     env: C:\Users\builduser\Documents\WindowsPowerShell\Modules;C:\Users\builduser\AppData\Roaming\Boxstarter;C:\Program Files (x86)\PowerShell Community Extensions\Pscx3\;C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules
---------------------

after upgrading Boxstarter: (same Job/Process as above)

--- printDebugVar ---
PSModulePath -    User: C:\Users\builduser\Documents\WindowsPowerShell\Modules
PSModulePath - Machine: C:\ProgramData\Boxstarter;C:\Program Files (x86)\PowerShell Community Extensions\Pscx3\;C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules
PSModulePath -     env: C:\Users\builduser\Documents\WindowsPowerShell\Modules;C:\Users\builduser\AppData\Roaming\Boxstarter;C:\Program Files (x86)\PowerShell Community Extensions\Pscx3\;C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules
---------------------

after upgrading Boxstarter: (new Job/Process, Jenkins-Service NOT restarted)

--- printDebugVar ---
PSModulePath -    User: C:\Users\builduser\Documents\WindowsPowerShell\Modules
PSModulePath - Machine: C:\ProgramData\Boxstarter;C:\Program Files (x86)\PowerShell Community Extensions\Pscx3\;C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules
PSModulePath -     env: C:\Users\builduser\Documents\WindowsPowerShell\Modules;C:\Users\builduser\AppData\Roaming\Boxstarter
---------------------

now here is the fun part: where does $env:PSModulePath come from? - it's C:\Users\builduser\Documents\WindowsPowerShell\Modules;C:\Users\builduser\AppData\Roaming\Boxstarter

in my tests, $env:PSModulePath has always been a concat of [Environment]::GetEnvironmentVariable('PSModulePath', 'Machine') and [Environment]::GetEnvironmentVariable('PSModulePath', 'User')

also, when querying from a user-session on the very same machine, $env:PSModulePath still looks ok.

after upgrading Boxstarter: (new Job/Process, Jenkins-Service was restarted)

--- printDebugVar ---
PSModulePath -    User: C:\Users\builduser\Documents\WindowsPowerShell\Modules
PSModulePath - Machine: C:\ProgramData\Boxstarter;C:\Program Files (x86)\PowerShell Community Extensions\Pscx3\;C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules
PSModulePath -     env: C:\Users\builduser\Documents\WindowsPowerShell\Modules;C:\ProgramData\Boxstarter;C:\Program Files (x86)\PowerShell Community Extensions\Pscx3\;C:\Program Files\WindowsPowerShell\Modules;C:\Windows\system32\WindowsPowerShell\v1.0\Modules
---------------------

my question(s)

ferventcoder commented 5 years ago

Can you also put in a full repro? Steps are helpful.

ferventcoder commented 5 years ago

why does a restart of the service alone solve the problem when microsoft support states you have to reboot the machine

This actually makes sense to me - I was hoping it was just a restart of the service and not the machine.

ferventcoder commented 5 years ago

after upgrading Boxstarter: (new Job/Process, Jenkins-Service NOT restarted)

It looks like it is carrying the old USER-only environment variables into the process and ignoring the machine variables.

Before upgrade (emphasis mine):

After upgrade (new job/process without restart):

ferventcoder commented 5 years ago

It seems like there is a bug somewhere in Update-SessionEnvironment that is triggered by a service call. We also have the element of subprocesses (child processes being called) that never persist changes to environment data back up to the parent - this is a known limitation of Windows. You can't add something to the PATH in a subprocess and have the parent process see it without something that helps it see that - refreshenv.cmd / Update-SessionEnvironment are the hacks that allow for this behavior to occur.

Can a simple repro be a powershell script that does the following:
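To illustrate the child-process limitation described above, here is a minimal, language-neutral sketch in Python (not PowerShell, and not anything Chocolatey ships). `DEMO_VAR` is a made-up variable name used only for the illustration. A child process gets a copy of the parent's environment, so a change made in the child stays in the child:

```python
import os
import subprocess
import sys

# DEMO_VAR is a hypothetical variable name used only for this illustration.
os.environ["DEMO_VAR"] = "parent-value"

# The child inherits a copy of the parent's environment block, changes its
# own copy, and prints what it sees.
child_code = (
    "import os\n"
    "os.environ['DEMO_VAR'] = 'child-value'\n"
    "print(os.environ['DEMO_VAR'])\n"
)
child_output = subprocess.run(
    [sys.executable, "-c", child_code],
    capture_output=True,
    text=True,
    check=True,
).stdout.strip()

# The parent's copy is untouched by whatever the child did.
parent_value = os.environ["DEMO_VAR"]
print("child saw:  ", child_output)
print("parent sees:", parent_value)
```

The same holds in the other direction of time: a process started later gets whatever environment its parent hands it at launch, which is why a long-running service can keep passing a stale environment to the jobs it spawns.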

Then run this through a service process to see the behavior.

mwallner commented 5 years ago

this bug seemingly doesn't exist or behaves differently in a Docker (microsoft/windowsservercore:ltsc2016) container.

I've created a little demo PS-script that should be able to cause the effect on Win 10 LTSB 1706 with Jenkins running as a Service. edit: link to ps-gist: https://gist.github.com/mwallner/5736ad119398decbe538bbbcbd1b8978

mwallner commented 5 years ago

I've run this script twice on the same node and the error occurred again. as @ferventcoder suggested, it really is the case that the old USER environment is set to $env:PSModulePath; the Machine environment only gets reloaded/updated once the service is restarted.

see https://gist.github.com/mwallner/5736ad119398decbe538bbbcbd1b8978#gistcomment-2750787 vs https://gist.github.com/mwallner/5736ad119398decbe538bbbcbd1b8978#gistcomment-2750788

mwallner commented 5 years ago

Adding Update-SessionEnvironment did not change anything, $env:PSModulePath still points to the old USER environment

see https://gist.github.com/mwallner/5736ad119398decbe538bbbcbd1b8978#gistcomment-2750802

mwallner commented 5 years ago

I think $env:PSModulePath is really treated separately from other env vars. maybe it'd be a temporary workaround to have Chocolatey (actually Update-SessionEnvironment) check if its value is "valid" and "fix" it if it doesn't contain the machine environment.
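A rough sketch of what such a "valid"/"fix" check could look like, written in Python only to show the logic. `repair_module_path` and `split_paths` are hypothetical helpers, not Chocolatey functions, and the assumption that missing machine-scope entries should simply be appended is mine:

```python
def split_paths(value):
    """Split a ';'-separated path list, dropping empty entries."""
    return [p for p in value.split(";") if p]


def repair_module_path(process_value, machine_value):
    """Append machine-scope entries that are missing from the process value.

    Comparison ignores case and trailing backslashes, since Windows paths
    are case-insensitive. Hypothetical helper, not part of Chocolatey.
    """
    def key(p):
        return p.rstrip("\\").lower()

    entries = split_paths(process_value)
    seen = {key(p) for p in entries}
    for p in split_paths(machine_value):
        if key(p) not in seen:
            entries.append(p)
            seen.add(key(p))
    return ";".join(entries)


# Values taken from the "Jenkins-Service NOT restarted" log in this issue.
process_value = (
    r"C:\Users\builduser\Documents\WindowsPowerShell\Modules;"
    r"C:\Users\builduser\AppData\Roaming\Boxstarter"
)
machine_value = (
    r"C:\ProgramData\Boxstarter;"
    r"C:\Program Files (x86)\PowerShell Community Extensions\Pscx3\;"
    r"C:\Program Files\WindowsPowerShell\Modules;"
    r"C:\Windows\system32\WindowsPowerShell\v1.0\Modules"
)
repaired = repair_module_path(process_value, machine_value)
print(repaired)
```

This only adds what is missing. It keeps the stale C:\Users\builduser\AppData\Roaming\Boxstarter entry, so it would restore the built-in modules but would not reproduce the exact value seen after a service restart.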

mwallner commented 5 years ago

seems to be a "known" problem: https://github.com/SharePoint/PnP-PowerShell/issues/37#issuecomment-150393027