fleetdm / fleet

Open-source platform for IT, security, and infrastructure teams. (Linux, macOS, Chrome, Windows, cloud, data center)
https://fleetdm.com

Software Package installs for Windows .exe and .msi installers stuck in Pending state #22558

Closed PezHub closed 1 month ago

PezHub commented 1 month ago

Fleet version: 4.57

Web browser and operating system: ANY


💥  Actual behavior

When attempting to install software on a Windows host, the install gets stuck in a pending state and never completes.

🧑‍💻  Steps to reproduce

  1. Upload an .exe or .msi installer to a team
  2. Deploy that software to a Windows host
  3. Observe that the install remains in a pending state

🕯️ More info (optional)

This occurs via Self Service or manual installations and can be reproduced with both .exe and .msi installers.

lucasmrod commented 1 month ago

@PezHub What version (commit) of fleetd are you using? I can see Agent 42.

Have you reproduced with stable (1.33.0)?

PezHub commented 1 month ago

Yes, my initial testing was with 1.33.0. Poor choice of screenshot on my part.

PezHub commented 1 month ago

Additional testing shows that installers uploaded prior to migrating to 4.57 succeed. Only installers uploaded after moving to 4.57+ get stuck in the pending state.

Test Results

ver4.56.0 = ✅
Uploaded the following installers:
  - Adobe
  - Evernote - installed OK
  - Firefox
  - Google Chrome.msi
  - NordVPN
  - Slack - installed OK
  - Zoom

Migrated to ver4.57.0
Existing installers:
  - Firefox - installed OK
  - Evernote - reinstalled OK
  - Google Chrome.msi - installed OK
  - Zoom - installed OK

Uploaded new installer:
  - VSCode.exe - stuck in pending

Migrated to ver4.57.1
  - Google MSI - reinstalled OK
  - NordVPN - installed OK

Uploaded new installer:
  - SublimeText.exe - stuck in pending

Additional Notes:

The default install script generated on upload differs between 4.56 and 4.57 for the same package. Example for evernote.exe:

4.56.0

$exeFilePath = "${env:INSTALLER_PATH}"

# extract the name of the executable to use as the sub-directory name
$exeName = [System.IO.Path]::GetFileName($exeFilePath)
$subDir = [System.IO.Path]::GetFileNameWithoutExtension($exeFilePath)

$destinationPath = Join-Path -Path $env:ProgramFiles -ChildPath $subDir

# check if the directory does not exist, and create it if necessary
if (-not (Test-Path -Path $destinationPath)) {
    New-Item -ItemType Directory -Path $destinationPath
}

# copy the .exe file to the new sub-directory
$destinationExePath = Join-Path -Path $destinationPath -ChildPath $exeName
Copy-Item -Path $exeFilePath -Destination $destinationExePath

4.57.0

# Learn more about .exe install scripts:
# http://fleetdm.com/learn-more-about/exe-install-scripts

$exeFilePath = "${env:INSTALLER_PATH}"

try {

# Add argument to install silently
# Argument to make install silent depends on installer,
# each installer might use different argument (usually it's "/S" or "/s")
$processOptions = @{
  FilePath = "$exeFilePath"
  ArgumentList = "/S"
  PassThru = $true
  Wait = $true
}

# Start process and track exit code
$process = Start-Process @processOptions
$exitCode = $process.ExitCode

# Prints the exit code
Write-Host "Install exit code: $exitCode"
Exit $exitCode

} catch {
  Write-Host "Error: $_"
  Exit 1
}

lucasmrod commented 1 month ago

IIRC the default exe install script is not supposed to work with every exe installer. Each installer may need some tweaking of the default install script.

/cc @getvictor

lucasmrod commented 1 month ago

Related comment: https://github.com/fleetdm/fleet/issues/20000#issuecomment-2217788241

roperzh commented 1 month ago

It might be a red herring. The script should either fail or succeed based on how the script contents relate to the installer itself; getting stuck in pending is not expected.

iansltx commented 1 month ago

Red herring sounds right, as MSIs are broken as well, and those have a reliable install flow.

I did a little more investigation last night in the Slack thread, but didn't see anything unique about the software installer record.

mna commented 1 month ago

I tested with the Firefox installer (a .exe) on 4.57.1 and it worked. I'll try again with the ones known to fail in Gabe's tests.


mna commented 1 month ago

I managed to install VSCode by fixing the install script to this:

# Learn more about .exe install scripts:
# http://fleetdm.com/learn-more-about/exe-install-scripts

$exeFilePath = "${env:INSTALLER_PATH}"

try {

# Add argument to install silently
# Argument to make install silent depends on installer,
# each installer might use different argument (usually it's "/S" or "/s")
$processOptions = @{
  FilePath = "$exeFilePath"
  ArgumentList = "/VERYSILENT /MERGETASKS=!runcode"
  PassThru = $true
  Wait = $true
}

# Start process and track exit code
$process = Start-Process @processOptions
$exitCode = $process.ExitCode

# Prints the exit code
Write-Host "Install exit code: $exitCode"
Exit $exitCode

} catch {
  Write-Host "Error: $_"
  Exit 1
}

Note the ArgumentList line: ArgumentList = "/VERYSILENT /MERGETASKS=!runcode". I believe the important part is /MERGETASKS=!runcode: it prevents the installer from starting the program at the end, which is what makes it seem like the install process never returns, so the install hangs and never completes. Found this here. Obviously this is installer-specific.


mna commented 1 month ago

I tested with the Slack installer (the one used in the QA Wolf ticket that was a duplicate of this one) and hit the same issue: it launches Slack at the end of the install, which makes it seem to Fleet as if the installation never completes (that it is still going on). I couldn't find the option to prevent launching after install, so I could not confirm a set of args that would work, but it's the same root issue.

getvictor commented 1 month ago

The pending issue is known already: https://github.com/fleetdm/fleet/issues/22155

Many EXE installers will fail when using the default script. Some MSI installers will also fail.

A few weeks ago, I proposed having a library of known/tested installers/scripts so that engineers/CS/customers can refer to them. We decided not to do that since we ultimately want to have a Fleet-managed apps library.

mna commented 1 month ago

Following the conversation on Slack: https://fleetdm.slack.com/archives/C03C41L5YEL/p1727894346150629?thread_ts=1727884144.284709&cid=C03C41L5YEL

I will implement the timeout for all scripts related to software installation (initial value will be 1h). So the fix will also address https://github.com/fleetdm/fleet/issues/22155.
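
A deadline can also be sketched inside the install script itself. This is not the fleetd change described above, only a hypothetical script-side variant of the default .exe script. The `$timeoutSeconds` variable and its value are illustrative and not a Fleet setting. The sketch assumes, per the PowerShell documentation for `Start-Process`, that `-Wait` waits for the installer and its descendant processes, whereas `WaitForExit` tracks only the installer process.

```powershell
# Sketch only: a script-side deadline for an .exe installer.
# Not Fleet's implementation; $timeoutSeconds is a hypothetical, illustrative value.
$exeFilePath = "${env:INSTALLER_PATH}"
$timeoutSeconds = 600

try {

# Same options as the default script, minus Wait, so the script controls the wait
$processOptions = @{
  FilePath = "$exeFilePath"
  ArgumentList = "/S"
  PassThru = $true
}

$process = Start-Process @processOptions

# Touch the handle so ExitCode stays readable after the process exits
# (works around a commonly reported Start-Process -PassThru quirk)
$null = $process.Handle

# WaitForExit(milliseconds) returns $false if the process is still running
if (-not $process.WaitForExit($timeoutSeconds * 1000)) {
  Write-Host "Installer still running after $timeoutSeconds seconds, stopping it"
  Stop-Process -Id $process.Id -Force
  Exit 1
}

$exitCode = $process.ExitCode
Write-Host "Install exit code: $exitCode"
Exit $exitCode

} catch {
  Write-Host "Error: $_"
  Exit 1
}
```

Whether the script then returns promptly still depends on the installer (for example, a program it launched may keep running), so treat this as a starting point to test per installer rather than a general fix.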

mna commented 1 month ago

Manual QA on my Windows machine, running fleetd with the fix (via local TUF):

As usual and expected, the VSCode installer hangs (using the default install script): [screenshot]

Install is shown as upcoming activity: [screenshot]

After the timeout (I modified it to 5m for my test, but it would be 1h): [screenshot]

So this appears to work as expected - the most important point being that it eventually unblocks the software installation queue if one install hangs.

@georgekarrv @PezHub

mna commented 1 month ago

I also tested a queue of one that fails with timeout (Slack) followed by one that is expected to succeed (firefox) and it worked as expected - the first timed out after ~5m (my modified fleetd), and then the next one proceeded and succeeded.


mna commented 1 month ago

For QA, I created this branch / draft PR (NOT TO BE MERGED) with a shorter timeout of 5m so that we don't have to wait forever to test this. Gabe already knows about it but fyi @georgekarrv

Branch: 22558-shorter-timeout-windows-installer-for-qa
Draft PR: https://github.com/fleetdm/fleet/pull/22596

PezHub commented 1 month ago

QA Notes: Testing on the temporary branch proved successful with the timeout limit set to 5min. I tested 3 known "failing" .exe installers (VSCode, Slack, & NordVPN) running thru similar workflows as Martin above.

This was all via local TUF, so I'll wait for the fleetd changes to get released to stable, then test again with the final branch set with a 1hr timeout before passing.

georgekarrv commented 1 month ago

It's on edge already; if you update your channel to edge, you can test now.

PezHub commented 1 month ago

I was able to test again off main using the edge channel and can confirm that after 60min VSCode.exe failed and the Firefox.exe install I had enqueued succeeded.

fleet-release commented 1 month ago

Windows install waits,
But with fix, programs take flight,
Ease in users' sights.