fleetdm / fleet

Open device management
https://fleetdm.com

fleetd unresponsive on Windows #30217

Closed ksatter closed 1 day ago

ksatter commented 2 weeks ago

Fleet version:

fleetd version: Began on v1.41.0, persists on v1.43.0

Host OS: Windows 10 and 11 Pro

Web browser and operating system:


💥  Actual behavior

Certain Windows hosts were shown as 'online' in Fleet but had not refetched for an extended period.

The customer attempted to restart the fleet-osquery service. After this, hosts remained offline in Fleet.

The customer next rebooted the machine. After this, communication was re-established for a period, but eventually the host reverted to being 'online' without refetching. This may coincide with the test host going into a sleep state and then waking again, but that has not been confirmed.

Observations from the orbit-osquery logs on these hosts:

  1. Frequent errors reaching the TUF server and the orbit config and device token endpoints:
2025-06-18T02:43:41-04:00 INF update failed error="update metadata: client update: tuf: failed to download 8.root.json: Get \"https://updates.fleetdm.com/8.root.json\": read tcp [...]: wsarecv: An existing connection was forcibly closed by the remote host."

2025-06-10T01:40:43-04:00 INF network error error="POST /api/fleet/orbit/config: Post \"https://[...]/api/fleet/orbit/config\": dial tcp: lookup [...]: no such host"

2025-05-16T10:57:05-04:00 INF network error error="POST /api/fleet/orbit/config: Post \"https://{...}/api/fleet/orbit/config\": dial tcp: lookup [...]: no such host"

Full orbit-osquery logs, as well as Windows Event logs, are available in Unthread.
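For triaging the full logs, a rough tally of error types may help show whether DNS failures or dropped connections dominate on a given host. The following is a minimal sketch, not part of Fleet or orbit: the category names, the `classify` and `tally` helpers, and the assumed line layout (timestamp, three-letter level, message) are all hypothetical and based only on the three excerpts above.

```python
import re
from collections import Counter

# Hypothetical categories, keyed on substrings that appear in the excerpts above.
PATTERNS = {
    "dns_lookup_failed": re.compile(r"no such host"),
    "connection_reset": re.compile(r"forcibly closed by the remote host"),
}

# Assumed layout: "<timestamp> <LEVEL> <message>", as in the excerpts above.
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<level>[A-Z]{3}) (?P<msg>.*)$")


def classify(line):
    """Return a category name, "other", or None if the line does not parse."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    for name, pattern in PATTERNS.items():
        if pattern.search(match.group("msg")):
            return name
    return "other"


def tally(lines):
    """Count how many parsed lines fall into each category."""
    counts = Counter()
    for line in lines:
        category = classify(line)
        if category is not None:
            counts[category] += 1
    return counts


# The three log lines quoted in this issue, with redactions left as-is.
SAMPLE = [
    r'2025-06-18T02:43:41-04:00 INF update failed error="update metadata: client update: tuf: failed to download 8.root.json: Get \"https://updates.fleetdm.com/8.root.json\": read tcp [...]: wsarecv: An existing connection was forcibly closed by the remote host."',
    r'2025-06-10T01:40:43-04:00 INF network error error="POST /api/fleet/orbit/config: Post \"https://[...]/api/fleet/orbit/config\": dial tcp: lookup [...]: no such host"',
    r'2025-05-16T10:57:05-04:00 INF network error error="POST /api/fleet/orbit/config: Post \"https://{...}/api/fleet/orbit/config\": dial tcp: lookup [...]: no such host"',
]
```

Calling `tally(SAMPLE)` on the three quoted lines counts two DNS lookup failures and one connection reset. Running it over the full log file, and grouping by date, might show whether the failures cluster around particular times.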

I have not been able to reproduce this behavior, and do not currently see any patterns of behavior that explain why some hosts are unable to reach Fleet.
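Since the logged errors fall into two groups (name resolution failing, and an established connection being dropped), one way to narrow this down on an affected host could be a small probe that reports the DNS and TCP steps separately. This is a sketch under assumptions: `probe` is a hypothetical helper, not an orbit or fleetctl feature, and it only checks that a TCP connection can be opened, not TLS or the Fleet API itself.

```python
import socket


def probe(host, port=443, timeout=5.0):
    """Check DNS resolution and TCP connectivity as two separate steps."""
    result = {"dns": False, "tcp": False, "error": None}

    # Step 1: name resolution (the "no such host" errors fail at this step).
    try:
        infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    except socket.gaierror as exc:
        result["error"] = f"dns: {exc}"
        return result
    result["dns"] = True

    # Step 2: try each resolved address until one accepts a connection.
    last_error = None
    for family, socktype, proto, _canonname, sockaddr in infos:
        try:
            with socket.socket(family, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            result["tcp"] = True
            return result
        except OSError as exc:
            last_error = exc

    result["error"] = f"tcp: {last_error}"
    return result
```

For example, `probe("updates.fleetdm.com")` would test the TUF host named in the log above, and the same call with the customer's Fleet server hostname would test the orbit endpoints. Running it right after the host wakes from sleep could help confirm or rule out the sleep/wake theory.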

🧑‍💻  Steps to reproduce

  1. TODO
  2. TODO

🕯️ More info (optional)

N/A

Sampfluger88 commented 2 weeks ago

Linked to Unthread ticket:

Issue with remotely managing Fleet agent on Windows #6578

sharon-fdm commented 2 weeks ago

@xpkoala, this seems hard to reproduce. Perhaps we should allocate time for an engineer to look at the log. TMWYT

AndreyKizimenko commented 1 day ago

Based on the Unthread conversation, the issue appears to have been resolved by the reinstall. We'll keep an eye out for any recurrences, but for now, closing this issue.

fleet-release commented 1 day ago

In the glass city's glow, Fleet finds a steady rhythm, Windows hosts now flow.