Azure / iotedge

Cannot generate support bundle #6934

Open mullerju opened 1 year ago

mullerju commented 1 year ago

Expected Behavior

The support team must be able to generate support bundle files via Direct Method (portal) when IoT modules crash, without impacting existing workloads. Generating a support bundle should not cause a reboot or crash of the modules; only a slowdown of some workloads is acceptable.

Current Behavior

Cannot generate support bundle. When asking the edge client to generate a support bundle, the process hangs indefinitely.

Steps to Reproduce

Ticket 2301090040007779

Context (Environment)

Device (host) operating system: Red Hat 8.2
Architecture: AMD64 (x86-64)

Runtime Versions

nyanzebra commented 1 year ago

@mullerju from the ticket referenced with @vipeller, it appears that support-bundle tries to collect some data that it can never obtain, and gets stuck as a result?

From a cursory look, it appears both in 1.2 and 1.4 there is a timeout on runtime checks (i.e. checking modules). Just to confirm from what is mentioned in the ticket, the support-bundle command works on 1.2 and doesn't on 1.4?

nyanzebra commented 1 year ago

@mullerju,

This does seem like a legitimate bug. We will look into adding a timeout so that support-bundle cannot hang.

Also, did the possible mitigation technique mentioned by @vipeller help?

echo '127.0.0.1 aka.ms' | sudo tee -a /etc/hosts

It works as follows: the checker wants to reach aka.ms/some_path to read the latest version. With the line above in place, aka.ms resolves immediately to localhost instead of going through a DNS server. The checker then connects to localhost (on port 443) and tries to read the version file, which most probably fails, so the checker simply moves on to the next step.

Once they have the support bundle, they should remove that line from /etc/hosts. If they never use aka.ms from that machine, they can instead leave it in place for the next support bundle, but in that case the check and support-bundle commands will never be able to get the latest version number (which does not work anyway, if I understand correctly).
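
The add-then-remove workflow described above could be scripted roughly as follows. This is only a sketch: the function names and the HOSTS_FILE variable are made up for illustration and are not part of iotedge, and it assumes the workaround line is exactly the one shown earlier.

```shell
#!/bin/sh
# Sketch: temporarily resolve aka.ms to localhost while collecting a support bundle.
# HOSTS_FILE, OVERRIDE_LINE, add_override and remove_override are illustrative names.

HOSTS_FILE="${HOSTS_FILE:-/etc/hosts}"
OVERRIDE_LINE='127.0.0.1 aka.ms'

# Append the override only if that exact line is not already present,
# so running this twice does not duplicate the entry.
add_override() {
    grep -qxF "$OVERRIDE_LINE" "$HOSTS_FILE" || echo "$OVERRIDE_LINE" >> "$HOSTS_FILE"
}

# Remove exactly the override line and leave every other entry untouched.
# The file is rewritten in place (cat > file) to keep its owner and permissions.
remove_override() {
    _tmp="$(mktemp)"
    grep -vxF "$OVERRIDE_LINE" "$HOSTS_FILE" > "$_tmp" || true
    cat "$_tmp" > "$HOSTS_FILE"
    rm -f "$_tmp"
}

# Example (run as root, since /etc/hosts is root-owned):
#   add_override
#   trap remove_override EXIT   # restore the file even if the command fails
#   iotedge support-bundle
```

Matching the whole line with grep -xF means only the exact workaround entry is added or removed, so any other aka.ms entry someone put in the file by hand is left alone.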

jlian commented 1 year ago

Fix is merged via https://github.com/Azure/iotedge/pull/6937

It will likely be included in the next release; ETA soon.

ghost commented 1 year ago

Thank you

github-actions[bot] commented 1 year ago

This issue is being marked as stale because it has been open for 30 days with no activity.