raspberrypi / rpi-imager

The home of Raspberry Pi Imager, a user-friendly tool for creating bootable media for Raspberry Pi devices.
https://www.raspberrypi.com/software

[BUG] OS list download failures result in no device list, no warning #809

Open TheKnarf opened 7 months ago

TheKnarf commented 7 months ago

Describe the bug: The "Choose device" and "Choose OS" dropdowns do not seem to work.

To Reproduce: Raspberry Pi Imager v1.8.5 on macOS Sonoma 14.3 (Apple M3 Pro).

Expected behaviour: The app should work.

Screenshots or video Screenshot 2024-02-07 at 00 05 13

Screenshot 2024-02-07 at 00 05 04

Desktop (please complete the following information):

The name of the OS you are trying to write: N/A

Are you using OS Customisation? N/A

Additional context: N/A

TheKnarf commented 7 months ago

Screenshot 2024-02-07 at 00 11 09

Older version seems to work.

tdewey-rpi commented 7 months ago

Just to confirm - how have you installed 1.8.5 - did you use the dmg from our release page?

TheKnarf commented 7 months ago

> Just to confirm - how have you installed 1.8.5 - did you use the dmg from our release page?

Actually, I installed the .dmg from https://www.raspberrypi.com/software/.

Trying again today with the newest .dmg from the GitHub release page, it seems to work...

Screenshot 2024-02-07 at 11 11 41

tdewey-rpi commented 7 months ago

The only way I can see to reproduce your failure is to introduce a network discontinuity during the OS list download.

In that case, we should probably do something better than we do today - so I'm going to keep this bug open, but slightly rewrite the title.

Do you know if you were moving from a dock or similar as you opened the application?

TheKnarf commented 7 months ago

> Do you know if you were moving from a dock or similar as you opened the application?

I don't think I did anything special.

barryoneill commented 6 months ago

Having the same problem (MacBook Pro 2021 M1 Max, Sonoma 14.3.1).

I've tried the latest version (1.8.5) and a few previous versions, but they all behave the same.

❯ ./rpi-imager --version
./rpi-imager version 1.8.5
Repository: https://downloads.raspberrypi.org/os_list_imagingutility_v4.json

running it

❯ ./rpi-imager
OSX most preferred language: "en_US"
qt.qpa.fonts: Populating font family aliases took 165 ms. Replace uses of missing font family "Roboto" with one that exists to avoid this cost.
qt.tlsbackend.ossl: Failed to load libssl/libcrypto.

I get the same as the first screenshot in this ticket.

If I try to explicitly add LD_LIBRARY_PATH to where I know libssl and libcrypto are, I get additional output, unsure if it's helpful:

❯ export LD_LIBRARY_PATH=/opt/homebrew/Cellar/openssl@3/3.2.0_1/lib
❯ ./rpi-imager
OSX most preferred language: "en_US"
qt.qpa.fonts: Populating font family aliases took 167 ms. Replace uses of missing font family "Roboto" with one that exists to avoid this cost.
qt.tlsbackend.ossl: Failed to load libssl/libcrypto.
2024-02-23 11:08:30.964 rpi-imager[65725:6155371] WARNING: Secure coding is automatically enabled for restorable state! However, not on all supported macOS versions of this application. Opt-in to secure coding explicitly by implementing NSApplicationDelegate.applicationSupportsSecureRestorableState:.

I have also tried enabling and disabling the 'run with Rosetta' option, in case it had something to do with Intel/Mac compatibility.

maxnet commented 6 months ago

You always get the "qt.tlsbackend.ossl: Failed to load libssl/libcrypto." message, so that is not the problem. libssl is one option Qt can use, but it has others...

If you wait for a minute, does it give any other output lines? I ask because I have access to one device that gives a "Host downloads.raspberrypi.org not found" error after you wait a while. But that is a company-provided MacBook with extra "security" software, so that is to be expected...

barryoneill commented 6 months ago

OK, I let it run, and after 6 minutes it worked. After some investigation, I have concluded that it's an IPv6 issue, because if I switch to a terminal and do:

❯ wget https://downloads.raspberrypi.org/os_list_imagingutility_v4.json
--2024-02-23 12:42:48--  https://downloads.raspberrypi.org/os_list_imagingutility_v4.json
Resolving downloads.raspberrypi.org (downloads.raspberrypi.org)... 2a00:1098:82:47::2:1, 2a00:1098:82:47::1, 2a00:1098:84:1e0::1, ...
Connecting to downloads.raspberrypi.org (downloads.raspberrypi.org)|2a00:1098:82:47::2:1|:443... failed: Operation timed out.
Connecting to downloads.raspberrypi.org (downloads.raspberrypi.org)|2a00:1098:82:47::1|:443... failed: Operation timed out.
Connecting to downloads.raspberrypi.org (downloads.raspberrypi.org)|2a00:1098:84:1e0::1|:443... failed: Operation timed out.
Connecting to downloads.raspberrypi.org (downloads.raspberrypi.org)|2a00:1098:84:1e0::2|:443... failed: Operation timed out.
Connecting to downloads.raspberrypi.org (downloads.raspberrypi.org)|2a00:1098:82:47::1:1|:443... failed: Operation timed out.
^C

❯ wget --inet4-only https://downloads.raspberrypi.org/os_list_imagingutility_v4.json
--2024-02-23 12:57:29--  https://downloads.raspberrypi.org/os_list_imagingutility_v4.json
Resolving downloads.raspberrypi.org (downloads.raspberrypi.org)... 93.93.135.117, 46.235.227.39, 93.93.135.141, ...
Connecting to downloads.raspberrypi.org (downloads.raspberrypi.org)|93.93.135.117|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 116556 (114K) [application/json]
Saving to: ‘os_list_imagingutility_v4.json.5’

os_list_imagingutility_v4.json.5                100%[=======================================================================================================>] 113.82K   588KB/s    in 0.2s

2024-02-23 12:57:29 (588 KB/s) - ‘os_list_imagingutility_v4.json.5’ saved [116556/116556]

It would appear that IPv6 is not functioning on the network I'm on, so it's a problem on my end. Thanks!
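A minimal sketch of the same per-family check in Python, assuming only the standard library. The `probe` helper and its `getaddrinfo` parameter are hypothetical names for illustration and are not part of rpi-imager or wget. It resolves the host once per address family and tries a single TCP connect with a short timeout, which is roughly the comparison that `wget` versus `wget --inet4-only` made above.

```python
import socket


def probe(host, port=443, timeout=3.0, getaddrinfo=socket.getaddrinfo):
    """Try one TCP connect per address family and report what happened.

    Returns a dict such as {"IPv6": (address, status), "IPv4": (address, status)}.
    `getaddrinfo` is injectable (an assumption of this sketch) so the resolver
    can be replaced in tests.
    """
    results = {}
    for family, name in ((socket.AF_INET6, "IPv6"), (socket.AF_INET, "IPv4")):
        try:
            infos = getaddrinfo(host, port, family, socket.SOCK_STREAM)
        except socket.gaierror as exc:
            results[name] = (None, f"resolve failed: {exc}")
            continue
        sockaddr = infos[0][4]  # first address returned for this family
        try:
            with socket.socket(family, socket.SOCK_STREAM) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            results[name] = (sockaddr[0], "connected")
        except OSError as exc:
            results[name] = (sockaddr[0], f"failed: {exc}")
    return results


if __name__ == "__main__":
    for name, (address, status) in probe("downloads.raspberrypi.org").items():
        print(f"{name}: {address} -> {status}")
```

If the IPv6 entry reports a timeout while the IPv4 entry connects, the machine probably has an IPv6 address assigned but no working IPv6 route, which matches the symptom described here.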

maxnet commented 6 months ago

Yeah, that's a bit of a problem on RPI's side as well.

$ host downloads.raspberrypi.org
downloads.raspberrypi.org is an alias for lb.raspberrypi.org.
lb.raspberrypi.org is an alias for lb.raspberrypi.com.
lb.raspberrypi.com has address 93.93.135.118
lb.raspberrypi.com has address 46.235.231.145
lb.raspberrypi.com has address 46.235.227.39
lb.raspberrypi.com has address 176.126.240.86
lb.raspberrypi.com has address 176.126.240.167
lb.raspberrypi.com has address 176.126.240.84
lb.raspberrypi.com has address 46.235.231.111
lb.raspberrypi.com has address 93.93.135.117
lb.raspberrypi.com has address 46.235.231.151
lb.raspberrypi.com has address 46.235.230.122
lb.raspberrypi.com has address 93.93.135.141
lb.raspberrypi.com has address 93.93.130.212
lb.raspberrypi.com has IPv6 address 2a00:1098:88:26::1:1
lb.raspberrypi.com has IPv6 address 2a00:1098:88:26::1
lb.raspberrypi.com has IPv6 address 2a00:1098:80:56::1:1
lb.raspberrypi.com has IPv6 address 2a00:1098:80:56::3:1
lb.raspberrypi.com has IPv6 address 2a00:1098:82:47::1
lb.raspberrypi.com has IPv6 address 2a00:1098:88:26::2:1
lb.raspberrypi.com has IPv6 address 2a00:1098:82:47::1:1
lb.raspberrypi.com has IPv6 address 2a00:1098:84:1e0::1
lb.raspberrypi.com has IPv6 address 2a00:1098:84:1e0::2
lb.raspberrypi.com has IPv6 address 2a00:1098:84:1e0::3
lb.raspberrypi.com has IPv6 address 2a00:1098:80:56::2:1
lb.raspberrypi.com has IPv6 address 2a00:1098:82:47::2:1

If the client THINKS it has IPv6 connectivity (that is, it has an IPv6 address assigned, but IPv6 is not actually working), it will try out ALL the IPv6 addresses first before falling back to IPv4.

So it will have to time out 12 times first(!), in this case...
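As a rough illustration of that arithmetic, here is a small sketch. It assumes the attempts run strictly one after another and that each one waits for the same timeout; neither assumption is confirmed in this thread, and the function names are hypothetical.

```python
def sequential_fallback_delay(n_addresses: int, timeout_s: float) -> float:
    """Total wait if every address must time out in turn before the next is tried."""
    return n_addresses * timeout_s


def implied_timeout(total_delay_s: float, n_addresses: int) -> float:
    """Per-attempt timeout that would explain an observed total delay."""
    return total_delay_s / n_addresses


if __name__ == "__main__":
    n_ipv6 = 12  # IPv6 records listed by `host` above
    observed_s = 6 * 60  # "after 6 minutes, it worked"
    print(f"implied per-attempt timeout: {implied_timeout(observed_s, n_ipv6)} s")
    print(f"total at that timeout: {sequential_fallback_delay(n_ipv6, 30.0)} s")
```

Under those assumptions, the reported 6 minutes would correspond to about 30 seconds per address. The real per-attempt timeout is not stated anywhere above, so treat this only as a plausibility check.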

barryoneill commented 6 months ago

A little too much redundancy :)

tdewey-rpi commented 6 months ago

Thanks for the debug, @maxnet

I'm not sure this adds up, though. Qt implemented the Happy Eyeballs algorithm with Qt5, so I might have expected QNetworkAccessManager or equivalent to follow that host selection mechanism: https://codereview.qt-project.org/c/qt/qtbase/+/1003

Failing in this manner suggests our HTTP client isn't actually using the HE algorithm, which would almost certainly be an exotic bug that @barryoneill has identified (good catch!) - assuming that v4 connectivity was OK.
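For readers unfamiliar with Happy Eyeballs (RFC 8305), the core idea can be sketched as follows: interleave the address families, start connection attempts a short interval apart, and use whichever attempt succeeds first. This is a simplified simulation and not Qt's implementation. `happy_eyeballs`, `interleave` and `fake_connect` are hypothetical names, the connections are simulated with `asyncio.sleep`, and the 0.25 s stagger is an illustrative default.

```python
import asyncio
from itertools import zip_longest


def interleave(v6, v4):
    """Alternate families so IPv4 is tried early: v6, v4, v6, v4, ..."""
    ordered = []
    for a, b in zip_longest(v6, v4):
        if a is not None:
            ordered.append(a)
        if b is not None:
            ordered.append(b)
    return ordered


async def happy_eyeballs(addresses, connect, stagger_s=0.25):
    """Start attempts `stagger_s` apart and return the first successful result."""
    remaining = list(addresses)
    pending = set()
    errors = []
    try:
        while remaining or pending:
            if remaining:
                pending.add(asyncio.create_task(connect(remaining.pop(0))))
            # Wait at most one stagger interval while more addresses are queued.
            timeout = stagger_s if remaining else None
            done, pending = await asyncio.wait(
                pending, timeout=timeout, return_when=asyncio.FIRST_COMPLETED
            )
            winner = None
            for task in done:
                if task.exception() is None:
                    if winner is None:
                        winner = task
                else:
                    errors.append(task.exception())
            if winner is not None:
                return winner.result()
        raise OSError(f"all {len(errors)} connection attempts failed")
    finally:
        for task in pending:  # abandon the slower attempts
            task.cancel()


async def fake_connect(address):
    """Simulated connect: IPv6 hangs and then fails, IPv4 succeeds quickly."""
    if ":" in address:
        await asyncio.sleep(30)
        raise OSError("Operation timed out")
    await asyncio.sleep(0.02)
    return address


if __name__ == "__main__":
    order = interleave(["2001:db8::1", "2001:db8::2"], ["192.0.2.1"])
    print(asyncio.run(happy_eyeballs(order, fake_connect)))
```

With this scheme a broken IPv6 path costs roughly one stagger interval rather than one full timeout per IPv6 address, which is why a multi-minute delay would be surprising from a client that follows the algorithm.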

lurch commented 6 months ago

> the Happy Eyeballs algorithm

Best algorithm name ever! :joy:

barryoneill commented 6 months ago

> Failing in this manner suggests our HTTP client isn't actually using the HE algorithm, which would almost certainly be an exotic bug that @barryoneill has identified (good catch!) - assuming that v4 connectivity was OK.

Confirmed that v4 connectivity is virtually instant for me.

tdewey-rpi commented 6 months ago

Unfortunately, this looks fairly intractable. I took an hour to dive into how they've implemented Happy Eyeballs, and noticed two things:

  1. It's incredibly complicated, spanning across multiple Qt classes.
  2. There appears to be a hard-coded 300 second delay on attempting IPv4 connections at all - and some logic around stopping that timer that I wasn't able to get a clear understanding of in the time I allocated.

Essentially it looks like in cases where IPv6 routing is subtly broken, Qt will enforce a ~5 minute delay. Which actually correlates with @barryoneill's sleuthing above.

I'm going to mark this bug as something to look at for 2.0. I'm not happy at all with how Qt has implemented this.

lurch commented 6 months ago

Sad Eyeballs :eyes: :cry:

Karim9833 commented 4 months ago

Had the same issue. It seems that having IPv6 enabled in the router settings causes this bug. I turned off IPv6 in my settings and it worked right after.

tdewey-rpi commented 3 months ago

@Karim9833 Glad that resolved the problem for you. Unfortunately, flaky IPv6 support appears to be more harmful than no IPv6 support - at least as far as Qt applications are concerned.

maxnet commented 3 months ago

> @Karim9833 Glad that resolved the problem for you. Unfortunately, flaky IPv6 support appears to be more harmful than no IPv6 support - at least as far as Qt applications are concerned.

Note that delayedConnectionTimer is a QTimer. QTimer intervals are in milliseconds, not seconds, so the 300 is 0.3 seconds rather than 5 minutes. Giving IPv6 a 0.3 second head start is pretty reasonable...

So I suggest you double-check whether Qt is really to blame ;-) and not some other component. (For example, a certificate revocation check done by the SSL library could be holding things up, in which case the problem may not show up for all users with flaky IPv6, because the library used differs by platform.)

tdewey-rpi commented 3 months ago

I concluded investigations in this area ~3 months ago, at which point the lack of flexibility in Qt's networking implementation (specifically, in terms of offering a user the ability to force IPv4 for their situation) and the baroque form of the implementation caused me to consider a more conventional alternative - libcurl, which has a much stronger claim for cross-platform support, testing and ongoing attention.

I could of course be convinced that this is the wrong direction - but the proposed alternative would have to show a better track record of support than the one offered by the curl project.

maxnet commented 3 months ago

libcurl also has write-once-debug-everywhere problems. Depending on your platform, Linux distribution and phase of the moon, it could be outsourcing SSL to 8 (!) different SSL libraries (https://curl.se/docs/ssl-compared.html), and may give different error codes if things go wrong.

tdewey-rpi commented 3 months ago

> libcurl also has write-once-debug-everywhere problems. Depending on your platform, Linux distribution and phase of the moon, it could be outsourcing SSL to 8 (!) different SSL libraries.

As does Qt - so the evidence bar has not been cleared.