raspberrypi / rpi-imager

The home of Raspberry Pi Imager, a user-friendly tool for creating bootable media for Raspberry Pi devices.
https://www.raspberrypi.com/software
Other
1.55k stars 233 forks source link

Fix timeout on systems with large numbers of loops #872

Closed waveform80 closed 1 month ago

waveform80 commented 1 month ago

On systems with a very large number of snap packages installed (for example), there are a considerable number of loop devices. In this case, the lsblk command in linuxdrivelist fills the stdout pipe, blocks, and the rpi-imager process assumes it has timed out 1.

This PR is a trivial work-around that simply excludes loop devices (major=7) from the lsblk output. Given subsequent code 2 excludes everything starting with /dev/loop anyway, there should be no change in user experience with this exclusion other than rpi-imager not failing on such systems (with a very large number of loop devices).

However, it may be preferable to fix the call to lsblk so that it doesn't (potentially) block the main UI thread for up to two seconds and die when the lsblk output is sufficiently large to block the pipe (i.e. have it loop on QProcess.waitForReadyRead instead of just blocking on QProcess.waitForFinished). If this is preferred, I'm happy to submit a PR for that instead.

maxnet commented 1 month ago

In this case, the lsblk command in linuxdrivelist fills the stdout pipe

That isn't supposed to be happening as QProcess reads stuff to its own buffers that grow very well.

However, it may be preferable to fix the call to lsblk so that it doesn't (potentially) block the main UI thread

Code isn't run from main thread. (method that calls waitForFinished() is run from DriveListModelPollThread)

kenvandine commented 1 month ago

I don't think it actually blocks the UI, but it never populates the drive list.

tdewey-rpi commented 1 month ago

Happy with the workaround, thanks for the submission @waveform80

waveform80 commented 1 month ago

In this case, the lsblk command in linuxdrivelist fills the stdout pipe

That isn't supposed to be happening as QProcess reads stuff to its own buffers that grow very well.

I was quite surprised at that; I'd thought the calling process needed to respond to readyRead in order for QProcess to continue reading from the process; i.e. that it would only fill a single limited-size buffer without intervention, but a quick experiment confirmed you're absolutely right (read from a little Python script that spews out infinite garbage, and QProcess merrily eats gigabytes of memory while the calling process twiddles its thumbs!).

Anyway, thanks for merging that!

maxnet commented 1 month ago

Happy with the workaround

While this patch does not hurt to have, at the same time do note that it is unlikely to solve the actual problem reported.

Hint: we already print a debug warning message if lsblk takes longer than 1 second to complete. But do not actually kill the process if it is less than 2 seconds. And even if killed, it retries a second later.

As you can see in kenvandine's debug output in the other issue, it is indeed true that lsblk got killed 4 times due to timeouts, but it did also run at least 6 times successfully. If the disk did not appear at all even briefly, there is some other issue as well. Be it that the security restrictions imposed by snap hide the drive (note that he is not running stock rpi-imager), or that there is another problem. E.g. that the drive is not seen as removable by lsblk and therefore is hidden by Imager.