guysoft / CustomPiOS

A Raspberry Pi and other ARM devices distribution builder
GNU General Public License v3.0

Intermittent loop errors #80

Closed · myoung34 closed this 4 years ago

myoung34 commented 4 years ago

I'm attempting to build something based on the octopi layout.

This build worked, but I wanted to remove the if on the workflow: https://github.com/myoung34/tilty-pi/actions/runs/187201484

which caused this failure despite no changes: https://github.com/myoung34/tilty-pi/runs/923707874

Resizing file system to 1620 MB...
++++ awk '{print $4-0}'
++++ sfdisk -d 2020-05-27-raspios-buster-lite-armhf.img
++++ grep 2020-05-27-raspios-buster-lite-armhf.img2
+++ start=532480
+++ offset=272629760
+++ detach_all_loopback 2020-05-27-raspios-buster-lite-armhf.img
++++ losetup
++++ awk '{ print $1 }'
++++ grep 2020-05-27-raspios-buster-lite-armhf.img
+++ for img in $(losetup  | grep $1 | awk '{ print $1 }' )
+++ losetup -d /dev/loop2
+ exit 1
losetup: /dev/loop2: detach failed: No such device or address
##[error]Process completed with exit code 1.

which seems to be from https://github.com/guysoft/CustomPiOS/blob/master/src/common.sh#L143
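From the `+++` lines in the trace, the helper behaves roughly like the sketch below (a reconstruction, not the verbatim `common.sh` source); with `set -e` in effect, a single failed `losetup -d` aborts the whole build:

```shell
# Reconstructed sketch of detach_all_loopback, based on the trace above.
# $1 is the image filename; every loop device whose backing file matches
# it gets detached. Under `set -e`, one failed detach is fatal.
detach_all_loopback() {
  for img in $(losetup | grep "$1" | awk '{ print $1 }'); do
    losetup -d "$img"
  done
}
```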

other intermittent failures:

Hit re-run enough times and finally there's a success with no changes: https://github.com/myoung34/tilty-pi/runs/924379003

guysoft commented 4 years ago

I am not sure, perhaps its related to: https://github.com/guysoft/CustomPiOS/issues/55

If re-running doesn't always succeed, it sounds like that issue; it might also be a fluke. You might want to run it again nightly and see if it keeps happening. It does not in our other builds.

The GitHub Actions build is still pretty new to the project. We haven't started releasing with it; we still run a nightly build server to see how stable it is (no failures so far on OctoPi, but that's just one data point).

myoung34 commented 4 years ago

I haven't seen this happen in a while, going to close for now.

Salamandar commented 3 years ago

This happened to me on GitHub CI (GitHub Actions). I can't get it to work :'(

Resizing file system to 1880 MB...
++++ awk '{print $4-0}'
++++ grep 2020-02-13-raspbian-buster-lite.img2
++++ sfdisk -d 2020-02-13-raspbian-buster-lite.img
+++ start=532480
+++ offset=272629760
+++ detach_all_loopback 2020-02-13-raspbian-buster-lite.img
++++ losetup
++++ grep 2020-02-13-raspbian-buster-lite.img
++++ awk '{ print $1 }'
+++ for img in $(losetup  | grep $1 | awk '{ print $1 }' )
+++ losetup -d /dev/loop0
+ exit 1
losetup: /dev/loop0: detach failed: No such device or address
Error: Process completed with exit code 1.

https://github.com/Salamandar/StrangerFamily/runs/1871119298?check_suite_focus=true

myoung34 commented 3 years ago

@Salamandar it seems to be very inconsistent. It eventually started working 99% of the time with this occasionally happening, but when I first started building the image it failed 99% of the time and only worked a few times.

I honestly don't know what it is, but it eventually just started working. I think maybe there's a race condition on when /dev/loop0 comes up, and sometimes it's not there yet when the command tries to -d it.
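If that's the cause, something like this hypothetical probe (the `wait_for_node` name and timings are made up, not anything in CustomPiOS) could confirm it by polling for the node before detaching:

```shell
# Hypothetical probe for the suspected race: poll until the loop node
# shows up, or give up after `tries` * 0.1s. Uses -e (node exists);
# -b (is a block device) would be a stricter check.
wait_for_node() {
  local node="$1" tries="${2:-50}" i
  for ((i = 0; i < tries; i++)); do
    if [ -e "$node" ]; then
      return 0
    fi
    sleep 0.1
  done
  return 1
}

# e.g.: wait_for_node /dev/loop0 && losetup -d /dev/loop0
```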

Salamandar commented 3 years ago

@myoung34 Erf... I'm trying to build with ubuntu-20.04 instead of ubuntu-latest; maybe that will "solve" the issue for me.

guysoft commented 3 years ago

It might also be Docker versions.

Salamandar commented 3 years ago

Unfortunately that did not help... Only the loop device number is different (here it's loop10). @guysoft Could you please add a `|| true` to the `losetup -d` command? Or at least `if [[ -b $img ]]; then losetup -d $img; fi`.
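Roughly like this (a sketch of the suggestion, not the current `common.sh`; `-b` because loop devices are block devices):

```shell
# Sketch of a tolerant detach (hypothetical): skip devices that have
# already vanished, and swallow a failed detach so that `set -e`
# doesn't abort the whole build over it.
detach_all_loopback() {
  for img in $(losetup | grep "$1" | awk '{ print $1 }'); do
    if [[ -b "$img" ]]; then
      losetup -d "$img" || true
    fi
  done
}
```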

myoung34 commented 3 years ago

I think a better approach would be a bash `until` loop with a max iteration count/timeout, to see if it's a race condition that can be caught within a time limit (say 1-5 min).

guysoft commented 3 years ago

@myoung34 I am not sure it would work if it fails on the first try, but it might be worth testing. For now I rarely see it.

Salamandar commented 3 years ago

@guysoft I can try it, just tell me which branch to point my submodule to ;)