canonical / action-build

A GitHub action for building Snapcraft projects
MIT License

Action never completes after successfully snapping #44

Open nicolasbock opened 2 years ago

nicolasbock commented 2 years ago

Hi,

After successfully completing and snapping, the action gets stuck (for lack of a better word) and eventually times out. It looks like the snapcraft process is waiting on something it is not seeing.

The latest build that exhibits this behavior can be found here:

https://github.com/nicolasbock/rabbitmq-server-snap/actions/runs/3177885875/jobs/5178819513#step:3:15178

Nick

sergiusens commented 2 years ago

Does it build fine without the action?

nicolasbock commented 2 years ago

Yes, locally I can build the snap in a VM or using LXD without such issues.

ilya-fedin commented 1 year ago

I'm having the same kind of problem (the logs stop in the middle of the build) at https://github.com/telegramdesktop/tdesktop since the GitHub runners were updated from 20230206.1 to 20230217.1. The repo has an action that calls snapcraft manually, but I tried switching to action-build here; sadly, it didn't help.

sergiusens commented 1 year ago

Hi @ilya-fedin, I took a quick look and this looks more like running out of memory or the build taking too long.

The original issue was because of docker being provisioned on 22.04 which caused snapcraft to stall and do nothing as it did not detect any network. This was fixed in the action with some iptables rules.

ilya-fedin commented 1 year ago

@sergiusens if it runs out of memory, shouldn't the action stop? As for the build taking too long: it took around 3 hours with the 20230206.1 runner, and now, since 20230217.1, it times out after 6 hours. I don't really believe anything could slow it down that much; it looks more like the connection to the LXD container hangs after some time.

sergiusens commented 1 year ago

@ilya-fedin I found that explained on the runner-images repo https://github.com/actions/runner-images/issues/1918

There's a way to get a login shell into the runner to figure out what's happening. I don't have an instruction set handy, but I am certain @mr-cal does.

sergiusens commented 1 year ago

I also have to walk back my comments on what this issue was about after reading the title and I cannot see the original linked log anymore :-(

nicolasbock commented 1 year ago

> The original issue was because of docker being provisioned on 22.04 which caused snapcraft to stall and do nothing as it did not detect any network. This was fixed in the action with some iptables rules.

Could you point me to where this is fixed, @sergiusens? My workflows are running on 20.04, not 22.04.

Since the old logs are gone, I restarted that exact workflow here:

https://github.com/nicolasbock/rabbitmq-server-snap/actions/runs/4348662023

ilya-fedin commented 1 year ago

> I found that explained on the runner-images repo actions/runner-images#1918

The description and comments make me think that in that case the problem was the action not hitting the timeout, whereas in my case the action does seem to hit the timeout (due to the snapcraft hang).

[image]

nicolasbock commented 1 year ago

> After successfully completing and snapping, the action gets stuck (for lack of a better word) and eventually times out. It looks like the snapcraft process is waiting on something it is not seeing.

@ilya-fedin no, the action times out in my case.

ilya-fedin commented 1 year ago

@nicolasbock I was talking about the linked issue in the quote.

nicolasbock commented 1 year ago

Ah, sorry, I misunderstood.

ilya-fedin commented 1 year ago

Downgrading the action to Ubuntu 20.04 helped in my case.
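
For readers who want to try the same workaround, a minimal workflow sketch follows. It is an illustration, not a configuration taken from this thread. The workflow and job names are hypothetical, the `@v1` and `@v3` version tags are assumptions, and it assumes the `ubuntu-20.04` runner label is still offered. The explicit `timeout-minutes` makes a hung build fail earlier than the default 360-minute job limit, which matches the 6-hour timeout reported above.

```yaml
# Hypothetical workflow sketch; names and version tags are illustrative assumptions.
name: build-snap
on: push

jobs:
  snap:
    # Pin the runner image instead of ubuntu-latest
    # (assumes the ubuntu-20.04 label is still available).
    runs-on: ubuntu-20.04
    # Fail well before the default 360-minute job limit if the build hangs.
    # 240 is an arbitrary value above the ~3 hour build time mentioned in the thread.
    timeout-minutes: 240
    steps:
      - uses: actions/checkout@v3
      - uses: canonical/action-build@v1
```

Pinning the image only sidesteps the hang; it does not explain why the build stalls on newer runner images.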