balena-os / balena-supervisor

Balena Supervisor: balena's agent on devices.
https://balena.io

Supervisor fails when preloaded device attempts to pin after being moved to a different app #1105

Open roman-mazur opened 5 years ago

roman-mazur commented 5 years ago

After a device with preloaded containers is moved to a different app before it boots, the supervisor fails with the following log:

Oct 01 17:00:11 b928ea6 resin-supervisor[1792]: Event: Device bootstrap {}
Oct 01 17:00:11 b928ea6 resin-supervisor[1792]: Attempting to pin device to preloaded release...
Oct 01 17:00:12 b928ea6 resin-supervisor[1792]: Could not pin device to release!
Oct 01 17:00:12 b928ea6 resin-supervisor[1792]: Error:  Error: Cannot continue pinning preloaded device! No release found!
Oct 01 17:00:12 b928ea6 resin-supervisor[1792]:     at e.<anonymous> (/usr/src/app/dist/app.js:409:25954)
Oct 01 17:00:12 b928ea6 resin-supervisor[1792]:     at /usr/src/app/dist/app.js:409:13720
Oct 01 17:00:12 b928ea6 resin-supervisor[1792]:     at Object.next (/usr/src/app/dist/app.js:409:13825)
Oct 01 17:00:12 b928ea6 resin-supervisor[1792]:     at a (/usr/src/app/dist/app.js:409:12571)
Oct 01 17:00:12 b928ea6 resin-supervisor[1792]:     at process._tickCallback (internal/process/next_tick.js:68:7)
Oct 01 17:00:12 b928ea6 resin-supervisor[1792]: Event: Device bootstrap failed, retrying {"delay":30000,"error":{"message":"Cannot continue pinning preloaded device! No release found!","stack":"Error: Cannot cont>

Workaround included:

See https://github.com/curcuz/batch-supervisor-restart/blob/master/task.sh#L22

balena-ci commented 5 years ago

[roman-mazur] This issue has attached support thread https://jel.ly.fish/#/support-thread~277e09f0-b4a7-4779-b30a-747179955703

CameronDiver commented 5 years ago

I've clarified a little bit. This problem exists because the supervisor attempts to pin itself to the previous application after being moved.
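To make the failure mode concrete, here is a minimal sketch of one way the "No release found!" error could arise. It assumes the release lookup is scoped to a single application, so a commit preloaded from the old app cannot be found once the device belongs to a new one. The `Release` shape, `findReleaseId`, and `resolvePinTarget` are hypothetical names used for illustration, not the supervisor's actual code; only the error message is taken from the log above.

```typescript
// Hypothetical shapes; the field names are assumptions for illustration only.
interface Release {
  id: number;
  appId: number;
  commit: string;
}

// Look up a release by commit, restricted to a single application.
function findReleaseId(
  releases: Release[],
  appId: number,
  commit: string,
): number | undefined {
  return releases.find((r) => r.appId === appId && r.commit === commit)?.id;
}

// Resolve the release ID to pin to. Throws the same message seen in the log
// when the preloaded commit does not exist in the given application.
function resolvePinTarget(
  releases: Release[],
  appId: number,
  preloadedCommit: string,
): number {
  const releaseId = findReleaseId(releases, appId, preloadedCommit);
  if (releaseId === undefined) {
    throw new Error(
      "Cannot continue pinning preloaded device! No release found!",
    );
  }
  return releaseId;
}

// The image was preloaded from app 1, then the device was moved to app 2.
const releases: Release[] = [
  { id: 10, appId: 1, commit: "abc123" },
  { id: 20, appId: 2, commit: "def456" },
];

// Lookup against the original app succeeds.
const beforeMove = resolvePinTarget(releases, 1, "abc123");

// Lookup against the new app fails, because the commit belongs to app 1.
let moveError = "";
try {
  resolvePinTarget(releases, 2, "abc123");
} catch (err) {
  moveError = (err as Error).message;
}

console.log(beforeMove, moveError);
```

In this sketch the mismatch between the preloaded commit and the application used for the lookup is the whole problem: the release exists, just not in the application being queried.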

roman-mazur commented 5 years ago

@CameronDiver should we track the cloud API issue separately? After devices were moved to a different app, they still referred to the previous app releases.

CameronDiver commented 4 years ago

I'm not 100% sure what you mean, @roman-mazur. Which specific cloud issue?

roman-mazur commented 4 years ago

@CameronDiver after moving a device from one application to another, the device release pins were referring to a release from the old app. This sounds like a bug in the API implementation. We had to "re-pin" the devices first, and then clean the supervisor state to make it fetch the new release info.

CameronDiver commented 4 years ago

This doesn't really seem like an API issue. The supervisor has to specify a release ID when pinning a device; if that release ID is wrong, there's nothing the API can really do.
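If the release ID has to come from the supervisor, any sanity check would have to happen on the device before the pin request is sent. The following is a rough sketch of such a check, assuming the supervisor knows both the application it currently belongs to and the application of the preloaded release. `ReleaseInfo`, `PinDecision`, and `decidePin` are hypothetical names and do not exist in the supervisor codebase.

```typescript
// Hypothetical shape; the field names are assumptions for illustration only.
interface ReleaseInfo {
  id: number;
  appId: number;
}

// Either pin to a concrete release, or skip pinning and say why.
type PinDecision =
  | { kind: "pin"; releaseId: number }
  | { kind: "skip"; reason: string };

// Decide whether the preloaded release is a valid pin target for the
// application the device currently belongs to.
function decidePin(
  release: ReleaseInfo | undefined,
  currentAppId: number,
): PinDecision {
  if (release === undefined) {
    return { kind: "skip", reason: "preloaded release not found" };
  }
  if (release.appId !== currentAppId) {
    return {
      kind: "skip",
      reason:
        `release ${release.id} belongs to app ${release.appId}, ` +
        `but the device is in app ${currentAppId}`,
    };
  }
  return { kind: "pin", releaseId: release.id };
}

// Device still in the app it was preloaded from: pin as usual.
const sameApp = decidePin({ id: 10, appId: 1 }, 1);

// Device moved to another app before first boot: skip instead of failing.
const movedApp = decidePin({ id: 10, appId: 1 }, 2);

console.log(sameApp, movedApp);
```

Returning a "skip" decision rather than throwing is a design choice made for this sketch: it would let bootstrap continue unpinned instead of retrying every 30 seconds. Whether that is the desired behavior is a question for the maintainers.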