Closed masnax closed 5 months ago
@MusicDin might this be related to the other lxc exec lockup issues?
@simondeziel Looks like the same LXD version mismatch happening over here. I guess --cohort=+ isn't working?
but ideally we should address this in LXD
Did you open a bug already for this?
I haven't yet, thanks for the reminder.
The error message "The joining server version doesn't match (expected 5.21.1 with API count 386)" could be a bit more informative if it included the version/API count it got that didn't match.
Speaking of cohort, this reminds me there is this unexpected (to me) refresh of LXD. I'm not seeing why a refresh would be needed there and also without a channel being specified.
@tomponline The corresponding issue is here: https://github.com/canonical/lxd/issues/13425
@simondeziel Yeah that refresh looks redundant.
Good, can you drop it in this bandaid PR? If not, I'm happy to do it in a separate one.
I added it to #300
Closing as I've narrowed this down to the specific containers I set up back in September. Not sure what's up with the container, but we don't need this PR anymore.
If the test suite is run with SNAPSHOT_RESTORE=0 and CONCURRENT_SETUP=1, then lxc exec can occasionally get stuck and never return if the command being executed is fast enough. This adds a workaround by just sleeping for 1s before such commands, but ideally we should address this in LXD.
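For illustration only, the sleep-before-exec workaround could look roughly like the sketch below. The helper name lxc_exec_delayed and the LXC_EXEC_DELAY variable are hypothetical and not taken from this PR; only the `lxc exec <instance> -- <command>` invocation is the real CLI form.

```shell
#!/bin/sh
# Hypothetical helper (not from this PR): pause briefly before running
# a command through `lxc exec`, so that very fast commands are less
# likely to hit the lockup described above.
lxc_exec_delayed() {
    instance="$1"
    shift
    # Default delay is 1 second; LXC_EXEC_DELAY is an assumed override knob.
    sleep "${LXC_EXEC_DELAY:-1}"
    lxc exec "${instance}" -- "$@"
}
```

A call such as `lxc_exec_delayed micro01 hostname` (micro01 being a placeholder instance name) would then replace a direct `lxc exec micro01 -- hostname`. This only papers over the hang; it does not address the underlying issue in LXD.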