canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

[Upgrade from mixed-storage LXD] LXD won't restart after upgrade to 2.10.1 #3026

Closed. sbworth closed this issue 7 years ago.

sbworth commented 7 years ago

Hello,

I believe I have a variant of the problem seen in issue #3024, which I have been following with interest. After upgrading from 2.8.x to 2.10.1, LXD cannot start.

Required information

Issue description

I have two systems, sys1 and sys2. Sys1 is using dir storage, while sys2 is using LVM.

With sys1, I migrated from 2.8.x to 2.9.x and then to 2.10.x. After resolving an issue with a change in how the disk device is inherited from the profile following the 2.9.x upgrade, sys1 seems to have upgraded to 2.10.x OK.

With sys2, I migrated directly from 2.8.x to 2.10.x. This was inadvertent, as I had just sorted out the 2.9.x issue on sys1 and intended to move sys2 to 2.9.x first. When LXD attempted to restart, the lxc command-line client stopped responding.

Checking /var/log/lxd/lxd.log, we see:

lvl=info msg="LXD 2.10.1 is starting in normal mode" path=/var/lib/lxd t=2017-03-06T14:34:02-0500
lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." t=2017-03-06T14:34:02-0500
lvl=info msg="Kernel uid/gid map:" t=2017-03-06T14:34:02-0500
lvl=info msg=" - u 0 0 4294967295" t=2017-03-06T14:34:02-0500
lvl=info msg=" - g 0 0 4294967295" t=2017-03-06T14:34:02-0500
lvl=info msg="Configured LXD uid/gid map:" t=2017-03-06T14:34:02-0500
lvl=info msg=" - u 0 100000 65536" t=2017-03-06T14:34:02-0500
lvl=info msg=" - g 0 100000 65536" t=2017-03-06T14:34:02-0500
lvl=warn msg="Database already contains a valid entry for the storage pool: lxd." t=2017-03-06T14:34:03-0500
lvl=warn msg="Storage volumes database already contains an entry for the container." t=2017-03-06T14:34:03-0500
lvl=info msg="LXD 2.10.1 is starting in normal mode" path=/var/lib/lxd t=2017-03-06T14:44:02-0500
lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." t=2017-03-06T14:44:02-0500
lvl=info msg="Kernel uid/gid map:" t=2017-03-06T14:44:02-0500
lvl=info msg=" - u 0 0 4294967295" t=2017-03-06T14:44:02-0500
lvl=info msg=" - g 0 0 4294967295" t=2017-03-06T14:44:02-0500
lvl=info msg="Configured LXD uid/gid map:" t=2017-03-06T14:44:02-0500
lvl=info msg=" - u 0 100000 65536" t=2017-03-06T14:44:02-0500
lvl=info msg=" - g 0 100000 65536" t=2017-03-06T14:44:02-0500
lvl=warn msg="Database already contains a valid entry for the storage pool: lxd." t=2017-03-06T14:44:03-0500
lvl=warn msg="Storage volumes database already contains an entry for the container." t=2017-03-06T14:44:03-0500

Output of journalctl -u lxd:

Mar 06 14:34:02 sys2 systemd[1]: Starting LXD - main daemon...
Mar 06 14:34:02 sys2 lxd[4416]: lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." t=2017-03-06T14:34:02-0500
Mar 06 14:34:03 sys2 lxd[4416]: lvl=warn msg="Database already contains a valid entry for the storage pool: lxd." t=2017-03-06T14:34:03-0500
Mar 06 14:34:03 sys2 lxd[4416]: lvl=warn msg="Storage volumes database already contains an entry for the container." t=2017-03-06T14:34:03-0500
Mar 06 14:34:13 sys2 lxd[4416]: error: device or resource busy
Mar 06 14:34:13 sys2 systemd[1]: lxd.service: Main process exited, code=exited, status=1/FAILURE
Mar 06 14:44:02 sys2 lxd[4417]: error: LXD still not running after 600s timeout.
Mar 06 14:44:02 sys2 systemd[1]: lxd.service: Control process exited, code=exited status=1
Mar 06 14:44:02 sys2 systemd[1]: Failed to start LXD - main daemon.
Mar 06 14:44:02 sys2 systemd[1]: lxd.service: Unit entered failed state.
Mar 06 14:44:02 sys2 systemd[1]: lxd.service: Failed with result 'exit-code'.
Mar 06 14:44:02 sys2 systemd[1]: lxd.service: Service hold-off time over, scheduling restart.
Mar 06 14:44:02 sys2 systemd[1]: Stopped LXD - main daemon.
Mar 06 14:44:02 sys2 systemd[1]: Starting LXD - main daemon...
Mar 06 14:44:02 sys2 lxd[8637]: lvl=warn msg="CGroup memory swap accounting is disabled, swap limits will be ignored." t=2017-03-06T14:44:02-0500
Mar 06 14:44:03 sys2 lxd[8637]: lvl=warn msg="Database already contains a valid entry for the storage pool: lxd." t=2017-03-06T14:44:03-0500
Mar 06 14:44:03 sys2 lxd[8637]: lvl=warn msg="Storage volumes database already contains an entry for the container." t=2017-03-06T14:44:03-0500
Mar 06 14:44:13 sys2 lxd[8637]: error: device or resource busy
Mar 06 14:44:13 sys2 systemd[1]: lxd.service: Main process exited, code=exited, status=1/FAILURE
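
For what it's worth, the "device or resource busy" error often points at something under /var/lib/lxd still being mounted; a hedged way to check with standard tooling (not taken from this thread):

grep /var/lib/lxd /proc/mounts
# or:
findmnt | grep /var/lib/lxd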

Sample of /var/lib/lxd/containers:

drwx------+ 5 root   root    4096 Jan  2 15:59 astro3
lrwxrwxrwx  1 root   root      16 Jan  2 11:35 astro3.lv -> /dev/lxd/astro3
drwxr-xr-x+ 5 root   root    4096 Jan  9 12:13 vault
lrwxrwxrwx  1 root   root      14 Jan 11 15:01 vault.lv -> /dev/lxd/vault
drwx------  2 root   root    4096 Feb 15 13:48 vgate1
lrwxrwxrwx  1 root   root      15 Feb 15 13:48 vgate1.lv -> /dev/lxd/vgate1
drwxr-xr-x+ 5 root   root    4096 Jan 18 13:52 vpn1
lrwxrwxrwx  1 root   root      13 Jan 25 16:22 vpn1.lv -> /dev/lxd/vpn1

File tree listing of /var/lib/lxd/storage-pools:

.
└── lxd
    └── containers

That is, the storage-pools area is empty. (Were the container rootfs links supposed to be migrated to storage-pools?)
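
For comparison, a minimal sketch of what the layout is expected to look like once the storage-api migration has completed, assuming each container is relocated under the pool mountpoint with a symlink left behind (the target path shown is an assumption, not taken from this system):

ls -l /var/lib/lxd/containers
# expected after a completed migration (assumption):
# astro3 -> /var/lib/lxd/storage-pools/lxd/containers/astro3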

The images area seems untouched:

root@sol2:/var/lib/lxd/containers# ls /var/lib/lxd/images/
11fc1b1d39b9f9cd7e9491871f1421ac4278e1d599ecf5d180f2a6e2483bd172
11fc1b1d39b9f9cd7e9491871f1421ac4278e1d599ecf5d180f2a6e2483bd172.lv
11fc1b1d39b9f9cd7e9491871f1421ac4278e1d599ecf5d180f2a6e2483bd172.rootfs
18e7ed74d0d653894f65343afbc35b92c6781933c273943d882c36a5c5535533
18e7ed74d0d653894f65343afbc35b92c6781933c273943d882c36a5c5535533.lv
457a80ea4720900b69e5542cea5351f58021331bc96e773e4855a3e2ce1e6595
457a80ea4720900b69e5542cea5351f58021331bc96e773e4855a3e2ce1e6595.lv
457a80ea4720900b69e5542cea5351f58021331bc96e773e4855a3e2ce1e6595.rootfs
543e662b70958f5b87f68b20eb0a205d8c4b14c41f80699e9a98b3b851883d15
543e662b70958f5b87f68b20eb0a205d8c4b14c41f80699e9a98b3b851883d15.lv
543e662b70958f5b87f68b20eb0a205d8c4b14c41f80699e9a98b3b851883d15.rootfs
a570ce23e1dae791e7b8b2f2bcb98c1404273e97c7a1fb972bf0f5835ac3e869
a570ce23e1dae791e7b8b2f2bcb98c1404273e97c7a1fb972bf0f5835ac3e869.lv
b5b03165de7c450f5f9793c8b2eb4a364fbd81124a01511f854dd379eef52abb
b5b03165de7c450f5f9793c8b2eb4a364fbd81124a01511f854dd379eef52abb.rootfs
bfd17410a8c7fe6397dba3e353a23001243bc43af87acf25544d6b0ab624f9f8
bfd17410a8c7fe6397dba3e353a23001243bc43af87acf25544d6b0ab624f9f8.rootfs
d7c16c4fedd3308b5bffdb91f491b8458610c6115d37ace8ba4bcf5c29b23cc6
d7c16c4fedd3308b5bffdb91f491b8458610c6115d37ace8ba4bcf5c29b23cc6.lv
d7c16c4fedd3308b5bffdb91f491b8458610c6115d37ace8ba4bcf5c29b23cc6.rootfs
e12c3c1aed259ce62b4a5e8dc5fe8b92d14d36e611b3beae3f55c94df069eeed
e12c3c1aed259ce62b4a5e8dc5fe8b92d14d36e611b3beae3f55c94df069eeed.lv
ff52f536d2896f358bc913d592828ecf1b39fae45e4ee4825930091e8793ac28
ff52f536d2896f358bc913d592828ecf1b39fae45e4ee4825930091e8793ac28.rootfs

Output from pvs and vgs, plus (heavily edited for readability) output from lvs:

  PV         VG   Fmt  Attr PSize   PFree  
  /dev/sda5  dat1 lvm2 a--  931.13g 181.13g
  /dev/sda6  lxd  lvm2 a--    2.56t      0 
  VG   #PV #LV #SN Attr   VSize   VFree  
  dat1   1   1   0 wz--n- 931.13g 181.13g
  lxd    1  42   0 wz--n-   2.56t      0 
 LV        VG   Attr       LSize   Pool   Origin   Data%  Meta%
 LXDPool  lxd  twi-aotz--   2.56t                   3.91   2.12
 astro3   lxd  Vwi-aotz--  10.00g LXDPool          20.69 
 vault    lxd  Vwi-aotz--  10.00g LXDPool          12.34
 vgate1   lxd  Vwi-a-tz-- 300.00g LXDPool           1.85
 vpn1     lxd  Vwi-aotz-- 300.00g LXDPool           1.88

Data from lxd.db:

sqlite> select * from storage_pools;
1|lxd|lvm
sqlite> select * from storage_pools_config;
166|1|volume.size|300GB
167|1|size|21GB
168|1|source|lxd
169|1|lvm.thinpool_name|LXDPool
170|1|lvm.vg_name|lxd
sqlite> select * from storage_volumes;
1|astro3|1|0
sqlite> select * from storage_volumes_config;
67|1|block.filesystem|ext4
68|1|size|300GB

It looks somewhat odd to me that container astro3 has entries in the storage_volumes tables when nothing else does. It does differ in being a privileged container.

Any help you can provide to get regular access restored will be greatly appreciated. For the moment, the containers continue to provide their services. Let me know if I can provide any other useful data or perform any non-destructive tests.

sbworth commented 7 years ago

OK. Thanks for the clarification. I was hoping to resolve this tonight, but I have run out of gas. More tomorrow.

stgraber commented 7 years ago

Did you get to run the last binary I gave you? That should at least tell us what command is returning that error.

sbworth commented 7 years ago

I didn't run it. I have discovered one container that does not have an LV as backing store, despite the fact that it was created recently. I was going to investigate a bit more. Is it possible that one of the other minor releases in the 2.8 series had a bug that could have resulted in a dir backing store instead of an LV? Is it possible for an individual container to have a container-specific setting that differs from the default? (I certainly did not do it on purpose.) I think that container is what stopped LXD in the last run, because it is alphabetically next after the successful sequence, so we may be very close.

stgraber commented 7 years ago

I can't remember any bug in the LVM code which would have caused LXD to create a directory-backed container rather than an LVM one when the storage.lvm_vg_name key is set.

It's the kind of thing that could happen with btrfs, because LXD would silently fall back if btrfs isn't detected anymore, but for LVM we only check the storage key...
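
A hedged way to double-check which storage key the daemon is using on a pre-storage-api install (this assumes the 2.x server config interface and the config table in lxd.db):

lxc config show | grep 'storage\.'
sqlite3 /var/lib/lxd/lxd.db "SELECT key, value FROM config WHERE key LIKE 'storage.%';"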

brauner commented 7 years ago

You could probably cause this by creating a directory container and then switching to LVM by setting the config key. Will try to reproduce. In any case, we might need to handle this corner case as well and convert dir-backed containers detected during upgrade to LVM-backed containers.
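
A minimal reproduction sketch along those lines, assuming a 2.8-era daemon, an existing volume group named lxd, and illustrative container names:

lxc launch ubuntu:16.04 c-dir           # no LVM key set yet, so this container lands on dir storage
lxc config set storage.lvm_vg_name lxd  # switch the daemon over to LVM
lxc launch ubuntu:16.04 c-lvm           # this container gets an LVM-backed rootfs
# The daemon is now managing a mixed {DIR, LVM} set of containers.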

brauner commented 7 years ago

So I managed to create a mixed-storage LXD instance {DIR, LVM} by doing the following: asciicast

So this means we should handle this case as well and migrate DIR-backed containers to LVM-backed containers on upgrade.

brauner commented 7 years ago

Actually, it might be possible to create these mixed-storage instances in any {LVM, <valid-storage-type>} combination. Switching back from LVM should not be possible, since LXD will refuse.

brauner commented 7 years ago

Yup, reproduced it by creating a zfs pool first, creating a container, and then switching to lvm on a 2.8 instance.

brauner commented 7 years ago

I've got code coming that will handle this.

sbworth commented 7 years ago

@brauner I am afraid that it is even stranger than that, as the non-LVM container was clearly created long after many of the LVM containers. My best guess is that I created it on my other LXD server, which has only dir storage, and migrated it to the LVM-backed server. Normally this resulted in an LVM-backed container on the receiving end, but in this one case I ended up with a dir-backed container.

In preparation for the multi-storage support, was there a release that moved the storage setting from a global variable to a per-container variable, which my LVM-backed server might have picked up and acted on, but which also preceded the introduction of the new storage-pools area?

Since dir storage was the original, and simplest, storage format, my LVM-backed server, if it were at the exact right version when receiving the "lxc move foo sys2:foo", could have implemented the container-specific storage type by simply writing directly into /var/lib/lxd/containers/foo/rootfs.

Possible?

sbworth commented 7 years ago

Strike my last comment. Looking inside the sqlite tables, we see that the creation date timestamp was 1480368405, which converts to 2016-11-28; so it is in fact one of the oldest containers and probably predates the LVM conversion. It is expendable, so I am going to move it aside, delete its entries from lxd.db, and try @stgraber's latest patched LXD.
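
For reference, the timestamp conversion can be reproduced with GNU date (-u prints UTC):

date -u -d @1480368405
# Mon Nov 28 21:26:45 UTC 2016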

brauner commented 7 years ago

Well, I'm sending a patch that handles such upgrades.

brauner commented 7 years ago

So, OK, here's an upgrade from a pure-LVM, pre-storage-api LXD instance (without DIR containers in the mix): asciicast

And here's an upgrade from a pre-storage-api LXD instance which mixes LVM and DIR containers: asciicast

brauner commented 7 years ago

Note that converting a DIR container to an LVM container will not set the LV thin origin name, which you can see in lvdisplay output. This, however, shouldn't be a problem in terms of functionality.
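
A quick way to see which thin volumes carry an origin, assuming the volume group is named lxd as elsewhere in this report (lv_name, pool_lv, and origin are standard lvs report fields):

sudo lvs -o lv_name,pool_lv,origin lxd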

sbworth commented 7 years ago

I just ran the last LXD from @stgraber. It migrated all of the containers and renamed the volumes to containers_{foo}. It eventually bombed out on migrating the images, but I think that may again be due to residue left behind by the pre-LVM storage.

WARN[03-07|09:58:29] Database already contains a valid entry for the storage pool: lxd. 
WARN[03-07|09:58:29] Storage volumes database already contains an entry for the container. 
WARN[03-07|09:58:29] Storage volumes database already contains an entry for the container. 
WARN[03-07|09:58:29] Storage volumes database already contains an entry for the container. 
WARN[03-07|09:58:29] Storage volumes database already contains an entry for the container. 
WARN[03-07|09:58:30] Storage volumes database already contains an entry for the container. 
WARN[03-07|09:58:30] Storage volumes database already contains an entry for the container. 
error: Failed to run: lvrename lxd b5b03165de7c450f5f9793c8b2eb4a364fbd81124a01511f854dd379eef52abb images_b5b03165de7c450f5f9793c8b2eb4a364fbd81124a01511f854dd379eef52abb: Existing logical volume "b5b03165de7c450f5f9793c8b2eb4a364fbd81124a01511f854dd379eef52abb" not found in volume group "lxd"

These new image volumes were created (names truncated for presentation):

  LV                  VG   Attr        LSize   Pool   Origin   Data%  Meta%
  images_11fc...d172 lxd  Vwi-a-tz-- 300.00g LXDPool   1.85
  images_18e7...5533 lxd  Vwi-a-tz--  10.00g LXDPool   8.38
  images_457a...6595 lxd  Vwi-a-tz--  10.00g LXDPool   5.65
  images_543e...3d15 lxd  Vwi-a-tz--  10.00g LXDPool   8.59
  images_a570...e869 lxd  Vwi-a-tz--  10.00g LXDPool   13.85

and reflected in /var/lib/lxd/storage-pools/lxd/images as a bunch of empty directories:

root@sys2:/var/lib/lxd/storage-pools/lxd# ls -l images
total 24
drwx------ 2 root root 4096 Mar  7 09:58 11fc...d172
drwx------ 2 root root 4096 Mar  7 09:58 18e7...5533
drwx------ 2 root root 4096 Mar  7 09:58 457a...6595
drwx------ 2 root root 4096 Mar  7 09:58 543e...3d15
drwx------ 2 root root 4096 Mar  7 09:58 a570...3e869
drwx------ 2 root root 4096 Mar  7 09:58 b5b0...52abb

My /var/lib/lxd/images looks like it is full of cruft (again, filenames truncated for presentation):

root@sys2:/var/lib/lxd/images# ls -l
total 1448392
-rw-r--r-- 1 root root       844 Mar  3 21:37 11fc...d172
-rw-r--r-- 1 root root 153494344 Mar  3 21:40 11fc...d172.rootfs
-rw-r--r-- 1 root root 199032954 Jan 10 10:45 18e7...5533
-rw-r--r-- 1 root root       600 Jan  9 13:47 457a...6595
-rw-r--r-- 1 root root  98614308 Jan  9 13:47 457a...6595.rootfs
-rw-r--r-- 1 root root       596 Nov 24 23:36 543e...3d15
-rw-r--r-- 1 root root 130996352 Nov 24 23:36 543e...3d15.rootfs
-rw-r--r-- 1 root root 418665102 Nov 30 15:05 a570...e869
-rw-r--r-- 1 root root       816 Nov 24 15:42 b5b0...2abb
-rw-r--r-- 1 root root 125798024 Nov 24 15:42 b5b0...2abb.rootfs
-rw-r--r-- 1 root root       600 Nov 24 23:40 bfd1...f9f8
-rw-r--r-- 1 root root   2487084 Nov 24 23:40 bfd1...f9f8.rootfs
-rw-r--r-- 1 root root       596 Jan  7 17:51 d7c1...3cc6
lrwxrwxrwx 1 root root        73 Jan  7 17:51 d7c1...3cc6.lv -> /dev/lxd/d7c16c4fedd3308b5bffdb91f491b8458610c6115d37ace8ba4bcf5c29b23cc6
-rw-r--r-- 1 root root   2756944 Jan  7 17:51 d7c1...3cc6.rootfs
-rw-r--r-- 1 root root 285455980 Feb 14 13:28 e12c...eeed
lrwxrwxrwx 1 root root        73 Feb 14 13:28 e12c...eeed.lv -> /dev/lxd/e12c3c1aed259ce62b4a5e8dc5fe8b92d14d36e611b3beae3f55c94df069eeed
-rw-r--r-- 1 root root       596 Nov 24 23:39 ff52...3ac28
-rw-r--r-- 1 root root  65777664 Nov 24 23:39 ff52...3ac28.rootfs

I'm thinking that I may just need to move aside all of the images that have no reference to a logical volume and then run LXD again. Does that seem plausible to you guys?
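
A hedged sketch for spotting those, i.e. image tarballs in /var/lib/lxd/images with no matching <fingerprint>.lv symlink; run it with LXD stopped and after a backup, and treat it as illustrative only:

cd /var/lib/lxd/images
for f in *; do
    case "$f" in
        *.lv|*.rootfs) continue ;;          # skip the symlinks and split rootfs files
    esac
    [ -L "$f.lv" ] || echo "no LV reference: $f"
done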

Here is some formatted data from the lxd.db images table:

sqlite> select id,cached,fingerprint,filename,creation_date,last_use_date from images;
| id | cached | fingerprint | filename                                      | creation_date             | last_use_date                       |
|  2 |      0 | b5b0...2abb | ubuntu-14.04-server-cloudimg-amd64-lxd.tar.xz | 2016-11-09 00:00:00+00:00 |                                     |
|  3 |      0 | 543e...3d15 | lxd.tar.xz                                    | 2016-11-25 00:00:00+00:00 | 2016-11-30 21:37:31.630646328+00:00 |
|  4 |      0 | ff52...ac28 | lxd.tar.xz                                    | 2016-11-25 00:00:00+00:00 |                                     |
|  8 |      0 | a570...e869 |                                               | 0001-01-01 00:00:00+00:00 |                                     |
| 50 |      0 | d7c1...3cc6 | lxd.tar.xz                                    | 2017-01-07 00:00:00+00:00 | 2017-01-13 19:34:34.399231656+00:00 |
| 51 |      0 | 457a...6595 | lxd.tar.xz                                    | 2017-01-08 00:00:00+00:00 | 2017-01-09 18:48:43.133706794+00:00 |
| 52 |      1 | 18e7...5533 |                                               | 0001-01-01 00:00:00+00:00 | 2017-01-10 15:45:22.877654385+00:00 |
| 55 |      0 | e12c...eeed |                                               | 0001-01-01 00:00:00+00:00 | 2017-02-14 18:33:03.41803276+00:00  |
| 60 |      0 | 11fc...d172 | ubuntu-16.04-server-cloudimg-amd64-lxd.tar.xz | 2017-03-03 00:00:00+00:00 | 2017-03-04 03:47:50.051114525+00:00 |

Finally, the contents of the lxd.db storage_volumes table. Note that the images all come at the end, after the container names (partially obfuscated as above). I also see references to containers that were already deleted (even with foreign_keys=ON); a query sketch for spotting those follows the listing:

id|name|storage_pool_id|type
1|a.....3|1|0
2|a...4|1|0
3|a.......d|1|0
4|a.....w|1|0
5|a......t|1|0
6|b.....r|1|0
7|c.............e|1|0
8|d...|1|0
9|e......t|1|0
10|f......5|1|0
11|foo|1|0
12|h........w|1|0
13|h........w|1|0
14|j....2|1|0
15|l.........e|1|0
16|l....2|1|0
17|..0|1|0
18|..1|1|0
19|.......2|1|0
20|.......4|1|0
21|p......p|1|0
22|p........l|1|0
23|p.....t|1|0
24|p...........2|1|0
25|p..|1|0
26|s.............4|1|0
27|s...........e|1|0
28|test1|1|0
29|test2|1|0
30|u.....s|1|0
31|v...t|1|0
32|v....1|1|0
33|v...|1|0
34|w.....k|1|0
35|11fc1b1d39b9f9cd7e9491871f1421ac4278e1d599ecf5d180f2a6e2483bd172|1|1
36|18e7ed74d0d653894f65343afbc35b92c6781933c273943d882c36a5c5535533|1|1
37|457a80ea4720900b69e5542cea5351f58021331bc96e773e4855a3e2ce1e6595|1|1
38|543e662b70958f5b87f68b20eb0a205d8c4b14c41f80699e9a98b3b851883d15|1|1
39|a570ce23e1dae791e7b8b2f2bcb98c1404273e97c7a1fb972bf0f5835ac3e869|1|1
40|b5b03165de7c450f5f9793c8b2eb4a364fbd81124a01511f854dd379eef52abb|1|1
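
A hedged query for spotting those orphaned container rows, assuming lxd.db keeps a containers table with a name column as in LXD 2.x (type = 0 marks container volumes in the dump above):

sqlite3 /var/lib/lxd/lxd.db \
  "SELECT name FROM storage_volumes
    WHERE type = 0
      AND name NOT IN (SELECT name FROM containers);"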

OK. Enough for now.

brauner commented 7 years ago

Do

brauner commented 7 years ago

Ah, I see: the upgrade failure may well be due, as you correctly observed, to the fact that the image used to create the dir container is still around and does not exist as an LVM logical volume. The new upgrade code for mixed-storage LXD instances should handle this.

sbworth commented 7 years ago

Yes, those images exist and are dependencies of some containers. I was just putting together an orderly table view of the output of lvs:

| LV                          | VG  | Attr       | LSize   | Pool    | Origin             | Data% | Meta% |
|-----------------------------+-----+------------+---------+---------+--------------------+-------+-------|
| LXDPool                     | lxd | twi-aotz-- |   2.56t |         |                    |  3.92 |  2.13 |
| containers_......3          | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 21.34 |       |
| containers_....4            | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 53.18 |       |
| containers_.......w         | lxd | Vwi-aotz-- |  10.00g | LXDPool | images_18e7...5533 | 42.01 |       |
| containers_.......t         | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 10.28 |       |
| containers_......r          | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 13.78 |       |
| containers_...............e | lxd | Vwi-aotz-- |  10.00g | LXDPool | images_543e...3d15 |  8.77 |       |
| containers_.......t         | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 16.87 |       |
| containers_.......5         | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 31.73 |       |
| containers_...              | lxd | Vwi-aotz-- | 300.00g | LXDPool | images_11fc...d172 |  1.85 |       |
| containers_..........w      | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 19.22 |       |
| containers_..........w      | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 11.45 |       |
| containers_.....2           | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    |  9.85 |       |
| containers_...........e     | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 40.40 |       |
| containers_.....2           | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 98.77 |       |
| containers_..0              | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  1.94 |       |
| containers_..1              | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  1.93 |       |
| containers_.......2         | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 11.84 |       |
| containers_.......4         | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  1.88 |       |
| containers_........p        | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  1.85 |       |
| containers_..........l      | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  1.90 |       |
| containers_......t          | lxd | Vwi-aotz-- |  10.00g | LXDPool | images_543e...3d15 |  8.59 |       |
| containers_.............2   | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  1.95 |       |
| containers_..b              | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  2.11 |       |
| containers_...............4 | lxd | Vwi-aotz-- |  10.00g | LXDPool | d7c1...3cc6        |  3.42 |       |
| containers_.............e   | lxd | Vwi-aotz-- |  10.00g | LXDPool | images_457a...6595 |  5.71 |       |
| containers_....1            | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 12.15 |       |
| containers_....2            | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    |  9.62 |       |
| containers_......s          | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 69.05 |       |
| containers_....t            | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 12.34 |       |
| containers_.....1           | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  1.85 |       |
| containers_...1             | lxd | Vwi-aotz-- | 300.00g | LXDPool |                    |  1.88 |       |
| containers_......k          | lxd | Vwi-aotz-- |  10.00g | LXDPool |                    | 42.80 |       |
| d7c1...3cc6                 | lxd | Vwi-a-tz-- |  10.00g | LXDPool |                    |  2.95 |       |
| e12c...eeed                 | lxd | Vwi-a-tz-- | 300.00g | LXDPool |                    |  1.89 |       |
| f..........1                | lxd | Vwi-aotz-- |  40.00g | LXDPool |                    | 34.53 |       |
| images_11fc...d172          | lxd | Vwi-a-tz-- | 300.00g | LXDPool |                    |  1.85 |       |
| images_18e7...5533          | lxd | Vwi-a-tz-- |  10.00g | LXDPool |                    |  8.38 |       |
| images_457a...6595          | lxd | Vwi-a-tz-- |  10.00g | LXDPool |                    |  5.65 |       |
| images_543e...3d15          | lxd | Vwi-a-tz-- |  10.00g | LXDPool |                    |  8.59 |       |
| images_a570...e869          | lxd | Vwi-a-tz-- |  10.00g | LXDPool |                    | 13.85 |       |
| lphys-home                  | lxd | Vwi-aotz-- | 400.00g | LXDPool |                    |  2.82 |       |

brauner commented 7 years ago

Cool, yeah, those other image failures are caused by the images not being present as LVM logical volumes, since they were used to create dir containers. Thanks for your help!

sbworth commented 7 years ago

I'm glad that my pain could be of value. ;-)

sbworth commented 7 years ago

If I see:

-rw-r--r-- 1 root root     596 Jan  7 17:51 d7c16c4fedd3308b5bffdb91f491b8458610c6115d37ace8ba4bcf5c29b23cc6
lrwxrwxrwx 1 root root      73 Jan  7 17:51 d7c16c4fedd3308b5bffdb91f491b8458610c6115d37ace8ba4bcf5c29b23cc6.lv -> /dev/lxd/d7c16c4fedd3308b5bffdb91f491b8458610c6115d37ace8ba4bcf5c29b23cc6
-rw-r--r-- 1 root root 2756944 Jan  7 17:51 d7c16c4fedd3308b5bffdb91f491b8458610c6115d37ace8ba4bcf5c29b23cc6.rootfs

and absent any dir-backed containers, are the plain-file versions just leftovers that we can safely clear away? (Perhaps LXD should do the clearing to ensure database consistency, but I am thinking more narrowly of not missing any container dependencies.)

brauner commented 7 years ago

That depends on whether the corresponding image has an entry in the images database. If that is the case, then simply deleting the image from the folder will leave you with a stale entry for it in the db. As soon as @stgraber is around he can provide you with a pre-built binary based on my patches that will likely take care of this issue. If you can't wait that long, then do:

sbworth commented 7 years ago

I am content to wait for the updated LXD from @stgraber for now. I simply want to understand the dependencies as well as possible for future reference. I'm going to create a new issue with a suggestion for tweaking your handling of image/container updates with safety in mind. (You guys, of course, get to decide whether it makes sense.)

stgraber commented 7 years ago

Sorry about that, worked till 3am so woke up pretty late today :) Will build you a new binary now.

stgraber commented 7 years ago

@sbworth

Binary updated: https://dl.stgraber.org/lxd-3026 sha256: ad96813bb5ecc29dde483b48ee284682df3f128d7b1006f2c313300585970bdf

sbworth commented 7 years ago

@stgraber We have success.

Should I continue to run this LXD build, or should the 2.10.1 version now suffice until the next upgrade?

stgraber commented 7 years ago

Since the changes we made were limited to the upgrade code and you've now gotten past that upgrade, you can resume using 2.10.1. We'll have LXD 2.11 out later today with the fix.

stgraber commented 7 years ago

And I've merged that last batch of fixes from @brauner, so all the fixes needed to sort out this issue are now in the master branch. Closing this issue.

sbworth commented 7 years ago

Now running successfully with the 2.10.1 LXD in its post-migration state.

Thanks again @stgraber @brauner .