canonical / lxd

Powerful system container and virtual machine manager
https://canonical.com/lxd
GNU Affero General Public License v3.0

Single file image export for split images #13958

Open edlerd opened 2 months ago

edlerd commented 2 months ago

Required information

Issue description

The image export endpoint GET 1.0/images/:id/export is returning an inconsistent format:

For the 2nd case, I suggest packing the two files into one tar archive and streaming that archive as a single file download. The tar archive can be named after the image (as in the 1st case) and inspected by the user with standard archive tools.
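To make the suggestion concrete, here is a minimal Python sketch of packing two blobs into one tar archive. It is only an illustration, not how LXD implements or would implement this. The function name `pack_split_image` and the file names are hypothetical, and the archive is built in memory for brevity, whereas a server would stream it.

```python
import io
import tarfile


def pack_split_image(parts: dict[str, bytes]) -> bytes:
    """Pack named byte blobs into one uncompressed tar archive held in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in parts.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)  # tar needs the size before the data is written
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


# Hypothetical file names and contents, for illustration only.
archive_bytes = pack_split_image({
    "metadata.tar.xz": b"metadata bytes",
    "rootfs.squashfs": b"rootfs bytes",
})

# The result can be listed with any standard tar tool; here we use tarfile itself.
with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
    names = archive.getnames()
    rootfs_bytes = archive.extractfile("rootfs.squashfs").read()
```

Because the result is a plain tar file, the user could inspect it with `tar -tf` after download.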

To stay backwards compatible, the change in endpoint semantics would only take effect when a flag is set. I propose GET 1.0/images/:id/export?single-file-download.
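One detail worth noting about a value-less flag like this: many query parsers drop keys without a value by default. The sketch below shows how a handler could detect the flag in Python. The helper name `wants_single_file` and the example fingerprint `abc` are hypothetical, and this is not LXD code.

```python
from urllib.parse import parse_qs, urlsplit


def wants_single_file(url: str) -> bool:
    """Return True when the proposed single-file-download flag is present."""
    query = urlsplit(url).query
    # keep_blank_values=True is needed so a bare "?single-file-download"
    # (a key with no "=value") is kept instead of being silently dropped.
    params = parse_qs(query, keep_blank_values=True)
    return "single-file-download" in params


flag_set = wants_single_file("/1.0/images/abc/export?single-file-download")
flag_unset = wants_single_file("/1.0/images/abc/export")
```

Requests without the flag keep the current behaviour, which is what makes the change opt-in.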

Additionally, it would be nice if the image upload endpoint PUT 1.0/images will be extended, so it also accepts the single file tar export created above.

Steps to reproduce

  1. Start an instance from a remote image to ensure a remote image is present.
  2. Call the GET 1.0/images/:id/export endpoint on the now cached image. This will produce a multipart stream. If you look closely at the downloaded artefact (e.g. with less), you can see it consists of two files that are not packed in a way that is easy for the user to read.

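To illustrate what the downloaded artefact from step 2 looks like, here is a small Python sketch that lists the parts of a multipart body. The boundary, part names, and file names in the sample are assumptions made up for the example, not captured LXD output. The `email` parser is used only because it ships with Python; it suits small samples, not streaming real image data.

```python
from email import message_from_bytes


def list_multipart_parts(content_type: str, body: bytes) -> list[tuple[str, str]]:
    """Return (part name, file name) pairs found in a multipart body."""
    # The parser expects headers first, so prepend the response Content-Type.
    raw = b"Content-Type: " + content_type.encode("ascii") + b"\r\n\r\n" + body
    message = message_from_bytes(raw)
    if not message.is_multipart():
        return []
    parts = []
    for part in message.get_payload():
        name = part.get_param("name", header="content-disposition") or ""
        parts.append((name, part.get_filename() or ""))
    return parts


# Hypothetical sample body with two parts.
sample_body = (
    b"--example-boundary\r\n"
    b'Content-Disposition: form-data; name="metadata"; filename="meta.tar.xz"\r\n'
    b"Content-Type: application/octet-stream\r\n"
    b"\r\n"
    b"metadata bytes\r\n"
    b"--example-boundary\r\n"
    b'Content-Disposition: form-data; name="rootfs"; filename="image.rootfs"\r\n'
    b"Content-Type: application/octet-stream\r\n"
    b"\r\n"
    b"rootfs bytes\r\n"
    b"--example-boundary--\r\n"
)

found = list_multipart_parts(
    'multipart/form-data; boundary="example-boundary"', sample_body
)
```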
mas-who commented 2 months ago

Here are some observations from the LXD source code that may be helpful for this issue:

  1. At the client level, split image export is determined by the Content-Type header of the API response in the code. We can't use this approach, since it would require us to make an API call to start the image export.

  2. At the API level, the image export code checks if a .rootfs file exists at /var/lib/lxd/images/<fingerprint> (this path could change depending on the LXD_DIR env variable, but I'm just going to stick with the aforementioned image path for simplicity). If this condition is true, the result is a split image export. We can't really use this logic either, since the browser can't interact with the host file system in this way.

  3. After a relatively quick scan over the imagesPost code, I think this is the relevant section for storing an image in /var/lib/lxd/images/<fingerprint> depending on request type. It seems that when an image is created from a remote source i.e. copy from remote with imgPostRemoteInfo and copy from url with imgPostURLInfo, they use the ImageDownload function which attempts to store the image files at the aforementioned file path. I think this correlates to our assumption in the UI currently that only downloaded images will result in a split image export. That being said, the code in imagesPost is quite involved, so I may have missed something, please let me know what you think about this.

  4. Lastly, at the moment in the UI, we assume that an image downloaded from a remote source will have the field update_source in the response body when making a request to the GET /1.0/images endpoint. This field seems to exist for an image only if there is a related entry in the images_source table. Entries in this table are created using the CreateImageSource db function, which only gets called in the ImageDownload function mentioned in point 3 for downloading remote images.
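The heuristic from point 4 can be sketched in a few lines. This is an illustrative Python version, not the actual UI code. The helper name `is_probably_split` and the sample image records are hypothetical; only the `update_source` field name comes from the observations above.

```python
def is_probably_split(image: dict) -> bool:
    """Guess whether an image would export as a split image.

    Relies on the implied relationship described above: images downloaded
    from a remote source carry an update_source field.
    """
    return image.get("update_source") is not None


# Hypothetical records shaped like entries from GET /1.0/images.
remote_image = {
    "fingerprint": "aaa",
    "update_source": {"server": "https://example.invalid"},
}
local_image = {"fingerprint": "bbb"}

export_enabled = {
    image["fingerprint"]: not is_probably_split(image)
    for image in (remote_image, local_image)
}
```

As noted, this is a guess based on how the fields happen to be populated, so it would break silently if that relationship changed.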

Given the above observations, I think our current approach in the UI is fine (i.e. disable export for images that have the update_source field), assuming I haven't missed anything in the code. That said, it is definitely not an ideal approach, since it is based on an implied relationship between multiple code sections. Could things be simplified by returning the image format (split or unified) and/or whether the image came from a remote source in the response for GET /1.0/images?