qgis / QGIS-Enhancement-Proposals

QEP's (QGIS Enhancement Proposals) are used in the process of creating and discussing new enhancements for QGIS

Create QGIS Docker Official image on docker hub #142

Open doublebyte1 opened 5 years ago

doublebyte1 commented 5 years ago

Create QGIS Docker Official image on docker hub

Date 2019/04/15

Author Joana Simoes (@doublebyte1)

Contact joana at doublebyte dot net

maintainer @doublebyte1

Version QGIS 2.x, 3.x

Summary

Docker Official Images are a curated set of Docker repositories hosted on Docker Hub. New Docker users are encouraged to use the Official Images in their projects. These images have clear documentation, promote best practices, and are designed for the most common use cases. Advanced users are encouraged to review the Official Images as part of their Dockerfile learning process.

Official images are generally seen as a reference, and while it is not strictly mandatory, they are generally maintained by the upstream software authors.

Official images are different from regular images, because they have no namespace. For instance, the geonetwork official image can be pulled like this:

docker pull geonetwork

Currently, there are many QGIS docker images on github, including one released by the QGIS organization. However, up to this date there is still no official docker image for QGIS.

There are many advantages in having a QGIS official docker image:

This QEP aims to create a QGIS official image.

Submission Process

These are the steps in creating a Docker official image:

  1. Create the image code.
  2. Fork the [docker library official images](https://github.com/docker-library/official-images.git) repository.
  3. Add the QGIS official image to the library.
  4. Fork the [docker library docs](https://github.com/docker-library/docs.git) repository.
  5. Add the QGIS docs to the library.
  6. Prepare and submit pull requests for the official image and docs of QGIS.

Once the pull request is merged by the Docker team, the official QGIS image will be available on docker hub:

docker pull qgis

Documentation is automatically generated from the docs, and an automated build is set up to build the image from the repository.

Whenever we make changes in the QGIS image code (e.g.: adding a new version), we need to submit another pull request to the docker library repository.

From my experience, the initial process may take a while, as the image will be audited by the docker team (e.g.: security, best practices), and some changes may be required. Nevertheless, the maintenance of the image is usually a quick process.

Image Code

This is the most critical step in creating the docker official image.

As the repository needs to follow a certain structure, the best approach would be to create a repository on the QGIS organization. Something like:

https://github.com/QGIS/docker-qgis

If you create this repository and give me access, I can start working on the structure.

It would be good to start with an image that has already been tested. Ideally, the official image should not be built from a nightly build (or something similar), but instead it should be created from a release. Release versions are more stable and users are encouraged to use them, either to learn the application or in production environments. If releases are available from a package manager, or from a specific repository, it would be good to retrieve those in the Dockerfile. This approach should also result in a lighter image, which can be pulled in less time. If different major versions of QGIS are supported (e.g.: 2.x, 3.x), it would be good to also support them in the official images. Likewise, we can also provide different minor versions.
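
As a rough illustration of the package-based approach, a Dockerfile could look something like the sketch below. It assumes a Debian base image and the `qgis` package from Debian's own archive. An official image would more likely add a repository carrying current QGIS releases; that part is left out here because the repository and key setup are not specified in this proposal.

```dockerfile
# Sketch only: the base image and package source are assumptions.
FROM debian:bookworm-slim

# Install the packaged release in a single layer and drop the apt lists in the
# same RUN, so they never end up in the image.
RUN apt-get update \
    && apt-get install -y --no-install-recommends qgis \
    && rm -rf /var/lib/apt/lists/*

# Start QGIS Desktop by default.
CMD ["qgis"]
```

Running the desktop application this way still needs a display provided by the host, which is outside the scope of this sketch.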

Once we have a base Dockerfile, we can start working on any improvements regarding implementing the Docker best practices.

Documentation

The image documentation is automatically generated and follows a template on the docker library. This is an example of documentation for the geonetwork official image. We should think of additional flags for running the container which may be useful for the user (e.g.: mounting a directory on the host) and provide an example docker-compose file.

Proposed Next Steps

These are the proposed next steps:

  1. Create a repository for the docker image on the QGIS organization
  2. Choose a QGIS base image
  3. Decide on which version(s) we want to ship
  4. Work on preparing the image code to follow the docker library standards
  5. Prepare documentation
  6. Create pull requests

Further Considerations/Improvements

I am not sure if QGIS server should be on this image or on a separate image. Maybe we could discuss that.

Votes

(required)

vpicavet commented 5 years ago

Hi, First step would probably be a list of existing QGIS Docker files. There are quite a lot of them out there.

Then we should definitely describe the target user of this image: is it an end-user or a system administrator, what platform exactly should it run on, and which versions of docker do we want to support (this can differ between Linux distro versions and Windows)?

As version 2 is no longer supported, we should not have any container for it either.

And the server should definitely be in a specific container: it does not target the same users, and we do not want fat containers on the server side.

Note that we already have containers for QGIS server and build containers too here :  https://github.com/Oslandia/docker-qgis

Also for the testing environment : https://github.com/Oslandia/docker-qgis-ogc-cite

pcav commented 5 years ago

+1 Reducing noise here would be very good IMHO. Thanks.

rduivenvoorde commented 5 years ago

Just for some completeness, I understand that this is for a QGIS Desktop docker, but:

We have some dockers to build the docs: https://github.com/qgis/QGIS-Sysadmin/tree/master/docker/sphinx

We have a docker to handle stripe: https://github.com/qgis/QGIS-Stripe

We have a docker hub account: https://hub.docker.com/u/qgis

Planning to use https://github.com/qgis/QGIS-Documentation/blob/master/doctest.dockerfile to be able to test the python in the QGIS python cookbook.

Tim did a lot of original Docker experiments: https://github.com/kartoza/docker-qgis-desktop

m-kuhn commented 5 years ago

Great proposal, having an official Docker image and docs sounds great.

> Currently, there are many QGIS docker images on github, including one released by the QGIS organization. However, up to this date there is still no official docker image for QGIS.

A considerable amount of time has gone into creating these images. They are available for releases as well as for nightlies. Building them is integrated into continuous integration and an extensive README as documentation is available. These images are widely used already and guaranteed to be maintained by the QGIS developers because they are integrated into CI testing on the QGIS repository.

I.e. I think a lot of the requirements in here are already in place (shipping several stable release versions, maintained by upstream).

The main downside of these images is that QGIS is built within Docker and therefore it's (almost) inevitable to have a larger size because of the build dependencies which are impossible to keep separate.

However, if this last point is not critical (currently it's 1GB size) I would very much like to see us work on a single docker image instead of having the community maintain 2 different docker images ;).

There are certainly still plenty of things to do that you listed in this proposal which would be great to improve anyway.

Do you see any other downsides, which I didn't think of, to improving the existing images instead of starting a new project?

pcav commented 5 years ago

I would seriously avoid having the whole build stack in the image, partly because of size (of course, the smaller the better), more so because of security concerns.

m-kuhn commented 5 years ago

> more so because of security concerns.

how so?

pcav commented 5 years ago

Many pre-packaged exploits require functional compilers.

m-kuhn commented 5 years ago

The build deps can be removed from the system, so no longer available for usage or exploits. But due to the overlay2 file system they will still be visible in the size of the image.
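
To illustrate the layer behaviour described above, here is a minimal sketch, with `build-essential` standing in for the real QGIS build dependencies (an assumption made for brevity). Each `RUN` instruction creates its own layer, so removing packages in a later instruction hides them from the final filesystem but not from the image size. The stage names are hypothetical.

```dockerfile
# Two named stages, so either variant can be built with
# `docker build --target <stage-name> .` and the sizes compared.

# Variant 1: install and remove in separate layers.
# The layer that installed the compiler remains part of the image.
FROM debian:bookworm-slim AS separate-layers
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential
RUN apt-get purge -y build-essential \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*

# Variant 2: install and remove within one layer.
# Only the end state of the RUN is stored, so the compiler adds no size.
FROM debian:bookworm-slim AS single-layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential \
    && apt-get purge -y build-essential \
    && apt-get autoremove -y \
    && rm -rf /var/lib/apt/lists/*
```

In a real build, the compile step would have to sit between the install and the purge of the second variant, inside the same `RUN`.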

On the other hand, one thing that would be a big plus in terms of security and stability is that we can also ship better dependencies in a docker image. There are quite often issues which are solved in upstream libraries (sip/geos come to my mind) which are often not available in any distribution in a reasonable time.

But still out of curiosity, do you have a reference for "Many pre-packaged exploits require functional compilers."?

pcav commented 5 years ago

An old good practice. See e.g. https://www.assistanz.com/server-hardening-scripts-for-cpanel-2/

elpaso commented 5 years ago

I don't get the security concern for the desktop image: what's the idea here? To use the docker images for QGIS Server in production? If that's the goal, I feel that we should have a separate image for QGIS Server (you don't need to install all the desktop stuff on a server), and I agree with removing anything beyond the bare essentials; a sysadmin is supposed to be able to install extra packages if required.

haubourg commented 5 years ago

> These images are widely used already and guaranteed to be maintained by the QGIS developers because they are integrated into CI testing on the QGIS repository.

Hi, I think we need to stay in line with those efforts. How about adding deb package creation at the end of the build process, and installing the packages on separate clean image(s) dedicated to production use? This is exactly what is done at https://github.com/Oslandia/docker-qgis. Side note: the QGIS Server image does not include any web server; we use other images to run apache or nginx and get things clearly separated. How does that sound, @elemoine?

m-kuhn commented 5 years ago

> we use other images to run apache or nginx and get things clearly separated.

Combined with docker-compose examples to get people started easily that sounds like a good approach :+1:

elemoine commented 5 years ago

Hi all

As the main author of https://github.com/Oslandia/docker-qgis I am interested in participating in the effort of creating official QGIS Docker images. I like the idea of high-quality Docker images supported by the QGIS community, that people can use for executing QGIS Desktop and QGIS Server.

I agree with @vpicavet and @elpaso that we should create separate images for QGIS Desktop and QGIS Server. This is to keep the QGIS Server image as small as possible, and to make it more secure by including only the minimum necessary.

Also, in order to keep images as small as possible the images shouldn't include build dependencies. So if we want/need to compile QGIS as part of the image build process I'd suggest that we rely on a two-step build process, as mentioned by @haubourg, or rely on Docker's multi-stage builds functionality. But in my opinion we should first focus on building images for released versions of QGIS, using Debian packages fetched from https://qgis.org/debian/.
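
For reference, a multi-stage build could be sketched roughly as follows. This is an illustration under stated assumptions, not a working QGIS build: the toolchain packages are placeholders for a much longer dependency list, the source tree is assumed to be the build context, and the `/opt/qgis` install prefix is hypothetical.

```dockerfile
# Stage 1: full build environment. Nothing from this stage is shipped
# unless it is explicitly copied into the final stage.
FROM debian:bookworm AS build
# Placeholder toolchain only; the real dependency list is much longer.
RUN apt-get update \
    && apt-get install -y --no-install-recommends build-essential cmake \
    && rm -rf /var/lib/apt/lists/*
# Assumes the source tree is the Docker build context.
COPY . /src
RUN cmake -S /src -B /build -DCMAKE_INSTALL_PREFIX=/opt/qgis \
    && cmake --build /build \
    && cmake --install /build

# Stage 2: runtime image. It receives only the installed files, so the
# compilers and headers from stage 1 add nothing to its size.
FROM debian:bookworm-slim
COPY --from=build /opt/qgis /opt/qgis
# The runtime libraries would still have to be installed in this stage.
ENV PATH="/opt/qgis/bin:${PATH}"
```

Unlike removing packages in a later layer, this keeps the build dependencies out of the final image entirely.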

For QGIS Server I'd also vote for Docker images not including any web server. This is for system administrators to use the web server of their choice, which may be a web server instance that already exists in the infrastructure and that they already use for other services. And +1 with @m-kuhn on somehow providing a docker-compose example.

doublebyte1 commented 5 years ago

Hi all,

Nice to read all this feedback!

IMHO, having docker images as part of nightly builds is great for developers, but that is not really the goal of docker official images. Official images should match stable releases (or even LTRs), because they are aimed at people who just want to try QGIS, or maybe use it as another tool in their pipeline of analysis and may not necessarily be QGIS (or GIS) experts. Having a fat image of 1GB is really overkill for these use cases. If you have a look at other docker official images, for instance nginx, you will notice that often they download the package from a repository. Another advantage of not building QGIS within the image is that it is easier to maintain. We just need to update the name of the package, and we don't need to worry about updating dependencies or compiler flags (or even compilers). I think that those concerns belong to the upstream maintainers of the package itself, not the maintainers of the dockerfile.

Unfortunately the option of building a debian package and installing it locally would not work for official images, as they are built from code hosted on github (remember: the official images have strict rules!). We could certainly build these packages somewhere, upload them to a repository, and then call them within the image. However, I don't see a big advantage in this approach as opposed to downloading packaged releases, unless we want to support nightly builds.

Regarding the QGIS server image, would it be an option to build it based on the QGIS desktop image? If it is just a matter of configuration, we could use that image to provide the software and then add all the server side configurations (for instance exposing ports). That way we could provide two different images to accommodate the different use cases, but they would be much easier to maintain.

elemoine commented 5 years ago

> IMHO, having docker images as part of nightly builds is great for developers, but that is not really the goal of docker official images. Official images should match stable releases (or even LTRs), because they are aimed at people who just want to try QGIS

+1

and also people wanting to run QGIS Server in production.

> Another advantage of not building QGIS within the image is that it is easier to maintain. We just need to update the name of the package, and we don't need to worry about updating dependencies or compiler flags (or even compilers). I think that those concerns belong to the upstream maintainers of the package itself, not the maintainers of the dockerfile.

+1, fully agree

> However, I don't see a big advantage in this approach as opposed to downloading packaged releases, unless we want to support nightly builds.

+1. Nightly builds are not the priority in my opinion. What we want first is QGIS images based on released versions of QGIS.

> Regarding the QGIS server image, would it be an option to build it based on the QGIS desktop image?

Not really I think, because we want the QGIS Server image to be as small as possible.

However, we could base the QGIS Desktop and QGIS Server images on a common QGIS base image. In this way we can probably avoid duplication in Dockerfiles.
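
A common base could be sketched along these lines. For brevity the three images are shown as stages of one Dockerfile, although in practice the base would more likely be a separate published image. The stage names are hypothetical, `ca-certificates` is only a stand-in for whatever both images really share, and the `qgis` and `qgis-server` package names assume Debian's own archive.

```dockerfile
# Shared base: everything that both images need.
FROM debian:bookworm-slim AS qgis-base
RUN apt-get update \
    && apt-get install -y --no-install-recommends ca-certificates \
    && rm -rf /var/lib/apt/lists/*

# Server image: the base plus the server package only, and no web server.
FROM qgis-base AS server
RUN apt-get update \
    && apt-get install -y --no-install-recommends qgis-server \
    && rm -rf /var/lib/apt/lists/*

# Desktop image: the base plus the desktop application.
FROM qgis-base AS desktop
RUN apt-get update \
    && apt-get install -y --no-install-recommends qgis \
    && rm -rf /var/lib/apt/lists/*
CMD ["qgis"]
```

Each variant can be selected with `docker build --target server .` or `docker build --target desktop .`.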

Thanks a lot @doublebyte1 for opening this QEP!

doublebyte1 commented 5 years ago

> Not really I think, because we want the QGIS Server image to be as small as possible.
>
> However, we could base the QGIS Desktop and QGIS Server images on a common QGIS base image. In this way we can probably avoid duplication in Dockerfiles.

+1

m-kuhn commented 5 years ago

Currently QGIS server packages (i.e. the providers of it) are built WITH_GUI because it's the same package (qgis-common) that is shared with the Desktop package, this leads to a slightly increased dependency and library size footprint too (which could be optimized by compiling the server packages separately or by modularizing the code). It's no blocker whatsoever, just a friendly reminder.

nyalldawson commented 5 years ago

Just for reference -- I think there's also going to be a strong case for an image for the processing standalone tool described here: https://github.com/qgis/QGIS-Enhancement-Proposals/issues/140

(This would be built without gui, app, or server.)

elemoine commented 5 years ago

> Currently QGIS server packages (i.e. the providers of it) are built WITH_GUI because it's the same package (qgis-common) that is shared with the Desktop package, this leads to a slightly increased dependency and library size footprint too (which could be optimized by compiling the server packages separately or by modularizing the code). It's no blocker whatsoever, just a friendly reminder.

Yep, thanks for the reminder :-)

That being said, there already is a significant size difference between the "qgis server" and "qgis desktop" images I created: 378 MB (server) vs 704 MB (desktop).