qgis / QGIS-Enhancement-Proposals

QEP's (QGIS Enhancement Proposals) are used in the process of creating and discussing new enhancements for QGIS

QGIS Server and sysadmin #190

Open pblottiere opened 4 years ago

pblottiere commented 4 years ago

QGIS Server and sysadmin

Date 2020/06/24

Author Paul Blottiere (@pblottiere)

Contact blottiere.paul@gmail.com

Maintainer @pblottiere

Version QGIS 3

Summary

While we have several web clients on top of QGIS Server, the management of the server process itself is rather opaque. The topic has already been raised on several occasions ([0], [1] or informal discussions) and a developer meeting is expected soon, so it is likely to be discussed there.

The aim of this QEP is first to summarize the current situation and track the needs scattered throughout the community, and then to discuss the implementation.

The conclusion may be that we don't have anything to do on QGIS Server side (except some adjustments), but instead let integrators build applications on top of it.

Wishlist

Get an insight of the state of QGIS server:

Configure a QGIS Server on the fly:

Beyond that, there's clearly a need for simplicity and a willingness to have "something" working out of the box without another extra component that needs extra installation.

Proposed solution

These needs have to be discussed here and/or during the developer meeting because the implementation depends on the technical orientations we want to give to QGIS Server (what we want to do and what we DON'T want to do). In this discussion, the security aspect is important, especially when it comes to on-the-fly configuration of QGIS Server.

Several ideas naturally come up:

An important element to underline is the fact that in real world installations, several QGIS Server instances are running behind a load balancer. So how could we provide a way to manage their configuration in a unified way? And do we want to?

Python plugins

The first solution is to consider that we can already do everything we need thanks to Python plugins (modulo some API adjustments).

While retrieving internal information may be considered unnecessary (because a sysadmin already has access to it), it's pretty easy to implement. However, allowing on-the-fly configuration of QGIS Server this way is probably dangerous.

A web admin panel

Another solution is to add a native web admin panel alongside QGIS Server. While it seems a bad idea to tie QGIS Server code to a web application, a possibility is to provide a way to retrieve information and configure QGIS Server in a secure way (how? a dedicated socket?).

Then, we could provide a single reference client living in the qgis source code, closely tracking changes in master (but in a dedicated directory to have a clear separation between the map server part and the frontend world).

Another solution could be to create a dedicated repository for the web application, but it doesn't seem appropriate for the sake of simplicity.

A new request/service in qgis server core

Some work has already been done a few months ago in this way but has not been merged (cf the next PR).

Indeed, this PR adds a very valid GetServerSettings request but raises the fact that it can be done with a dedicated Python plugin. Do we really want to add in-core requests/services for something we can do with plugins?

While this PR focuses on retrieving internal information, configuring QGIS Server through a dedicated request/service also raises some questions:

A dedicated admin socket

Another solution (I just came up with this idea, it may be stupid ^^) is just to add a dedicated socket for all the admin aspects.

The socket may be configured to bind with the local interface only to avoid security issues, but it's at the sysadmin's discretion. This way, an integrator is able to retrieve information from a QGIS Server instance and configure its behavior in a secure way, from a web interface, a plugin or any process.

But while it seems possible with the new HTTP server (thanks @elpaso), it doesn't seem possible with an FCGI process.
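The local-only binding described above can be illustrated with a minimal sketch using only the Python standard library. It is not QGIS Server code: `AdminHandler`, `start_admin_server`, `query_status` and the `/status` path are hypothetical names chosen for the example, and the JSON payload is a placeholder.

```python
import json
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class AdminHandler(BaseHTTPRequestHandler):
    """Hypothetical admin handler: answers GET /status with a JSON document."""

    def do_GET(self):
        if self.path != "/status":
            self.send_error(404, "unknown admin path")
            return
        body = json.dumps({"status": "ok"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, fmt, *args):
        # Keep the sketch quiet; a real service would log somewhere useful.
        pass


def start_admin_server(host="127.0.0.1", port=0):
    """Start the admin listener in a background thread.

    Binding to 127.0.0.1 keeps the socket unreachable from other machines;
    port=0 asks the OS for any free port.
    """
    server = HTTPServer((host, port), AdminHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_status(server):
    """Query the admin socket the way a local web interface or plugin might."""
    host, port = server.server_address[:2]
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/status")
        return json.loads(conn.getresponse().read().decode("utf-8"))
    finally:
        conn.close()
```

Passing another `host` value would expose the socket on other interfaces, which is the part left to the sysadmin's discretion.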

Performance Implications

TODO.

Backwards Compatibility

TODO.

nyalldawson commented 4 years ago

Another solution is to add a native web admin panel with QGIS Server. I (we?) don't think it's a good solution but it's a "valid" technical solution.

My 2c: I'd love to see this happen. Lack of simple admin tools is certainly a step back when moving from something like ArcGIS server/geoserver to qgis server

pblottiere commented 4 years ago

I tried to summarize the situation.

Let me know what you think @andreasneumann @Gustry @m-kuhn @elpaso @jgrocha @rldhont @haubourg :)

pblottiere commented 4 years ago

Another solution is to add a native web admin panel with QGIS Server. I (we?) don't think it's a good solution but it's a "valid" technical solution.

My 2c: I'd love to see this happen. Lack of simple admin tools is certainly a step back when moving from something like ArcGIS server/geoserver to qgis server

@nyalldawson I also think it would be a good thing, but not tied to QGIS Server itself. I'd prefer to provide a way/an entry point to retrieve information and configure QGIS Server (in a secure way) which could then be used by a dedicated web app.

This way, we have a clear separation between the map server itself and the app/frontend world.

nyalldawson commented 4 years ago

This way, we have a clear separation between the map server itself and the app/frontend world.

Point taken -- but I think from an end-user perspective this tool should come available out of the box in a standard qgis server install. And we should have a single reference client living in the qgis source, closely tracking changes in master.

pblottiere commented 4 years ago

This way, we have a clear separation between the map server itself and the app/frontend world.

Point taken -- but I think from an end-user perspective this tool should come available out of the box in a standard qgis server install. And we should have a single reference client living in the qgis source, closely tracking changes in master.

I think I agree but I still have to reflect on that (especially on the "we should have a single reference client living in the qgis source" philosophy versus "let integrators build applications on top of it").

I'm going to update the core of the QEP accordingly because it leads to 2 fundamental questions:

Thanks @nyalldawson :+1:

nyalldawson commented 4 years ago

especially on the "we should have a single reference client living in the qgis source" philosophy versus "let integrators build applications on top of it"

I feel quite strongly that we should provide a reference implementation out of the box. (We could add a cmake switch to disable it if desired). I see it as very similar to the Wayland/Weston situation, where 3rd party applications (gnome, kde) can utilise the Wayland protocol for their own uses but there IS a comprehensive reference implementation available via Weston....

pblottiere commented 4 years ago

especially on the "we should have a single reference client living in the qgis source" philosophy versus "let integrators build applications on top of it"

I feel quite strongly that we should provide a reference implementation out of the box. (We could add a cmake switch to disable it if desired)

While I totally understand that point of view, I'm also influenced by the old-fashioned KISS philosophy.

Yes, an in-core integration provides stability and integrity, but I'm wondering about the underlying developer. If they work on a low-level PR and break something in the webapp part, they also have to take care of it to get their PR merged. Yes, it may be "normal", but it may also be a barrier for some low-level C++ developers. QGIS Server could be considered as a dependency and a smart continuous integration could raise warnings on the dedicated official web app repository accordingly. Then, the "webapp team" could take care of it.

That said, I'm just thinking out loud, I don't have a strong opinion for now :).

wonder-sk commented 4 years ago

Exciting to see this QEP! Better/easier introspection into QGIS server would be very useful (and configuration as well).

I would agree with Nyall - having a web interface would definitely help to make the use of QGIS server smoother.

It would be great if we could have a couple of status/debug/admin endpoints that would accept/return JSON data, with stable API promise, so developers can use them programmatically... and on top of that, have a built-in web interface which would use the same endpoints (i.e. no "private" APIs) just for the web interface.

Some bits from my wishlist:

andreasneumann commented 4 years ago

A QGIS server status and management console would be really appreciated and would take away the "blackbox" aspect we have now with QGIS server.

I agree with @nyalldawson and @wonder-sk - something that works out of the box and comes already activated with QGIS server would be appreciated. If it is a Python plugin, then it should be either enabled by default or very easy to activate and already be installed with QGIS server.

The setup of QGIS server together with one of the web clients is already complicated. If we add another extra component that needs extra installation, activation and configuration, that would be sad. Ideally, we should make the setup of QGIS server simpler and more transparent, not more complicated.

pblottiere commented 4 years ago

Hi @wonder-sk,

Thanks for jumping in the discussion :).

I would agree with Nyall - having a web interface would definitely help to make the use of QGIS server smoother.

Are you thinking of a web application integrated within the QGIS source code itself or in a dedicated repository?

It would be great if we could have a couple of status/debug/admin endpoints that would accept/return JSON data, with stable API promise, so developers can use them programmatically... and on top of that, have a built-in web interface which would use the same endpoints (i.e. no "private" APIs) just for the web interface.

I totally agree :+1:

Some bits from my wishlist:

  • "global info" endpoint - to show versions, contents of env variables, paths in QgsApplication, registered providers, any recent exceptions, basic system info (drives, cpus, memory, swap, system load)
  • "project info" endpoint - show details about map layers (especially if any of them failed to load), if there are any missing files (e.g. SVGs), missing datum shift grid files
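As a rough illustration of what a "global info" payload could contain, here is a sketch restricted to what the Python standard library can report. The function name and field names are assumptions made for the example, not an existing QGIS Server API, and QGIS-specific items (QgsApplication paths, registered providers, recent exceptions) are deliberately left out. Filtering environment variables on the `QGIS_` prefix is also just one possible policy.

```python
import json
import os
import platform
import sys


def collect_global_info(env_prefixes=("QGIS_",)):
    """Gather a JSON-serialisable snapshot of the running process.

    Only environment variables whose name starts with one of env_prefixes
    are included, so unrelated secrets are not exposed by the endpoint.
    """
    environment = {
        key: value
        for key, value in sorted(os.environ.items())
        if key.startswith(tuple(env_prefixes))
    }
    return {
        "python_version": platform.python_version(),
        "platform": platform.platform(),
        "cpu_count": os.cpu_count(),
        "executable": sys.executable,
        "environment": environment,
    }


if __name__ == "__main__":
    # Print the snapshot as the JSON document an endpoint could return.
    print(json.dumps(collect_global_info(), indent=2))
```

Keeping the result a plain dictionary means the same data can back both a JSON endpoint and a built-in web page, in line with the "no private APIs" idea.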

I'm going to update the QEP accordingly, thanks for your input.

wonder-sk commented 4 years ago

Are you thinking of a web application integrated within the QGIS source code itself or in a dedicated repository?

I would prefer one integrated in QGIS source tree - that way it can be built into the server and would not require extra installation steps to get things working.

yjacolin commented 4 years ago

I only read this quickly, but something important in today's web server infrastructure is a health check service that gives a reliable answer on whether the server is fine or not.
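A health check of this kind is commonly exposed as a tiny HTTP endpoint that returns 200 when everything is fine and 503 otherwise. The sketch below shows that pattern as a plain WSGI application; `run_checks`, `make_health_app` and the check names are hypothetical, and nothing here is tied to QGIS Server.

```python
import json


def run_checks(checks):
    """Run named check callables; a check passes when it returns without raising."""
    results = {}
    for name, check in checks.items():
        try:
            check()
            results[name] = "ok"
        except Exception as exc:
            results[name] = f"failed: {exc}"
    return results


def make_health_app(checks):
    """Build a WSGI app answering 200 when all checks pass, 503 otherwise."""

    def app(environ, start_response):
        results = run_checks(checks)
        healthy = all(value == "ok" for value in results.values())
        status = "200 OK" if healthy else "503 Service Unavailable"
        body = json.dumps({"healthy": healthy, "checks": results}).encode("utf-8")
        start_response(status, [
            ("Content-Type", "application/json"),
            ("Content-Length", str(len(body))),
        ])
        return [body]

    return app
```

A load balancer or orchestrator only needs the status code, while the JSON body gives a human some detail about which check failed.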

pblottiere commented 4 years ago

Hi @andreasneumann,

something that works out of the box and comes already activated with QGIS server would be appreciated

The setup of QGIS server together with one of the web clients is already complicated. If we add another extra component that needs extra installation, activation and configuration, that would be sad. Ideally, we should make the setup of QGIS server simpler and more transparent, not more complicated.

It seems pretty unanimous for now :). So I'm going to update the QEP to express this need of simplicity and the "out of the box" concept.

pblottiere commented 4 years ago

Are you thinking of a web application integrated within the QGIS source code itself or in a dedicated repository?

I would prefer one integrated in QGIS source tree - that way it can be built into the server and would not require extra installation steps to get things working.

Point taken :+1:

pblottiere commented 4 years ago

I only read this quickly, but something important in today's web server infrastructure is a health check service that gives a reliable answer on whether the server is fine or not.

Thanks @yjacolin, I'm adding this to the wishlist :).

andreasneumann commented 4 years ago

@pblottiere - one important question: In real world installations one often has several QGIS server instances running in parallel, but sharing the same configuration. But almost always they should be managed together centrally. So in my opinion we should be looking at a server cluster, not at individual instances - or both individual instances, but also a cluster.

Which leads back to the issue that we need a central caching mechanism shared by several QGIS server instances ...

pblottiere commented 4 years ago

In real world installations one often has several QGIS server instances running in parallel, but sharing the same configuration. But almost always they should be managed together centrally.

@andreasneumann Indeed, and that's the meaning of the question "How do we manage QGIS Server instances which are behind a load balancer?".

But it could be expressed more clearly, I'm taking a look.

elpaso commented 4 years ago

Very welcome proposal. I think that what we need to decide is what kind of product we imagine QGIS Server to be. Please have a look at: https://lists.osgeo.org/pipermail/qgis-developer/2020-June/061614.html if you want to join the discussion.

My personal vision is that QGIS Server library needs to:

We also need to provide a set of binary applications:

What is IMO out of scope for the project is an advanced full-featured client (think of Lizmap or G3Wsuite or QWC), unless it is mandated by the specifications, as it is for WFS3 for instance.

But I do think we would need to provide a better out-of-the-box experience. I personally like the way the WFS3 specification implements that with a landing page (you can have a look at this rough video of mine if you are not familiar with it: http://www.itopen.it/bulk/QGI%20Server%20Introduction.webm). There is also a very simple webgis client (but fully overridable through templates!!!)

So, what I would like to have is a landing page for the server, that responds to the / path (with content negotiation) with an HTML page.

This is a proof of concept, implemented as a QgsServerOgcApi in a Python plugin:

qgis-server-standalone
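The content negotiation mentioned for the landing page can be sketched in plain Python as follows. This is a simplified illustration and not the logic used by QGIS Server or by the proof of concept: it ranks the media types of the `Accept` header by their `q` value only, ignores the specificity rules of the HTTP specification, and the function names and the default list of offered types are assumptions.

```python
def parse_accept(header):
    """Parse an HTTP Accept header into (media_type, quality) pairs, best first."""
    entries = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        entries.append((media_type, quality))
    # sorted() is stable, so equal qualities keep the client's order.
    return sorted(entries, key=lambda entry: entry[1], reverse=True)


def negotiate(header, offered=("text/html", "application/json")):
    """Return the offered media type the client prefers, or None if none fits."""
    if not header or not header.strip():
        # No preference expressed: fall back to the first offered type.
        return offered[0]
    for media_type, quality in parse_accept(header):
        if quality <= 0:
            continue
        if media_type == "*/*":
            return offered[0]
        for candidate in offered:
            if media_type == candidate:
                return candidate
            # "type/*" matches any candidate sharing the "type/" prefix.
            if media_type.endswith("/*") and candidate.startswith(media_type[:-1]):
                return candidate
    return None
```

With this, a browser sending `text/html,...` would get the HTML landing page while a script sending `application/json` would get the same resource as JSON from the same / path.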

Also note that we already have in the server:

that means we already have all the foundation we need to build web clients/APIs/service on top of it.

--- back to the topic:

I'm +1 to offer APIs to provide introspection about a running server instance as long as:

We may also need a way to blocklist specific modules/services in the future.

vpicavet commented 4 years ago

Hi, thanks for raising the topic.

My opinion:

vpicavet commented 4 years ago

Also, I think we should start designing and documenting a full REST API as soon as possible for all these kinds of services. Not sure if there are existing APIs we could copy or be inspired by, but it would be a good first study (e.g. the geoserver admin API).

dmarteau commented 4 years ago

Hi,

My two cents:

IMHO, as stated by @andreasneumann: having a configuration admin panel in qgis server core will be useless in a cluster configuration

All the usages of qgis server I can see involve several instances of the service for scalability, so we need to be careful not to add features that will be defeated by actual usage.

andreasneumann commented 4 years ago

All the usages of qgis server I can see involve several instances of the service for scalability, so we need to be careful not to add features that will be defeated by actual usage.

Correct - timing wise it would make sense to first implement the shared caching mechanism, so we can later use the admin panel to monitor and manage it.

Perhaps the individual instances need a mechanism to report their (health-)state to a "higher-level" management process.

vpicavet commented 4 years ago

IMHO, as stated by @andreasneumann: having a configuration admin panel in qgis server core will be useless in a cluster configuration

The panel may be useless for real-world production-grade QGIS deployments, but an API endpoint for this would be great for building a higher-level cluster management application.

All the usages of qgis server I can see involve several instances of the service for scalability, so we need to be careful not to add features that will be defeated by actual usage.

I think the newcomer's use case with only one server for testing purposes is a good one. It allows for a better understanding of the server configuration and behaviour, before building a more complex application for production purposes.

dmarteau commented 4 years ago

I think the newcomer's use case with only one server for testing purposes is a good one

IMHO, this may be solved elegantly by including it as a service extension. Thanks to @elpaso, the server now supports REST API extensions, so it can be loaded or not at runtime like any other service.

dmarteau commented 4 years ago

@vpicavet

E/ let a higher-level application implement QGIS cluster management, aggregating API calls to individual QGIS servers.

Be careful: with a distributed infrastructure you will need to broadcast your request. Usually your system is behind a load balancer and runs on something like Swarm or Kubernetes, so you will need another channel for the configuration API endpoint.

vpicavet commented 4 years ago

@vpicavet

E/ let a higher-level application implement QGIS cluster management, aggregating API calls to individual QGIS servers.

Be careful: with a distributed infrastructure you will need to broadcast your request. Usually your system is behind a load balancer and runs on something like Swarm or Kubernetes, so you will need another channel for the configuration API endpoint.

I do not really get your point, could you elaborate? The higher-level application should be GETting (or POSTing) information from/to the QGIS Server cluster nodes, so it does indeed have to know about the nodes and how to http-request them. Is that the problem?

As an alternative, a push system would allow each QGIS server node to advertise itself to the cluster management app (identified by a configuration option in each QGIS Server node), and push information regularly.
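The push alternative can be pictured as a small registry on the cluster management side that records heartbeats and forgets nodes that stop reporting. This is only a sketch under assumptions: `NodeRegistry`, its methods, the 30 second time-to-live and the shape of the reported `info` dictionary are all hypothetical, and the transport used by the nodes to push (REST call, message bus) is left out.

```python
import time


class NodeRegistry:
    """Hypothetical cluster-side registry fed by heartbeats pushed from server nodes."""

    def __init__(self, ttl_seconds=30.0, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        # The clock is injectable so expiry can be exercised without sleeping.
        self._clock = clock
        self._nodes = {}

    def heartbeat(self, node_id, info):
        """Record that node_id is alive, with whatever state it chose to report."""
        self._nodes[node_id] = (self._clock(), dict(info))

    def alive_nodes(self):
        """Return the nodes whose last heartbeat is no older than the TTL."""
        now = self._clock()
        return {
            node_id: info
            for node_id, (seen, info) in self._nodes.items()
            if now - seen <= self.ttl_seconds
        }
```

Because nodes announce themselves, the management app does not need a static list of instances, which matters when the number of replicas changes over time.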

dmarteau commented 4 years ago

I do not really get your point, could you elaborate? The higher-level application should be GETting (or POSTing) information from/to the QGIS Server cluster nodes, so it does indeed have to know about the nodes and how to http-request them. Is that the problem?

Let's consider the following basic configuration: you have deployed n instances of qgis server using supervisor with an fcgi socket dispatcher (or a wsgi dispatcher if you're embedding the server API). You are using an nginx proxy that calls your wsgi dispatcher: your qgis instances are load balanced but have a single accessible end-point.

If you want to use the REST API for configuring your instances, because of your single accessible end-point, you're only able to configure a single instance, because you are not able to broadcast the configuration request to all your backend instances.

The situation is much the same with kubernetes/swarm and any other setup where qgis servers are behind load balancers or another kind of request dispatcher. It is worse when the number of replicas is dynamic.

My point is that such a configuration API is only usable if you are able to broadcast requests (not load balance them), and that means you need to know your backend configuration exactly.

Such a situation may be solved easily when embedding the server API, because you can use a different channel for broadcasting and then pass the config request to the server directly. This is the solution used by py-qgis-server, where qgis server instances are behind 0MQ messaging and we use a specific channel for broadcasting reloading/reconfiguring messages.
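The difference between load balancing and broadcasting can be shown with a toy sketch. It does not reproduce py-qgis-server or 0MQ: `round_robin`, `broadcast` and the injected `send` callable are hypothetical helpers, and the transport is abstracted away on purpose.

```python
import itertools


def round_robin(backends):
    """Return a picker that mimics a load balancer: one backend per call."""
    cycle = itertools.cycle(backends)
    return lambda: next(cycle)


def broadcast(backends, send, payload):
    """Send payload to every backend and collect a per-backend outcome.

    `send` is an injected callable (backend, payload) -> result, so the
    transport (HTTP, message bus, ...) stays out of this sketch.
    """
    outcomes = {}
    for backend in backends:
        try:
            outcomes[backend] = ("ok", send(backend, payload))
        except Exception as exc:
            outcomes[backend] = ("error", str(exc))
    return outcomes
```

A configuration request routed through `round_robin` reaches exactly one instance, whereas `broadcast` needs the full list of backends up front, which is the "know your backend configuration exactly" constraint.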

dmarteau commented 4 years ago

As an alternative, a push system would allow each QGIS server node to advertise itself to the cluster management app (identified by a configuration option in each QGIS Server node), and push information regularly.

Yes, but that's another communication channel, which is my point.

vpicavet commented 4 years ago

If you want to use the REST API for configuring your instances, because of your single accessible end-point, you're only able to configure a single instance, because you are not able to broadcast the configuration request to all your backend instances.

OK, I think we are on the same page and agree. From my point of view, you would have a global endpoint for the cluster configuration app (which would not be QGIS Server, but a dedicated application). This cluster configuration application would indeed know about individual QGIS server instances (what you call "backend configuration") and call their API endpoints. This is what you call "broadcast", if I understand correctly.

And as you say, having a push mechanism would require a "communication channel", which could be a message bus interface or simple REST API.

My point is that it is probably better to keep a QGIS cluster configuration management system out of QGIS Server itself, in a separate application.

pblottiere commented 4 years ago

@dmarteau Thanks for clarifying the issue hidden behind my initial statement that "several QGIS Server instances are running behind a load balancer".

And it seems that we reached the same conclusion: we need another communication channel with QGIS Server. @vpicavet mentioned a "message bus interface or simple REST API" and I initially talked about a "dedicated admin socket" in the body of this QEP (which is not the way to go due to the FCGI constraint).

But now that we are globally on the same page, I'm going to write a dedicated QEP about monitoring. I already talked to @elpaso and @dmarteau (and quickly to @haubourg on IRC) about a technical solution on this aspect.

pblottiere commented 4 years ago

This QEP will now be split into several dedicated QEPs:

  1. Monitoring: https://github.com/qgis/QGIS-Enhancement-Proposals/issues/193
  2. Catalogue and landing pages: https://github.com/qgis/QGIS-Enhancement-Proposals/issues/192
  3. Shared cache: TODO