Closed batpad closed 3 months ago
Existing issue that discusses the details of what would be involved here a bit more: https://github.com/2i2c-org/infrastructure/issues/2985
From discussions with @yuvipanda, this is how we want to implement it: similar to the DesktopHandler served at /desktop/, we will create a QGIS-specific handler at /qgis/, and then write code that reads query parameters at that URL and opens QGIS with the appropriate options / configuration based on the query parameters passed in.
One can see how the current DesktopHandler is set up in jupyter-remote-desktop-proxy.
So, I think the idea here would be to create a jupyter-remote-qgis-proxy that wraps around jupyter-remote-desktop-proxy and creates a new handler for /qgis/.
As a first version, let's accept something like a dataset=... query parameter that takes the URL of a dataset, and start QGIS with parameters to open that dataset. We can then evaluate whether we need more complexity / the ability to pass in more query parameters (e.g. bbox).
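A minimal sketch of the query-parameter step, using only the standard library. The function name `dataset_from_request_url` and the rule of accepting only http/https URLs are assumptions for illustration, not something decided in this thread, and the real handler would live inside the server extension rather than in a standalone function.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumption: only plain web URLs are accepted for a first version.
ALLOWED_SCHEMES = {"http", "https"}


def dataset_from_request_url(request_url: str) -> Optional[str]:
    """Return the URL passed as ?dataset=..., or None if absent or unusable."""
    query = urlsplit(request_url).query
    values = parse_qs(query).get("dataset", [])
    if not values:
        return None
    dataset_url = values[0].strip()
    parts = urlsplit(dataset_url)
    # Reject anything that is not an absolute http(s) URL (e.g. file:// paths).
    if parts.scheme not in ALLOWED_SCHEMES or not parts.netloc:
        return None
    return dataset_url


print(dataset_from_request_url("/qgis/?dataset=https://example.com/dataset"))
```

Returning None for anything unexpected keeps the handler free to fall back to opening QGIS with no dataset loaded.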
Once we have created jupyter-remote-qgis-proxy, we can include it in the nasa-qgis-image and test.
A user should then be able to visit /qgis/?dataset=https://example.com/dataset to have QGIS open the dataset located at https://example.com/dataset.
@wildintellect may need your help here figuring out what kinds of datasets can be passed in and what flags we can use when starting QGIS to open the dataset passed in correctly.
@sunu let's go over this when we next chat and figure out next steps. @yuvipanda I think I've mostly grokked what we need to do here, but it's possible that it'd be helpful to have a quick chat with you before we kick this work off in earnest.
cc @geohacker
@batpad I suspect we'll want to rely on STAC, where a "collection" == "dataset" or an "item" == "dataset". A collection would probably need to have a web service that QGIS could use, whereas an item could have an asset defined, and we could filter based on GDAL/OGR supported formats. I need to go back and look at what the QGIS STAC plugin does, because hooking into that might be another approach.
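To make the "item with an asset, filtered by supported format" idea concrete, here is a rough sketch that picks asset URLs out of a STAC item by media type. The `OPENABLE_MEDIA_TYPES` set is a small illustrative subset chosen for this example (GDAL/OGR support far more formats), and the item below is made-up sample data, not a VEDA record.

```python
from typing import Dict, List

# Assumption: an illustrative subset of media types, not a complete GDAL/OGR list.
OPENABLE_MEDIA_TYPES = {
    "image/tiff; application=geotiff",
    "image/tiff; application=geotiff; profile=cloud-optimized",
    "application/geo+json",
    "application/geopackage+sqlite3",
}


def openable_asset_hrefs(item: Dict) -> List[str]:
    """Return hrefs of the item's assets whose media type is in the allow-list."""
    hrefs = []
    for asset in item.get("assets", {}).values():
        if asset.get("type") in OPENABLE_MEDIA_TYPES and "href" in asset:
            hrefs.append(asset["href"])
    return hrefs


# Made-up STAC item with one raster asset and one thumbnail.
item = {
    "type": "Feature",
    "id": "example-item",
    "assets": {
        "cog": {
            "href": "https://example.com/data.tif",
            "type": "image/tiff; application=geotiff; profile=cloud-optimized",
        },
        "thumbnail": {"href": "https://example.com/thumb.png", "type": "image/png"},
    },
}

print(openable_asset_hrefs(item))
```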
To get started I think we need a user story with a particular dataset, so we can work through the process. @j08lue can you think of a relatively simple, high-value dataset to try from VEDA?
> can you think of a relatively simple, high-value dataset to try from VEDA?
We often use the Nitrogen Dioxide for demo purposes: https://radiantearth.github.io/stac-browser/#/external/staging-stac.delta-backend.com/collections/no2-monthly
@batpad thanks for kicking this off! I think it would be useful for me to be in the first meeting as well, please include me in that?
@batpad For a bare bones implementation, can we clarify what a dataset is and what it involves for opening such a dataset with QGIS?
I tried the simplest example of opening a remote GeoJSON file through the QGIS CLI, and it doesn't quite work because QGIS assumes the file is a local file:
(base) jovyan@4a63b62ddbec:~$ qgis "https://raw.githubusercontent.com/datameet/maps/master/Country/india-osm.geojson"
Warning: QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-jovyan'
ERROR: Status 2: File /home/jovyan/https:/raw.githubusercontent.com/datameet/maps/master/Country/india-osm.geojson could not be found
No luck with a /vsicurl/ URL either:
(base) jovyan@4a63b62ddbec:~$ qgis "/vsicurl/https://raw.githubusercontent.com/datameet/maps/master/Country/india-osm.geojson"
Warning: QStandardPaths: XDG_RUNTIME_DIR not set, defaulting to '/tmp/runtime-jovyan'
ERROR: Status 2: File /vsicurl/https:/raw.githubusercontent.com/datameet/maps/master/Country/india-osm.geojson could not be found
Not sure if I'm using the wrong syntax to invoke qgis here. Can someone more familiar with QGIS verify please?
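One guess at why both error messages show `https:/` with a single slash: if the argument is treated as a local path and normalized, the double slash collapses. The sketch below only reproduces that symptom with Python's `posixpath`; it is not QGIS's actual code, and `as_local_path` is a hypothetical helper.

```python
import posixpath


def as_local_path(argument: str, cwd: str = "/home/jovyan") -> str:
    """Resolve an argument the way a naive 'everything is a local file' rule would."""
    # join() keeps absolute arguments as-is; normpath() collapses repeated slashes.
    return posixpath.normpath(posixpath.join(cwd, argument))


url = "https://raw.githubusercontent.com/datameet/maps/master/Country/india-osm.geojson"
print(as_local_path(url))
print(as_local_path("/vsicurl/" + url))
```

Both printed paths match the ones in the errors above, which is consistent with the URL never reaching GDAL as a URL.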
But overall it looks like we need 2 steps to get "open a dataset in QGIS" to work:
1. Find a command that starts QGIS with a given remote dataset loaded.
2. When a user visits jupyter.hub/qgis/?dataset=https://example.com/dataset, invoke the command from step 1 with the dataset.
Looks like a minimal implementation of step 2 is fairly straightforward: make a QGIS-specific fork of jupyter-remote-desktop-proxy. But we need to discuss a bit more about how to implement step 1. If directly opening QGIS with a remote URL doesn't work, then an alternative would be to generate a project file pointing to the remote dataset and open QGIS with the generated project file preloaded. Or we can try to automate the steps of loading the data into a new layer with PyQGIS.
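A rough sketch of the PyQGIS option: write a small startup script that adds the remote file as a vector layer, then start QGIS with `--code` so the script runs on load. The helper names, the script filename, and the choice of a `/vsicurl/` URI with the `ogr` provider are assumptions for illustration; this is not the code from any repository linked in this thread, and the generated script itself only runs inside QGIS.

```python
import tempfile
from pathlib import Path
from typing import List

# PyQGIS code to run inside QGIS at startup; {uri!r} and {name!r} insert quoted literals.
STARTUP_TEMPLATE = '''\
from qgis.core import QgsProject, QgsVectorLayer

layer = QgsVectorLayer({uri!r}, {name!r}, "ogr")
if layer.isValid():
    QgsProject.instance().addMapLayer(layer)
else:
    print("Could not load layer from", {uri!r})
'''


def write_startup_script(dataset_url: str, directory: str) -> Path:
    """Write a PyQGIS script that loads dataset_url as a vector layer."""
    uri = "/vsicurl/" + dataset_url  # assumption: let GDAL fetch the file over HTTP
    name = dataset_url.rstrip("/").rsplit("/", 1)[-1]  # layer name from the last path segment
    script = Path(directory) / "load_dataset.py"
    script.write_text(STARTUP_TEMPLATE.format(uri=uri, name=name))
    return script


def qgis_command(script: Path) -> List[str]:
    """Build the command that starts QGIS and runs the script on load."""
    return ["qgis", "--code", str(script)]


workdir = tempfile.mkdtemp()
script = write_startup_script("https://example.com/countries.fgb", workdir)
print(qgis_command(script))
```

Using `!r` in the template means the URL is embedded as a quoted Python literal, so an odd URL cannot break out of the string in the generated script.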
@sunu great work here!
@geohacker @wildintellect do you know if there's a good way to invoke QGIS with a remote URL as parameter so that it opens that dataset when it starts up? (I know the definition of "dataset" here can get really complex, but for a proof of concept, let's start with something simple like a GeoJSON)
Seems like a bug in the command line implementation. Using that file in the Vector loader works fine:
I'll need to search for alternatives; one I can think of is to inject the layer into a template QGS/QGZ project file, and open the project instead.
india.qgz.zip
I added .zip to the end so GitHub would take it. QGZ is a zip file... QGS inside is an XML file.
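Since a QGZ is a zip archive wrapping a QGS XML file, a template project can be unpacked and repacked with the standard library alone. The sketch below shows only that round trip; the one-element XML is a placeholder rather than a valid QGIS project, and the function names are made up for this example.

```python
import io
import xml.etree.ElementTree as ET
import zipfile


def pack_qgz(qgs_xml: str, qgs_name: str = "project.qgs") -> bytes:
    """Zip a QGS XML document into QGZ-style archive bytes."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(qgs_name, qgs_xml)
    return buffer.getvalue()


def read_qgs_root(qgz_bytes: bytes) -> ET.Element:
    """Return the parsed root element of the first .qgs file in the archive."""
    with zipfile.ZipFile(io.BytesIO(qgz_bytes)) as archive:
        qgs_names = [n for n in archive.namelist() if n.endswith(".qgs")]
        if not qgs_names:
            raise ValueError("no .qgs file found in archive")
        return ET.fromstring(archive.read(qgs_names[0]))


# Placeholder document standing in for a real template project.
placeholder = '<qgis projectname="india"></qgis>'
root = read_qgs_root(pack_qgz(placeholder))
print(root.tag, root.get("projectname"))
```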
I have a working prototype that combines a minimal project template and a pyqgis script to automate loading remote vector data files. Here's a quick demo:
https://github.com/NASA-IMPACT/veda-jupyterhub/assets/1142203/6c9ed91e-026e-4767-9e0a-c3a8dda806ab
I'll put the code in a repo once I clean things up a bit. cc @batpad
Woo hoo, great to see this in action, @sunu!
:star_struck: That's so cool, amazing work!
@sunu can you link the code? Also, we should file an upstream bug/enhancement with QGIS (https://github.com/qgis/QGIS/issues) about supporting URL-based data sources in the CLI. It might be worth jumping on a chat with the QGIS devs to figure out the best way to propose it.
I cleaned up the code a bit and uploaded it to https://github.com/sunu/jupyter-remote-qgis-proxy/
This is a Jupyter server extension that inherits from https://github.com/jupyterhub/jupyter-remote-desktop-proxy instead of forking it. Hoping this will be easier to maintain than a fork.
@wildintellect The relevant part of the code for opening QGIS is here: https://github.com/sunu/jupyter-remote-qgis-proxy/blob/772c016b413a0faae64110d7a147bd0cfadb2a3f/jupyter_remote_qgis_proxy/qgis/utils.py#L5
And to test it out, I have a branch on my fork of nasa-qgis-image that uses this server extension. To run this locally, you can clone the repo and run the following commands:
git clone git@github.com:sunu/nasa-qgis-image.git
cd nasa-qgis-image
git checkout qgis-proxy
docker build -t qgis .
docker run -it -p 8888:8888 --security-opt seccomp=unconfined qgis
We're deploying this to the hub: https://github.com/2i2c-org/infrastructure/pull/4299
Once ^ this is deployed, you should be able to test with a link like this:
You should be prompted to log in. You MUST select the QGIS image in the profile selection screen, and then hit Start Server. (We will be working to improve the automatic profile selection from URL parameters in the next quarter.)
Once your QGIS container starts, it should automatically load up the dataset specified in the URL, i.e. https://raw.githubusercontent.com/flatgeobuf/flatgeobuf/master/test/data/countries.fgb . This should work for common vector formats hosted on public URLs.
(Once the QGIS container is already running, subsequently opening a new dataset with a link similar to the above link will open it in the same container and not spin up a new container).
It should be reasonably trivial to add other data formats / anything that QGIS supports once we confirm this works well.
We have this deployed on the VEDA hub! Am going to close this issue and we'll open separate issues for:
Thanks much @sunu for your amazing work on this and @yuvipanda for all the guidance and support!
At a high-level, we should have a button in the VEDA UI on the dataset page that would allow a user to "Open the dataset in QGIS" - this would take the user to a QGIS instance running inside of VEDA hub and open the selected dataset in QGIS.
Existing work on the QGIS image and setting up default data sources is here: https://github.com/2i2c-org/nasa-qgis-image/issues
Related issue about pre-loading QGIS with access to Earth Data datasets is here: https://github.com/2i2c-org/infrastructure/issues/3479
We can break-down tasks and refine our approach here.
cc @yuvipanda @geohacker