ivangayton opened 2 months ago
This ties in with https://github.com/hotosm/drone-flightplan/issues/5 as the tools we decide on to build application will influence the options available to us for flight plan generation!
If we could do this entirely in PostGIS it would be really nice, plus portable.
I have been following developments of https://github.com/electric-sql/pglite for a while - essentially Postgres in a browser via WASM (it's not a hack, it's a fully featured Postgres without requiring a VM).
They added extension support recently, and currently have pgvector working for running machine-learning models on the client side.
It's only a matter of time until the PostGIS extension works too!
[!NOTE] This is a web-based solution however, so comes with all the caveats mentioned in the other issue.
I haven't had a chance to do a performance assessment, but if we are using GDAL we have lots of binding options available to us.
There are three options for loading the DEM I can think of:
1. Handle the DEM ourselves behind a web API: the user queries our API for the data they need and we return it.
2. Use a cloud-native format: the user samples only the required data from a much larger DEM.
3. Use a file, which means we need to distribute the file to users / make it downloadable. We could optimise this by storing a larger DEM, then having something like an AWS Lambda function process the download and cut out only the AOI requested by the user. I would say this is the least favourable.
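The cloud-native option might look something like this with GDAL's `/vsicurl/` handler, which fetches only the byte ranges it needs from a remote COG rather than the whole file. This is a sketch: the URL is illustrative, and the coordinate-to-pixel helper assumes a simple north-up raster (no rotation terms in the geotransform).

```python
def geo_to_pixel(gt, lon, lat):
    """Invert a north-up GDAL geotransform: (lon, lat) -> (col, row).
    gt is the 6-tuple from Dataset.GetGeoTransform()."""
    col = int((lon - gt[0]) / gt[1])
    row = int((lat - gt[3]) / gt[5])
    return col, row


def sample_cog(url, lon, lat):
    """Read a single elevation value from a remote COG via /vsicurl/.
    Requires the osgeo bindings; only the needed tiles are downloaded."""
    from osgeo import gdal  # imported here so the pure-maths helper stays dependency-free

    ds = gdal.Open(f"/vsicurl/{url}")
    col, row = geo_to_pixel(ds.GetGeoTransform(), lon, lat)
    return float(ds.GetRasterBand(1).ReadAsArray(col, row, 1, 1)[0, 0])
```

The same `geo_to_pixel` maths applies whether the raster comes from a URL, a local file, or `/vsimem/`.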
The main consideration we need to contend with here is connectivity:
A lot of these discussions have the same answer: if we have internet access, we can optimise the approach. If we don't have internet access, we have constraints to deal with and need workarounds.
If we have major concerns about connectivity in the areas people may wish to use DroneTM, then we need to accept that constraint and start making solutions that require no connectivity, even if it means reducing the user experience a bit.
As an example:
[!IMPORTANT] If we go the API-based route, then we should assess performance when choosing the technology.
[!IMPORTANT] If we go the client-side route, then a pure JavaScript / WASM approach may be preferred. We should again assess performance, plus total bundle size and initialisation time for the end user. No point having fancy tech if it takes forever to load on bad internet connections or doesn't work well on underpowered phones.
I'm tempted to go for pure PostGIS, but a couple of things are holding me back a bit.
So for the moment I'll probably continue down the GDAL road.
@spwoodcock and @nrjadkry I wonder if either of you have an idea on something that's frustrating me:
I basically want every step of the flight plan generation to be capable of accepting/writing either files or objects.
When we're working on a local computer, we probably want to read and write files. When we're building a Web app, we probably want to read and write database objects (strings, blobs, etc).
In both cases, we want to be able to inspect, edit, or even inject intermediate products. When I was writing the splitting algorithm for FMTM, this was useful: for example, you can inspect the building clusters, modify them, or even throw in your own clusters prior to generating the task boundaries. For DroneTM, I can imagine advanced users wanting to edit or inject intermediate products; maybe somebody wants to take extra camera angles only over a particular portion of the flight plan, or change the density of waypoints over an area with tall buildings. If we build this right, they'd be able to do so on their own by modifying the autogenerated waypoint GeoJSON prior to adding the elevations for terrain following and converting into KMZ, which is much more difficult (and, in this case, dangerous) to modify.
Certainly it seems like we'd want users of the Web app to be able to visualize (and maybe edit/inject) intermediate products, just as in FMTM you can use your own manual split, data extract, or form instead of the automated one.
I'd expect some users to want to do this in the Web app, and others to potentially want to do this offline on a computer (perhaps in a QGIS plugin?).
My usual approach to this kind of thing in Python would be to write utility functions that accept objects (and are therefore callable from other modules), and then have helper functions that handle the file I/O.
So, for example, I'd write a utility function `foo` that accepts and returns a dict, a helper function `bar` that parses a JSON file into a dict to pass to `foo`, and `baz` that writes the dict returned from `foo` to an output JSON file. So far so good: same utility logic, but an external module could call `foo` directly and pass it dict objects, while the `__main__` function run from the command line could accept input and output filepaths and use `bar` and `baz` to convert. Everybody is happy.
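As a minimal sketch of that pattern (the waypoint-tagging body of `foo` is a placeholder, not real flight-plan logic):

```python
import json
from pathlib import Path


def foo(plan: dict) -> dict:
    """Utility: operates purely on dicts, so other modules can call it
    directly. Placeholder logic: tag each waypoint with its index."""
    for i, feat in enumerate(plan.get("features", [])):
        feat.setdefault("properties", {})["waypoint_index"] = i
    return plan


def bar(path) -> dict:
    """Helper: parse a JSON file into a dict to pass to foo."""
    return json.loads(Path(path).read_text())


def baz(plan: dict, path) -> None:
    """Helper: write the dict returned from foo to an output JSON file."""
    Path(path).write_text(json.dumps(plan))
```

A `__main__` entry point would chain `bar` → `foo` → `baz`, while a Web app skips the helpers and calls `foo` on database-backed dicts directly.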
The problem:
GDAL GeoJSON drivers don't like to write to dicts or strings. They can be induced to do it, but they don't like it. For example, in the Python-GDAL Cookbook, they suggest ways to write GeoJSON files or strings, but the file approach gives you a nice FeatureCollection, to which you can easily add a global CRS and typed fields, but the string approach just gives you individual GeoJSON geometry snippets; you have to put them together yourself into a FeatureCollection if that's what you want. I suppose we could just use GDAL (or whatever spatial library) to generate the GeoJSON geometry and then drop back to dict manipulation to write everything else. Or literally write separate functions for the file vs object workflows.
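That hybrid approach could look like this: let the spatial library produce bare geometry dicts (e.g. via `json.loads(geom.ExportToJson())` in OGR), then assemble the FeatureCollection with plain dict manipulation. The function name and CRS handling here are mine, not from any library:

```python
def to_feature_collection(geometries, properties=None, epsg=4326):
    """Wrap bare GeoJSON geometry dicts into a FeatureCollection,
    adding a global named CRS and per-feature properties."""
    props = properties or [{} for _ in geometries]
    return {
        "type": "FeatureCollection",
        "crs": {
            "type": "name",
            "properties": {"name": f"urn:ogc:def:crs:EPSG::{epsg}"},
        },
        "features": [
            {"type": "Feature", "geometry": g, "properties": p}
            for g, p in zip(geometries, props)
        ],
    }
```

This keeps the spatial library responsible only for geometry, with everything else staying ordinary JSON-serializable data.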
OR...
GDAL provides a Virtual File System utility. Basically all of the same GDAL drivers we'd use to deal with files can be fed from these virtual file systems, so the modules don't even have to know whether they're dealing with files, in-memory objects, or cloud resources. It has support for things like S3 buckets, which could be very helpful for dealing with DEMs. I haven't dug into it yet, but I might take a look. It would definitely tie us to GDAL, which could be a hassle if we ever decide to shift to something else. What do you think?
A question for both of you: I suppose there are other ways to trick GDAL into writing to memory/objects instead of files on a disk. Maybe at the Python level, or the operating system level there are virtual filesystem hacks we could use. It seems like a nasty approach, but maybe less complicated than dealing with the GDAL virtual filesystems...
Anyway, curious if you have thoughts!
I'll give a more thorough response tomorrow!
But a quick response would be:
GDAL virtual file system is good - I would use that approach.
If we move away from GDAL at some point, it's easy to manage as you describe: 1) parsers from file to memory object, 2) API endpoint to memory object, 3) direct usage of memory objects.
I personally think the Python bindings for GDAL are a bit crap. It's also possible to just use a subprocess to run the command line (although typically not advised, it's ok for this use case). If we use the virtual file driver then either approach is fine.
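A sketch of the subprocess route, with command construction split out so it can be tested without GDAL installed (paths are illustrative; `/vsimem/` and `/vsicurl/` paths also work here, since the CLI tools share GDAL's virtual file systems):

```python
import shutil
import subprocess


def build_ogr2ogr_cmd(src: str, dst: str) -> list:
    """Build an ogr2ogr invocation converting any OGR source to GeoJSON."""
    return ["ogr2ogr", "-f", "GeoJSON", dst, src]


def run_ogr2ogr(src: str, dst: str) -> None:
    """Run the conversion, failing loudly if GDAL's CLI isn't available."""
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr not found on PATH")
    subprocess.run(build_ogr2ogr_cmd(src, dst), check=True)
```

Keeping the command builder pure makes the "typically not advised" subprocess layer thin and easy to swap out later.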
Based on discussion yesterday we decided:
- To go for a simple approach using Python Shapely for now (we have an intern working on it).
- To write standard Python wherever possible, making it easy to port to JS if needed (we also have Turf as a substitute for Shapely).
In the long run:
- Using GEOS is probably most suitable for our requirements.
- We can use it in WASM either via GeoRust or PGLite PostGIS (preferred, if available).
- Servers can use PostGIS.
- If PostGIS is used, we also have GDAL available for the DEM sampling (using the COG driver).
Regarding point 2 of DEM Considerations here: https://github.com/hotosm/drone-flightplan/issues/6#issuecomment-2278094800 (about cloud optimised DEM).
Creating the DEM in a cloud optimised format is likely the way to go, making it available from API endpoints, but also (and most importantly) directly from the frontend without requiring an API call.
Two of the best options for doing this:
This format is awesome!
Here is an example: https://protomaps.github.io/PMTiles/examples/maplibre_raster_dem.html It uses a 30GB PMTiles DEM hosted online, which contains 350,000 Terrarium RGB-encoded terrain + bathymetry tiles - 10 zoom levels for the planet.
More details on the format / spec here: https://www.mapzen.com/blog/terrain-tile-service
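For reference, decoding a Terrarium-encoded RGB pixel to metres is trivial in any language; per the Mapzen spec linked above, it's a one-liner:

```python
def terrarium_to_elevation(r: int, g: int, b: int) -> float:
    """Decode one Terrarium RGB pixel to elevation in metres:
    elevation = (R * 256 + G + B / 256) - 32768."""
    return (r * 256 + g + b / 256) - 32768
```

This is part of why the format works from any platform: no geospatial library is needed on the client, just tile fetching and pixel arithmetic.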
In the long run we should probably be converting our DEMs to this format:
[!NOTE] This requires being online of course.
But the likely way around this is to have a "Go Offline" mode where the user can download two PMTiles files:
- one of optical imagery for navigation
- one of DEM data for flight plan generation.
[!IMPORTANT] Also note this method makes the DEM data available from any platform (Python, JavaScript, Mobile), without having to worry about using geospatial libraries to access the data.
We do the DEM processing once, then simply query the data with specific URL params.
This makes a ton of sense. I'll switch all of my work on DroneTM away from GDAL and toward whatever flavor of GEOS (Shapely, PostGIS, etc.).
I hacked up a module to add elevation to waypoints from a GeoJSON by sampling a GeoTIFF Digital Elevation Model (pull request #4), and it was a bit of a bear. Not because it's inherently hard (it's not), but because I wanted to do it in a really robust way.
I didn't deal with using in-memory objects; what I did depends on access to a local filesystem. That's fine for local use (and since it's pure OSGeo, it would lend itself to a QGIS plugin, which might be quite a nice way to distribute a flight-planning module), but it's not great for a Web app, which would normally want to deal with a database rather than a filesystem.
I'm sure that's manageable; obviously there are ways to store both GeoJSON and GeoTIFF objects as database blobs, but it got me thinking:
What's the best framework to actually generate flight plan geometry and files?
Love to hear your thoughts.