knutole opened 8 years ago
Hi @strk, welcome to the first day of chaos! :)
I've added a strk label to some of the issues I'd like you to help out with. Feel free to look around at everything else too, of course. I'm a bit unsure as to how this will go forward, so I think it's best we just jump in and see how it goes. I guess you'll take on the things you think are most complicated, and whatever you think we can manage, you'll let us know. Since you'll be with us only two days a week, we can work from your recommendations on the other days.
Also, if you have any input on how best to manage a work process like this, don't hesitate to bring it forward.
I guess the first days will go to getting familiar with what's going on in the code, etc. Also, there is some setup to do:

- Our code lives in /var/www/wu/, /var/www/pile/ and /docks/.
- We're using .rmate for changing files over ssh. This means that we simply do rsub api.geo.js and it pops up in our Sublime. I'm not sure what kind of setup you prefer, but we can talk about it tomorrow.
- The wu code is in /var/www/wu/ and /var/www/wu/public/, and our tile server (pile) is located in /var/www/pile/.
- Our Docker setup lives in /docks/, where /docks/dev/ contains the "run-script", and /docks/build/ contains the different build-files for the containers.

I think you mentioned you haven't worked with Docker before. It's very simple, however: Docker images are built from a recipe, aka a Dockerfile. Docker containers are instances of such images. For example, I've made a Docker image called systemapic/ubuntu, which is simply a stripped Ubuntu image. If I want to run this image and do something with it, I have to create a container based on the image. The container is ephemeral: when I'm done with it, it can simply be deleted, and so on.
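A minimal sketch of that lifecycle (the container name scratch is made up for illustration):

```bash
docker run -it --name scratch systemapic/ubuntu /bin/bash   # new container from the image
docker ps -a                                                # the exited container sticks around
docker rm scratch                                           # ephemeral: just delete it
```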
Anyway, our setup consists of 13 containers, some of which are only storage containers. These are run together, and for doing so there is docker-compose. docker-compose simply reads a docker-compose.yml file, listing the different containers that are being used, how they link to each other, which ports they have open, which image they're based on, and so forth. This is our docker-compose.yml file.
To restart everything, run ./restart.sh in /docks/dev/. That will shut down all containers, flush them, restart them, and put you into the live log from all containers. This is safe to do at any time.
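That flow maps onto plain docker-compose commands; a guess at roughly what the script boils down to (check /docks/dev/ for the real thing):

```bash
cd /docks/dev/
docker-compose stop     # shut down all containers
docker-compose rm -f    # flush: delete the stopped containers
docker-compose up -d    # recreate and start everything
docker-compose logs     # attach to the live log from all containers
```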
To build a single image, cd into e.g. /docks/build/postgis/ and run docker build -t strk/postgis ., which will build an image called strk/postgis based on the Dockerfile in the current folder. Or you can do ./build.sh in each folder, as a shortcut. Note, however, that the names given in the build.sh scripts are the ones we're actually using, so if you overwrite those names, we'll lose the original image. So better to use the strk/ prefix during debugging. The Dockerfile decides what's built, obviously.

Some Docker housekeeping:

- To list images: docker images
- To list running containers: docker ps
- To list all containers (incl. stopped ones): docker ps -a
- To delete a container: docker rm CONTAINER_ID (the container ID can be found with docker ps)
- To delete an image: docker rmi IMAGE_ID

See the docker-compose.yml file in /docks/dev/ for how containers are run together.

The code is shared with the host: /var/www/wu, for example, is on the host but is the code that's being run inside the container. This is for development; it makes it easier to change the files. We're using nodemon, and it works well. So you don't have to go inside the container to change the code in /var/www/wu and /var/www/pile.
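Under the hood this is just a bind-mounted volume. A rough illustration, not our exact run command (the image name systemapic/wu is an assumption; compose sets this up via the volumes: entries):

```bash
# Mount the host folder over the container's code folder, so nodemon
# inside the container restarts on every edit made on the host:
docker run -d -v /var/www/wu:/var/www/wu systemapic/wu
```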
I think that should cover day one. Talk soon, ciao.
Dense lecture, thanks for the 101 (maybe useful to turn it into a wiki page or a file in the docker-systemapic repo). I guess it'd be worth learning to build and run those dockers on my local machine, don't you think? I'll be trying that while waiting for those credentials to get to me (PS: for a secure channel you can look up my pgp key on common keyservers, Key fingerprint = 459E B3A5 E7C5 2ADE 3F3F 68A2 D6C0 7DA4 AC56 2DAD)
Hi Sandro, here's an overview of our plans ahead, and areas where we'd like some help:
Our vision
Our stack
PostGIS Roadmap for Q1 2016
Best practices: We need to make sure our PostGIS backend setup makes sense and is scalable and secure. We're currently creating a new database for each user, and putting all their datasets in separate tables. We'd like to make sure that rendering speeds are as fast as possible, that loads are distributed, and that everything, in short, is optimized. Also, we need some way to control disk usage per user, and I saw there were some nice SQL scripts you've done for CartoDB.
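Since each user already has their own database, a first pass at per-user disk usage could read straight from the catalog; a sketch (connection details are placeholders):

```bash
psql -U postgres <<'SQL'
-- One row per user database, biggest first
SELECT datname, pg_size_pretty(pg_database_size(datname)) AS size
FROM pg_database
WHERE NOT datistemplate
ORDER BY pg_database_size(datname) DESC;
SQL
```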
Security: Harden our PostGIS setup and lock it down completely.
Docker: PostGIS is currently in a separate container in our Docker Compose setup. We want to be able to query any PostGIS container from any tile-server container (which should be easy, simply an extra endpoint), and we want some automated backup of data to make our containers fault-tolerant. Our entire setup needs to be scalable, so that scaling is simply a matter of adding more empty PostGIS containers to the swarm.
Rasters: We are doing a large raster project in Q1 with a client, where we'll implement support for rasters and calculations on rasters. The rasters in question are low-res satellite images of snow coverage, and we've found that vectorizing them and doing the calculations with vectors works well for this datatype. However, that rather avoids the problem of implementing real raster support in PostGIS, and we'd like to look at the possibilities of doing rasters the right way (if there is such a thing) in PostGIS. We're also handling more conventional rasters, i.e. as overlays, and currently we're simply tiling them up with a Python script. Possibilities for creating raster tiles on the fly from PostGIS would be interesting to look at as well.
Operations: 1) We'd like to implement the possibility of cutting rasters on the fly, i.e. drawing a polygon in the client (browser) and intersecting the raster with the polygon, cutting/cropping the raster. 2) The same for vectors. 3) Look at what other operations we can do on rasters and vectors, and implement a list of SQL scripts that is easily pluggable and expandable. Also, script hooks on import, etc.
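For 1) and 2), PostGIS seems to have the primitives already; a sketch of both crops, with hypothetical table/column names and a hard-coded polygon standing in for the one drawn in the browser:

```bash
psql -U postgres -d some_user_db <<'SQL'
-- 1) crop a raster to the drawn polygon
SELECT ST_Clip(rast, ST_SetSRID(ST_GeomFromGeoJSON(
  '{"type":"Polygon","coordinates":[[[10,59],[10,60],[11,60],[11,59],[10,59]]]}'), 4326))
FROM rasters;
-- 2) the same crop for vectors
SELECT ST_Intersection(geom, ST_SetSRID(ST_GeomFromGeoJSON(
  '{"type":"Polygon","coordinates":[[[10,59],[10,60],[11,60],[11,59],[10,59]]]}'), 4326))
FROM vectors;
SQL
```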
Client-side table: We'd like to implement a client-side table where users can view and interact with a PostGIS table.
SQL API: I guess all of this is best set up as an API for PostGIS. It should include creation of new tables, import of a range of formats, and import from other databases (ArcGIS, Oracle, PostGIS).
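Much of the import side can lean on stock tooling that an API endpoint would simply wrap; e.g. a shapefile import (file and database names are placeholders):

```bash
# Load a shapefile into a user's database, setting the SRID (-s)
# and creating a spatial index (-I) in one go:
shp2pgsql -s 4326 -I roads.shp public.roads | psql -U postgres -d some_user_db
```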
Vector tiles: We don't currently have full support for vector tiles from PostGIS (only some proof-of-concept), but we need this implemented asap. I'll be implementing support for vector tiles client-side. We need to look at simplification/clustering of polygons, points, etc. for vector tiles. This will go in our tileserver, pile.
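For the simplification part, PostGIS 2.2 already ships the primitives (TWKB comes up again under optimizations below); one possible shape of a tile query, with hypothetical names and a hard-coded bbox:

```bash
psql -U postgres -d some_user_db <<'SQL'
-- Clip to the tile envelope, simplify for the zoom level, ship as TWKB
SELECT ST_AsTWKB(
         ST_SimplifyPreserveTopology(geom, 0.001),   -- tolerance tuned per zoom
         5)                                          -- decimal digits to keep
FROM vectors
WHERE geom && ST_MakeEnvelope(10, 59, 11, 60, 4326); -- tile bbox
SQL
```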
Optimizations: We want to look at parallel PostGIS processing (https://github.com/gbb/ppppt), parallel shp2pgsql, and the TWKB format. Also tile creation from large datasets, large imports, etc. We're generally dealing with larger amounts of data than e.g. CartoDB, and need to make sure we're as optimized as possible to reduce server load and load times.
Mapnik: #layer::pseudo styling. We need to implement the pseudo-styling possibilities in our tile-server, making it possible to style several separate layers in the same bulk. Hopefully you have experience with this from Windshaft. Our current tileserver is at https://github.com/systemapic/pile, also run in a separate Docker container.

Projections: We need to be able to fetch data from PostGIS in a variety of projections (API)
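Reprojection on the way out is a one-liner in PostGIS, so the API mostly needs to accept a target SRID; a sketch (names and the SRID are placeholders):

```bash
psql -U postgres -d some_user_db <<'SQL'
-- Hand back the same geometries in web mercator instead of WGS84
SELECT ST_Transform(geom, 3857) FROM vectors;
SQL
```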
Way forward
These are the things we'd like to look at in Q1. Obviously there are things we haven't thought about, and we're looking forward to getting your input on every aspect of the setup. We'll see how quickly time goes, but I'm thinking we could work alongside each other: we work on some things and you on others, and we make sure everything is done in accordance with best practices and your guidance.
This is as much as I can say right now, I think. We need to discuss and get feedback on the road ahead. Please let me know how all this sounds to you, and feel free to use Issues and the repos as you see fit.
Repositories
A quick guide to repos: