mapbox / robosat

Semantic segmentation on aerial and satellite imagery. Extracts features such as: buildings, parking lots, roads, water, clouds
MIT License

rs_cover, rs_download, and rs_rasterize should accept multiple GEOJSON files #187

Open markmester opened 5 years ago

markmester commented 5 years ago

With the introduction of batched feature extraction (#148), the inputs to rs_cover, rs_download, and rs_rasterize are likely to be multiple GeoJSON files rather than one. It would be nice if these tools accepted multiple files as input. My current workflow is as follows:

CURR_DIR=$(shell pwd)

# Download full extract
download-extract:
    wget --limit-rate=1M $(PBF_DOWNLOAD_FILE) -O $(CURR_DIR)/data/extract.osm.pbf

# Cut out the area we are interested in.
cut-extract:
    docker run --rm -v $(CURR_DIR)/data:/osmium/data asymmetric/osmium-tool extract --bbox $(BBOX) data/extract.osm.pbf --output data/map.osm.pbf

# Extract geometries from an OpenStreetMap base map
extract-geometries:
    docker run -it --rm -v $(CURR_DIR)/data:/data --ipc=host --network=host mapbox/robosat:latest-cpu extract --type building /data/map.osm.pbf /data/buildings.geojson

# Generate all Slippy Map tiles which have buildings in them
GEOJSON = $(CURR_DIR)/data
generate-slippy-maptiles: $(GEOJSON)/*.geojson
    @for file in $^ ; do \
        bn=$$(basename $$file);\
        docker run -d -it --rm -v $(CURR_DIR)/data:/data --ipc=host --network=host mapbox/robosat:latest-cpu cover --zoom $(ZOOM) /data/$$bn /data/$$bn.tiles; \
    done

daniel-j-h commented 5 years ago

rs rasterize already handles batch rasterization into the same tiles:

https://github.com/mapbox/robosat/blob/97f83fb0e9ec2ff91754282093ed6d238a2e433c/robosat/tools/rasterize.py#L131-L133

You can run rs rasterize multiple times. If a rasterized tile is already present (e.g. a previous batch rasterized a building on the left side of a tile, and the current batch has a building on the right side of the same tile), we load the existing tile and rasterize the current batch onto it.

This allows for simple batch rasterization: run the same rs rasterize command once per batch, writing into the same output directory.
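A minimal sketch of the repeated rasterize runs described above, not a tested recipe. It assumes each batch `X.geojson` has a tile list `X.geojson.tiles` next to it (the naming used in the Makefile earlier in this thread) and that `rs rasterize` takes `--dataset`, `--zoom`, a features file, a tiles file, and an output directory; check `./rs rasterize --help` before relying on that. The function name `rasterize_batches` and the `RS` variable are made up for illustration.

```shell
#!/usr/bin/env bash
# RS is the command used to invoke robosat. "./rs" is an assumption;
# set RS=echo for a dry run that only prints the commands.
RS="${RS:-./rs}"

# Hypothetical helper: rasterize every batch in data_dir into one mask directory.
rasterize_batches() {
    local data_dir="$1" masks_dir="$2" zoom="$3" dataset="$4"
    local features tiles
    for features in "$data_dir"/*.geojson; do
        # Skip the unexpanded pattern when no GeoJSON file matches.
        [ -e "$features" ] || continue
        # Tile list for this batch, e.g. a.geojson.tiles (assumed naming).
        tiles="$features.tiles"
        if [ ! -e "$tiles" ]; then
            echo "skipping $features: no tile list at $tiles" >&2
            continue
        fi
        # Same output directory on every run, so a later batch is drawn
        # onto tiles that an earlier batch already produced.
        # The argument layout below is an assumption about the CLI.
        $RS rasterize --dataset "$dataset" --zoom "$zoom" \
            "$features" "$tiles" "$masks_dir" || return 1
    done
}
```

A call might look like `rasterize_batches data data/masks 19 config/dataset.toml`, where the paths and zoom level are placeholders.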

For context see https://github.com/mapbox/robosat/issues/25


The other tools can be called in a bash for loop; I'm not sure we should complicate their logic for this. But I understand it's no longer as simple as running one command after another.
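The for-loop approach could look roughly like the sketch below. The `cover` invocation mirrors the one in the Makefile above; the `RS` variable and the function names `cover_all` and `merge_tiles` are hypothetical. The merge step assumes the tile lists are plain text with one tile per line, so that `sort -u` is enough to deduplicate tiles shared between batches; the merged list could then be passed to a single download run.

```shell
#!/usr/bin/env bash
# RS is the command used to invoke robosat. "./rs" is an assumption;
# set RS=echo for a dry run that only prints the commands.
RS="${RS:-./rs}"

# Hypothetical helper: run cover once per GeoJSON file in data_dir.
cover_all() {
    local data_dir="$1" zoom="$2"
    local features
    for features in "$data_dir"/*.geojson; do
        # Skip the unexpanded pattern when no GeoJSON file matches.
        [ -e "$features" ] || continue
        # Writes a.geojson.tiles next to a.geojson, as in the Makefile above.
        $RS cover --zoom "$zoom" "$features" "$features.tiles" || return 1
    done
}

# Hypothetical helper: merge the per-file tile lists into one list,
# dropping tiles that appear in more than one batch.
# Assumes one tile per line in each *.geojson.tiles file.
merge_tiles() {
    local data_dir="$1" out="$2"
    sort -u "$data_dir"/*.geojson.tiles > "$out"
}
```

For example, `cover_all data 19 && merge_tiles data data/all.tiles`, with the directory, zoom level, and output name as placeholders.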

If you want to implement this, I'm happy to guide you along and review pull requests.