Closed: andrewljohnson closed this issue 8 years ago
...we provide a function where you could say "give me 40,000 random tiles from within these lon/lat bounding boxes, and label them using these labeller functions I want to try", and that would be fast.
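A rough sketch of what that sampling call could look like (function and parameter names here are hypothetical, not existing DeepOSM code; real tiles would be snapped to NAIP pixels rather than picked at arbitrary offsets):

```python
import random

def random_tiles(bboxes, n, tile_deg=0.001):
    """Sample n random tile bounding boxes from within the given lon/lat boxes.

    bboxes: list of (min_lon, min_lat, max_lon, max_lat) tuples
    tile_deg: tile edge length in degrees (illustrative placeholder)
    Returns a list of (min_lon, min_lat, max_lon, max_lat) tile boxes.
    """
    tiles = []
    for _ in range(n):
        min_lon, min_lat, max_lon, max_lat = random.choice(bboxes)
        # pick a random tile origin so the whole tile stays inside the box
        lon = random.uniform(min_lon, max_lon - tile_deg)
        lat = random.uniform(min_lat, max_lat - tile_deg)
        tiles.append((lon, lat, lon + tile_deg, lat + tile_deg))
    return tiles
```

Each sampled tile box would then be handed to whatever labeller functions the experiment requests.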
(A "labeller" function is something that takes a lon/lat bounding box and returns some numpy array. Simple ones return 1:1 arrays of one of the RGBI bands; more complex ones are like has_center_road and its various permutations, or ones that map a 64x64 tile to a 4x4 binary has-road or has-tennis grid, etc.)
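Two toy labellers in that spirit (hypothetical sketches, not DeepOSM code; for a self-contained example they take already-fetched pixel arrays rather than bounding boxes):

```python
import numpy as np

def band_labeller(band_index):
    """Return a simple labeller extracting one RGBI band as a 1:1 array."""
    def labeller(tile_pixels):
        # tile_pixels: (height, width, 4) array of RGBI values
        return tile_pixels[:, :, band_index]
    return labeller

def downsampled_road_labeller(road_mask, out_size=4):
    """Map a 64x64 binary has-road mask down to an out_size x out_size grid."""
    h, w = road_mask.shape
    bh, bw = h // out_size, w // out_size
    blocks = road_mask.reshape(out_size, bh, out_size, bw)
    # a coarse cell is "has road" if any pixel in its block is road
    return blocks.any(axis=(1, 3)).astype(np.uint8)
```

The point is just that every labeller shares one signature, so the tile-sampling function can apply any mix of them to the same tiles.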
We could still cache it if we want (as is being discussed in https://github.com/trailbehind/DeepOSM/issues/30), though I think we can ditch all the NAIP-specific details and just save to NetCDF the arrays that go straight into TensorFlow, plus some metadata about the experiment if we want.
merging with other infrastructure issues
Putting the data in Postgres seems like a good mid-game/end-game move. Do this after we put up deeposm.org, want to scale, and/or want to provide a place for researchers to run arbitrary experiments.
Benefits include: