Closed njmattes closed 8 years ago
@njmattes is this table grid_data
stored at the current database ede
? I was trying to look at it.
Perhaps we could iterate over the bands. Or dump, if it is just a column...
@ricardobarroslourenco Looks like the database on the server is a couple steps behind the develop
branch. I've been testing locally today (was at a kids chess tourney all day). The netcdf_data
table on the server has a similar layout though.
I think the bands are tucked away inside the binary rast
column (by tile). Which is what's making things a bit difficult. I think we're trying to stick to the procedures in PostGIS right now, to retrieve the info from the rasters. But ST_PixelAsCentroids
doesn't actually return which band you requested. So you can't return all the bands this way, and then sniff out which ones you want.
Is it necessary to write some lower-level functionality to create a function that returns the rasters as eg EDE_PixelAsCentroids
with an integer column representing which band each centroid / value pair belongs to?
let's see...first, i'll bring the ede
db to the layout state right now, so don't use it right now. i need you to leave your db session(s) for that
Sorry about that—my sessions should be gone now.
Yeah. Me either (I've just dropped it)
we've basically got these options:
ST_PixelAsCentroids
in a for loop where in each iteration we use another band numberST_PixelAsCentroids
once to just get the coordinates + in a for loop to get only the valuesST_PixelAsCentroids
to get the coordinates + ST_DumpValues
to just get the values (for all specified bands which works since ST_DumpValues
allows us to specify multiple bands)more precisely:
SELECT val, ST_AsText(geom) FROM (SELECT (ST_PixelAsCentroids(rast, band_num)).* FROM grid_data where meta_id=1 and var_id=1) centroids;
in a for loop where band_num
runs over the band numbers / times we wantSELECT x, y, ST_AsText(geom) FROM (SELECT (ST_PixelAsCentroids(rast)).* FROM grid_data where meta_id=1 and var_id=1) centroids;
to get just the centroids and then
SELECT x, y, val FROM (SELECT (ST_PixelAsCentroids(rast, band_num)).* FROM grid_data where meta_id=1 and var_id=1) centroids;
to get the values in a for loop looping over band_num
SELECT x, y, ST_AsText(geom) FROM (SELECT (ST_PixelAsCentroids(rast)).* FROM grid_data where meta_id=9 and var_id=2) centroids;
to get just the centroids and then
SELECT (ST_DumpValues(rast)).* FROM grid_data where meta_id=9 and var_id=2;
to just get the values, but all at once (no for loop required here)i suggest we use the last one to avoid looping. of course we could wrap up the loop in a user-defined function but it's not fully clear if we can return that as a single table at the end + even if we can it's cumbersome + not as performant as the last option using ST_DumpValues
i strongly think.
speaking now of this last option, we need to know what the conventions of the output of the two queries are...
for the output
{{...},{...},{...},...}
of SELECT (ST_DumpValues(rast)).* FROM grid_data where meta_id=9 and var_id=2;
the convention is that it shows the values row-by row, i.e.
{{values of row1}, {values of row2}, {values of row3}, ...}
for the output
x | y | st_astext
----+----+----------------------
7 | 4 | POINT(8.25 63.25)
8 | 4 | POINT(8.75 63.25)
...
of SELECT x, y, ST_AsText(geom) FROM (SELECT (ST_PixelAsCentroids(rast)).* FROM grid_data where meta_id=9 and var_id=2) centroids;
the convention is that x = column
and y = row
(not the 'more intuitive' x = row
and y = column
)
understanding these conventions allows us to connect the values to the right (lat,lon)
's.
@legendOfZelda nice. I was thinking on this last option (without specifying a band) when posted before.
yes, that's what i thought you meant, cool :+1:
btw, the ede
db now has the most current schema.
Some more questions:
x = 1, y = 1
the top left corner of the data?ST_DumpValues
(always) ordered left-to-right, top-to-bottom? Otherwise there's no way to associate a 10 × 10 tile of values with a pixel. {{ NULL, ... }, .. }
entries. Isn't the default behavior of ST_DumpValues
not to return NULL
values? x
and y
from ST_PixelAsCentroids
I suppose we have to return the entire dataset.NULL
values from ST_DumpValues
do we also need to return NULL
values from ST_PixelAsCentroids
so the values align?x
and y
columns from ST_PixelAsCentroids
isn't it.The 10x10 tiles I think it is something related to an OGC compliance standard, no? When talking about the rotation order, this always comes on mind.
About the raster type, I think this goes into that discussion of data regularization that Raffaelle brought at the all hands meeting. I think this will need to be done prior to ingestion, and looking into some things such as the downscale methods that Joshua uses.
(Sorry by the weird answer, I'm typing at my phone)
just to get a more clear understanding of your questions above @njmattes:
(lat, lon, val)
, right? you don't need the indexes x, y
in the output from PostGIS into python, or do you?it took me a while to understand some of your questions above. the problem we're dealing with here (and which i didn't realize first) is that the ST_DumpValues
command above returns x, y
indexes multiple times (once for each tile), i.e. the logical coordinates x, y
are not global for the entire grid but w.r.t. individual tiles, i.e. local logical coordinates, that's why they appear multiple times.
similarly for the ST_PIxelAsCentroids
command above where the arrays returned are 10x10
, i.e. just tiles, not entire bands.
i am going to answer your questions above (with more appropriate PostGIS queries) but would be glad if you can clarify my two questions here first.
@njmattes @ricardobarroslourenco why would we want to have all timeframes of a raster inside python? why wouldn't we instead write out the raster with all its timeframes as a GeoTIFF as described here:
How to export raster from PostGIS enabled DB as GTiff with gdal_translate (it works, just tested it, see also 9.10. Raster Outputs
in the PostGIS raster reference for some output functions we probably overlooked a little) and then send that as an ordinary file to the user?
why would we ever want to hold all timeframes, which can be quite large, inside python memory?
PS: i didn't realize until now that ST_PixelAsCentroids
alone already does the job for a single timeframe. e.g. if we issue (in the current database)
select ST_AsText(geom), val from (select (ST_PixelAsCentroids(rast)).* from grid_data where meta_id=1 and var_id=1) centroids;
or
select ST_AsGeoJSON(geom), val from (select (ST_PixelAsCentroids(rast)).* from grid_data where meta_id=1 and var_id=1) centroids;
(ignore the NOTICE
and CONTEXT
messages and wait till the query has finished). we have what we need for the frontend, don't we? again, this is for a specific timeframe. if you want all timeframes that's still going to be trickier and i'm figuring it out. i'm just wondering whether we need all timeframes inside python memory in the first place instead of doing the above GeoTIFF approach.
this problem is now solved, see query (5) in #23
@ricardobarroslourenco @legendOfZelda: Is it possible to make
ST_PixelAsCentroids
return all bands from a raster? I've been playing with something like the following to return info from the database in a format that can be converted to JSON in the python middleware, so that's useful for drawing the GUI.This seems to work alright, but only gives a single band. It's possible to draw a single band, and then reissue a request (or multiple requests) for the rest of the bands.
Any thoughts about the best (and also reasonably fast) way to retrieve this? Bearing in mind that, as far as the GUI is concerned, we don't need to return the globe—only the pixels that fill the current viewport.