Open pnorman opened 6 years ago
Or not.
So the SQL I wrote has two calls of ST_AsBinary(way)
tegola issued this SQL to the server:
select
0::bigint as osm_id,
st_geometrytype(way) as way
from ne_ocean
where way && st_makeenvelope(-2.003750834e+07,2.003750834e+07,2.003750834e+07,-2.003750834e+07,3857)
union all
select
0::bigint as osm_id,
st_asbinary(way) as way
from ne_lake
where way && st_makeenvelope(-2.003750834e+07,2.003750834e+07,2.003750834e+07,-2.003750834e+07,3857)
LIMIT 1
Is Tegola trying to parse the SQL as text and modify it to issue a query?
Yes, it is trying to use regular expressions to do the !token!
substitution, and for some of the feature discovery, it tries to modify the SQL. It's rudimentary right now, and I have plans to implement a more robust SQL parser to enable us to be more intelligent about these transforms.
@ARolek is better with the SQL type stuff. He will have a better understanding.
Yes, it is trying to use regular expressions to do the !token! substitution, and for some of the feature discovery, it tries to modify the SQL. It's rudimentary right now, and I have plans to implement a more robust SQL parser to enable us to be more intelligent about these transforms.
My advice would be don't, because your SQL parser will not be robust. Wrapping a select statement to turn it into a subselect is safe, and that's only if you make sure to add a newline to end any comments. Almost nothing else is. !BBOX!
is only safe because it has special semantics by agreement and is unlikely to appear in the query otherwise.
This particular bug isn't related to token substitution, but type detection. I have a longer comment coming that breaks down the problems in great detail (perhaps too great).
It might be worth pairing to go over postgis.go, because there are assumptions in it that real-world (e.g. osm-carto, new WMF style, Mapbox Streets, and others) complex layer definitions violate.
Okay, I found the problem, and it's a regression since v0.3.2 when I last wrote some layers.
This is broken in a number of ways, which I'll try to explain.
The query is called on the whole world. Many queries aren't designed to be run like that if they're for higher zooms. The LIMIT 1 doesn't help later. As a common example, consider
SELECT
osm_id,
class,
st_asbinary(way) way
FROM roads
WHERE way && !BBOX!
ORDER BY class
This query would never be run at z0, but when run like this it select the entire table and orders it before returning one result.
Changing to an LIMIT 0
like Mapnik does helps but doesn't solve the problem. When you have joins, CTEs, function calls, and other more complex SQL it becomes difficult to guarantee that a large number of rows won't be fetched and then thrown away. But at least it's okay in most situations.
Investigate using POLYGON EMPTY
instead of a normal bbox.
SQL is case sensitive. Yes, in many cases it won't matter, but quoted identifiers are case sensitive. String literals will be mangled and if they're being cast, the query won't run.
There are other examples, and I'm sure there's lots I can't think of.
st_asbinary
is being replaced by st_geometrytype
.This is probably safe. Except that the query need not contain st_asbinary
. Or it could be using st_asbinary
in a different way in a subquery and it causes a parse error when you change a bytea to text.
This is the bug that was actually hit, but it could hit it a different way such as SQL like
SELECT
-- tegola requires ST_AsBinary
ST_AsBinary(way) AS way
...
The last line could have a comment on it, and if the TOML was written to not include an endline, it'll comment out the LIMIT 1
LIMIT 1
is used to look at a row.As mentioned above, this can be a problem and LIMIT 0 is better, but it still has problems. As part of reducing data volume a query could contain another LIMIT clause, and two limit clauses is invalid.
If a LIMIT must be used, wrap the SQL in a subselect so it is guaranteed to be valid. Don't forget extra newlines to get out of any comments at the end of the statement.
Geometry type is on a per-row basis, not a per-table or per-statement one. Each row could have a different type. This common with road-related layers to get ordering right where there are multiple types.
Recommendations
Change the strings.Replace call to replace everything. This should break less and fix the immediate regression.
Don't use st_geometrytype. Get rid of st_asbinary and use the type of the column
Don't make any assumptions about a layer having a geometry. They don't, unless stated by the user.
I like some of the ideas here (reading the type of the column instead of the ST_AsGeometryType). This is also related to #275
@pnorman you can work around this issue by setting the geometry_type
for the provider layer. Additional details in the docs: https://tegola.io/documentation/configuration/#provider-layers
With this config I get an error