gravitystorm / openstreetmap-carto

A general-purpose OpenStreetMap mapnik style, in CartoCSS
Other
1.53k stars 817 forks source link

Better maintainable whitelists #2452

Open nebulon42 opened 7 years ago

nebulon42 commented 7 years ago

I decided to open a separate issue for that. It is related to #2444 and an unmaintainable white list of shops was the reason that #2415 got merged.

So we need better maintainable whitelists. One idea would be to put them into the database. One then would only have to maintain the content of the tables and the SQL would be automatically adjusted. This of course opens up questions of how the rendering database gets updated with the white lists.

Basic proof of concept:

CREATE TABLE barrier_whitelist (feature TEXT PRIMARY KEY);

INSERT INTO barrier_whitelist VALUES
  ('chain'),
  ('city_wall'),
  ('embankment'),
  ('ditch'),
  ('fence'),
  ('guard_rail'),
  ('handrail'),
  ('hedge'),
  ('kerb'),
  ('retaining_wall'),
  ('wall');
        (SELECT
            way, COALESCE(historic, barrier) AS feature
          FROM (SELECT way,
            ('barrier_' || (CASE WHEN barrier IN (SELECT feature FROM barrier_whitelist) THEN barrier ELSE NULL END)) AS barrier,
            ('historic_' || (CASE WHEN historic = 'citywalls' THEN historic ELSE NULL END)) AS historic
            FROM planet_osm_line
            WHERE barrier IN (SELECT feature FROM barrier_whitelist)
              OR historic = 'citywalls'
              AND (waterway IS NULL OR waterway NOT IN ('river', 'canal', 'derelict_canal', 'stream', 'drain', 'ditch', 'wadi'))
          ) AS features
        ) AS line_barriers

In pgAdmin it works well, with Mapnik I have the problem that some initializing is done and I run into:

Postgis Plugin: ERROR:  column "way" does not exist
LINE 1: SELECT ST_SRID("way") AS srid FROM barrier_whitelist WHERE "...
                       ^
in executeQuery Full sql was: 'SELECT ST_SRID("way") AS srid FROM barrier_whitelist WHERE "way" IS NOT NULL LIMIT 1;'
  encountered during parsing of layer 'line-barriers' in Layer

I'm not sure if we can get around this.

Other ideas?

imagico commented 7 years ago

I'm not sure if we can get around this.

According to mapnik docs:

https://github.com/mapnik/mapnik/wiki/PostGIS

specifying an srid parameter for the postgis data source of the layer will avoid an extra database query for knowing the srid of the table.

Also specifying geometry_table might be intended for such cases - although the docs are not quite clear here.

If that does not help - well i guess you could always add a bogus geometry field to your table...

pnorman commented 7 years ago

Also specifying geometry_table might be intended for such cases - although the docs are not quite clear here.

Yes, if you specify the table name it will know what to query.

pnorman commented 7 years ago

Adding external SQL makes it more difficult for people to get started. It's been my experience that this can actually be a significant barrier to new people.

nebulon42 commented 7 years ago

Adding external SQL makes it more difficult for people to get started. It's been my experience that this can actually be a significant barrier to new people.

Yes, that is an important point. I'm wondering if there are better ways or if we have to live with some addition of complexity to move forward.

kocio-pl commented 6 years ago

Maybe using variables and/or storing lists in separate file(s)? That would make project.mml more readable.

kocio-pl commented 6 years ago

@nebulon42 Does CartoCSS allow using variables so they could contain such lists, so instead of:

('barrier_' || (CASE WHEN barrier IN (SELECT feature FROM barrier_whitelist) THEN barrier ELSE NULL END)) AS barrier,

we could use just:

('barrier_' || (CASE WHEN barrier IN @barrier_whitelist THEN barrier ELSE NULL END)) AS barrier,

?

matthijsmelissen commented 6 years ago

CartoCSS itself does not, but fortunately YAML supports this (see YAML anchors). I think we should replace all repeated data inthe yaml by anchors.

kocio-pl commented 6 years ago

I thought that CartoCSS preprocesses all the files. But particular solution is not important - that would be great to have such lists! My current biggest pain when editing project.mml is that... I don't have a 50" monitor and I have to scroll all of the time. :smile:

matthijsmelissen commented 6 years ago

Could you have a look if it works? YAML achors don’t look to hard to use.

kocio-pl commented 6 years ago

Sure, I'll try.

nebulon42 commented 6 years ago

@kocio-pl The statements are pure SQL and technically the YAML is only the layer specification that is then transformed to Mapnik XML. The CartoCSS part does only work in the style specification for that layer. I think YAML anchors like suggested by @matthijsmelissen are a better way forward for this than bending CartoCSS to parse the SQL.

kocio-pl commented 6 years ago

Unfortunately, using anchors is limited and the name of this feature reflects the difference:

https://stackoverflow.com/questions/4150782/use-yaml-with-variables#18877077

So one way would be to process YAML in addition to MSS to act like all the other variables, as asked, or to chop SQL queries into fields like (this might be not proper YAML code, it's just a visual example):

  list: &barrier_whitelist
    - gate
    - block 
  start_of_the query: ('barrier_' || (CASE WHEN barrier IN (SELECT feature FROM 
  whitelist: *barrier_whitelist
  rest_of_the query: ) THEN barrier ELSE NULL END)) AS barrier,

but that looks ugly and while it would be proper YAML, the SQL query would be hard to read/parse and should be concatenated anyway (and that is a simple case with just one whitelist - what to do with more of them?).

I think proposed solution is quite straightforward preprocessing: straight substituting strings with another strings in YAML, no matter if it contains SQL or anything else.

@nebulon42 What do you think about it now?

matthijsmelissen commented 6 years ago

@kocio-pl Thanks for investigating, didnt know it was that limited!

nebulon42 commented 6 years ago

It's a bit difficult. The YAML file is a layer definition. Filtering layers can be done with SQL, but there are also other types of layers: https://cartocss.readthedocs.io/en/latest/mml.html#datasource

I also wouldn't really call the substitution you suggest a variable. (Technical) people have quite clear expectations what a variable is and how it should work. And even existing variables in CartoCSS do not work that way. I also wouldn't want to extend the (at) notation to the layer file as we are changing domains here. And you have to think of a lot more: where to define them, could the value change or is it fixed once set, is overwriting possible, can you do concatenation, ... Overall my stance towards that is rather negative, but I acknowledge that this might be a valid approach to the problem.

Given the current state of CartoCSS as a language I don't see that happening. You know CartoCSS is just a one-man-show by now and it isn't my top priority. I had to invest significant resources to get 1.0 out and CartoCSS is on its best way to become legacy.