sosm / osmhistorydb-ch

ISC License
3 stars 2 forks source link

Data-Chart: Count not showing accurate data #11

Closed chnuessli closed 2 years ago

chnuessli commented 3 years ago

In the actual Potm - August Dashboard to defis there's an inaccurate count of the defi datasets. Right now, it shows 6.05k Defi's. On which data is this depending? Worldwide? It would be nice to make a chart, just for switzerland.

PS: I like superset vizualisation and your approach very much.

chnuessli commented 3 years ago

Addition: Overpass is showing 5824 Defis in Switzerland at the time of this comment

sfkeller commented 3 years ago

I can confirm that PotM https://potm.osm.ch/superset/dashboard/11/ is currently showing a count of 6.05k (= 6050), while Overpass https://osm.li/5pJ shows 5816.

The reason for this difference is that PotM is based on a full history database (see this repo) and the query we currently use for this chart currently counts all additions and does not subtract the deletions.

We have to investigate on this.

lbuchli commented 3 years ago

I've written some code which provides almost perfect results using the same full history database. At the time of writing:

You might want to add this to the wiki, but since I can't make a pull request for the wiki, I'll just paste it here:

New Code ``` SQL begin; ------------------------------------------------------------------------------- -- Cleaning -- ------------------------------------------------------------------------------- drop materialized view if exists defibrillator_changes_mv; drop materialized view if exists defbrillator_existing_mv; drop view if exists nwr_changes_v; ------------------------------------------------------------------------------- -- Data View -- ------------------------------------------------------------------------------- create view nwr_changes_v as ( select id, version, deleted, changeset_id, created, uid, tags, 'n' as type, lat, lon from nodes union all select id, version, deleted, changeset_id, created, uid, tags, 'w' as type, st_x(node.geom) as lat, st_y(node.geom) as lon from ways, lateral (select geom from nodes where id=(ways.nodes[(array_length(ways.nodes,1)+1)/2]) limit 1) as node union all select id, version, deleted, changeset_id, created, uid, tags, 'r' as type, st_x(node.geom) as lat, st_y(node.geom) as lon from relations, lateral (select geom from nodes where id=jsonb_path_query_first(members, '$ ? (@.type=="n") .ref')::bigint) as node ); ------------------------------------------------------------------------------- -- Changes Materialized View -- ------------------------------------------------------------------------------- create materialized view defibrillator_changes_mv as select distinct nwr.id, nwr.version, nwr.deleted, nwr.changeset_id, nwr.created, users.username, (nwr.tags ->> 'emergency') as potm_tag, nwr.tags, nwr.type, nwr.lat::decimal(10,7), nwr.lon::decimal(10,7) from nwr_changes_v as nwr left join users on nwr.uid = users.uid where (nwr.tags ->> 'emergency') = 'defibrillator'; ------------------------------------------------------------------------------- -- Count Materialized View -- ------------------------------------------------------------------------------- create materialized view defibrillator_existing_mv as select d.*, sum(d.nbr_added) over (order by d.day) as nbr_existing -- rolling sum of nbr_added from ( -- days: day and how many potm elements have been created minus how many have been deleted select date_trunc('day', c.change_timestamp) as day, sum(case when c.change_timestamp = c.created then 1 when c.deleted and c.change_timestamp = c.last_change then -1 else 0 end ) as nbr_added from ( -- changes: wheter the element has been potm, when the element was created, if it has been deleted, last change select nwr.id as id, nwr.type as type, nwr.created as change_timestamp, bool_or(nwr.deleted) over elements as deleted, min(nwr.created) over elements as created, max(nwr.created) over elements as last_change, bool_or((nwr.tags ->> 'emergency') = 'defibrillator') over elements as has_been_potm from nwr_changes_v as nwr group by nwr.id, nwr.type, nwr.created, nwr.tags, nwr.deleted -- make distinct window elements as ( partition by nwr.id, nwr.type ) ) as c where has_been_potm group by day ) as d; commit; ```
sfkeller commented 2 years ago

This PotM seems to work now properly: The dashboard shows the same amount as Overpass CH.