GIScience / ohsome-now-stats-service

This is the REST service for the ohsomeNow stats.
https://stats.now.ohsome.org/api/
GNU Affero General Public License v3.0

avoid counting contributions multiple times when using * wildcard filter #30

Open Hagellach37 opened 11 months ago

Hagellach37 commented 11 months ago

With the current implementation of the /stats endpoint, contributions can be counted more than once when the wildcard filter * is used.

When the wildcard filter is used, we could first group the values by contribution_id and only then derive sum(buildings) or sum(road_length_delta).
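A minimal sketch of the idea in Python, not the service's actual Kotlin/ClickHouse code. It assumes the table holds one row per contribution and hashtag, so a contribution with several matching hashtags shows up several times. The row layout, the sample data, and the function names `naive_stats` and `deduplicated_stats` are hypothetical.

```python
# Hypothetical rows: one row per (contribution, hashtag) pair.
rows = [
    {"contribution_id": 1, "hashtag": "missingmaps", "building_edit": 1, "road_length_delta": 0.0},
    {"contribution_id": 1, "hashtag": "missingmaps-kenya", "building_edit": 1, "road_length_delta": 0.0},
    {"contribution_id": 2, "hashtag": "missingmaps", "building_edit": 0, "road_length_delta": 12.5},
    {"contribution_id": 3, "hashtag": "hotosm", "building_edit": 1, "road_length_delta": 0.0},
]


def naive_stats(rows, prefix):
    """Sum over every matching row; contributions with several matching hashtags are counted repeatedly."""
    matching = [r for r in rows if r["hashtag"].startswith(prefix)]
    return {
        "buildings": sum(r["building_edit"] for r in matching),
        "road_length_delta": sum(r["road_length_delta"] for r in matching),
    }


def deduplicated_stats(rows, prefix):
    """Keep one row per contribution_id first, then sum."""
    by_contribution = {}
    for r in rows:
        if r["hashtag"].startswith(prefix):
            # The first matching row wins; the values are assumed identical across hashtags.
            by_contribution.setdefault(r["contribution_id"], r)
    unique = list(by_contribution.values())
    return {
        "buildings": sum(r["building_edit"] for r in unique),
        "road_length_delta": sum(r["road_length_delta"] for r in unique),
    }


print(naive_stats(rows, "missingmaps"))
print(deduplicated_stats(rows, "missingmaps"))
```

With the prefix "missingmaps" (standing in for the filter missingmaps*), contribution 1 matches twice, so the naive sum reports two buildings while the grouped variant reports one.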

https://github.com/GIScience/ohsome-now-stats-service/blob/ba1bedb773e672437426c2a8bca001e14f54329b/src/main/kotlin/org/heigit/ohsome/stats/StatsRepo.kt#L27-L40

ElJocho commented 11 months ago

Thought collection on this topic:

I investigated further and tried to create an aggregated projection like so:

ALTER TABLE stats ADD PROJECTION combined_hashtag (
    SELECT
        min(changeset_id) as changeset_id,
        user_id,
        min(road_length_delta) as road_length_delta,
        min(building_edit) as building_edit,
        min(map_feature_edit) as map_feature_edit,
        changeset_timestamp,
        groupArray(hashtag) as hashtags,
        min(country_iso_a3) as country_iso_a3
    GROUP BY
        changeset_timestamp,
        user_id,
        osm_id
);

Sadly, there seems to be no way to access the newly defined column "hashtags", so I cannot query for arrayExists(hashtag -> startsWith(hashtag, 'm'), hashtags), which renders the projection useless.

Conclusion