hotosm / osma-health

HOT Analytics for Health

Temporal accuracy validator #7

Closed geohacker closed 6 years ago

geohacker commented 6 years ago

@awright @smit1678 do you have a sense of what we can define as our recency benchmark? For instance, how old does data have to be before it counts as stale for the malaria campaign?

We can build a generic OSMLint validator that takes a timestamp and returns all the buildings that are older or newer than that point in time.
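A minimal sketch of that idea, in Python rather than as an actual OSMLint validator. It assumes each building arrives as a GeoJSON-like feature whose properties carry a `building` tag and a last-edit time in seconds since the epoch under `@timestamp`; that property name, the function name `split_by_recency`, and the sample features are illustrative assumptions, not part of OSMLint.

```python
from datetime import datetime, timezone


def split_by_recency(features, cutoff):
    """Partition building features into (older, newer) relative to `cutoff`.

    Assumes `@timestamp` holds the last-edit time as epoch seconds.
    Non-building features and features without a timestamp are skipped.
    """
    cutoff_ts = cutoff.timestamp()
    older, newer = [], []
    for feature in features:
        props = feature.get("properties", {})
        if "building" not in props:
            continue
        ts = props.get("@timestamp")
        if ts is None:
            continue
        (older if ts < cutoff_ts else newer).append(feature)
    return older, newer


# Hypothetical sample data for illustration only.
features = [
    {"type": "Feature", "geometry": None,
     "properties": {"building": "yes", "@timestamp": 1467331200}},  # 2016-07-01
    {"type": "Feature", "geometry": None,
     "properties": {"building": "yes", "@timestamp": 1519862400}},  # 2018-03-01
    {"type": "Feature", "geometry": None,
     "properties": {"highway": "residential", "@timestamp": 1519862400}},
]

older, newer = split_by_recency(features, datetime(2017, 1, 1, tzinfo=timezone.utc))
print(len(older), len(newer))
```

Returning both partitions keeps the cutoff generic: a report could use either side depending on whether "stale" or "fresh" buildings are of interest.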

awright commented 6 years ago

Good question. I'll defer to @smit1678, but these "grades" from the CHAI presentations might be useful. (And then somehow this became a ticket full of screenshots.)

Of the 5 countries, Temporal Accuracy was only measured in 2.

Botswana

[Screenshot: 2018-03-19 at 9:29:54 am]

More Detailed Deck

[Screenshot: 2018-03-19 at 9:39:44 am]

Namibia

[Screenshot: 2018-03-19 at 9:30:59 am]

More Detailed Deck

[Screenshot: 2018-03-19 at 9:36:38 am]

smit1678 commented 6 years ago

I think temporal accuracy isn't just the timestamp for when a building was added to OSM; it also needs to capture, somehow, the temporal accuracy of the data source used to generate it.

Here are a couple ways I started thinking about temporal accuracy:

  1. when was the building first added to OSM and then when was the building last edited? For example, some of the data was first remotely added to Botswana in 2016, but then attribute information will be added to the buildings in 2018.
  2. if the building was remotely enumerated with satellite imagery at all. So one check would be: what was the source of the data and was imagery used?
  3. if the building was recently added but used really old imagery (say 10+ years old). This one is harder to get because only Bing and OAM report imagery dates per tile at the moment that I know of.

In short, I think timestamp is the first check (satisfies part of point 1), but then we need another check that helps further define temporal accuracy. Do we also want to add a check for if imagery was used when the building was added?
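As a rough illustration of the "was imagery used?" check, the sketch below looks for imagery provider names in a feature's `source` tag. The keyword list and the function name `imagery_source` are hypothetical, and the approach assumes the `source` tag is present on the building itself; in practice it is often recorded on the changeset instead, so a missing match here does not prove that imagery was not used.

```python
# Hypothetical keyword list; a real check would need a curated set.
IMAGERY_KEYWORDS = ("bing", "digitalglobe", "mapbox", "esri",
                    "openaerialmap", "imagery")


def imagery_source(properties):
    """Return the first imagery keyword found in the `source` tag, else None."""
    lowered = properties.get("source", "").lower()
    for keyword in IMAGERY_KEYWORDS:
        if keyword in lowered:
            return keyword
    return None


print(imagery_source({"building": "yes", "source": "Bing"}))
print(imagery_source({"building": "yes", "source": "survey"}))
```

Note that this only says which imagery layer was named, not how old the imagery was, which is the harder part of point 3.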

geohacker commented 6 years ago

@smit1678 @awright - thank you! This is helpful.

> some of the data was first remotely added to Botswana in 2016, but then attribute information will be added to the buildings in 2018.

This is going to be really hard to do at the moment - we don't have a reliable history query infrastructure (I don't want to advocate building one more tool based on Overpass until we know what we'll do to get around the pitfalls.)

> In short, I think timestamp is the first check (satisfies part of point 1), but then we need another check that helps further define temporal accuracy.

Looks like this is a good first pass. We'll be able to visualise the recency of buildings in the report. We can perhaps show the satellite imagery based on the source tag, but that wouldn't have a timestamp associated with it.
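One way the recency data for such a report could be prepared is to count buildings by the year of their last edit. This is only a sketch: it assumes last-edit times are available as epoch seconds, and the function name `edits_per_year` and the sample values are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def edits_per_year(timestamps):
    """Count last-edit timestamps (epoch seconds) per calendar year, in UTC."""
    years = (datetime.fromtimestamp(ts, tz=timezone.utc).year for ts in timestamps)
    return dict(sorted(Counter(years).items()))


# Hypothetical timestamps: one edit in 2016, two in 2018.
counts = edits_per_year([1467331200, 1519862400, 1520000000])
print(counts)
```

The resulting year-to-count mapping can be fed directly to a bar chart.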

awright commented 6 years ago

@geohacker cool. This works for me. @ascalamogna good to close the ticket and question then?

geohacker commented 6 years ago

@awright great!

Let's close after we ship the osmlint/vectortile piece of this.

kamicut commented 6 years ago

@tyrasd since this is a very similar approach to what OSMA already does, I was wondering if we could get the tiles already processed there for our target AOI and calculate a similar chart on the frontend of osma-health. What do you think?

smit1678 commented 6 years ago

@kamicut Should we be concerned with the risk of being out of sync if one is ahead of the other? Also, if the longer term goal is to expand beyond most recent edit with more history information, do we lose those goals by using the other osma tiles?

kamicut commented 6 years ago

We are going to use the vector tiles from the building filter because they contain timestamps. @smit1678 I'm closing this issue because we will start with edit recency, using the timestamp, and we can create a new issue if we want to expand this.