healthysustainablecities / global-indicators

An open-source tool for calculating spatial indicators for healthy, sustainable cities worldwide using open or custom data.

Finishing validation #38

Closed gboeing closed 4 years ago

gboeing commented 4 years ago

Opening an issue to document ideas. What can we do to make the validation most helpful for the policy team prior to hand-off?

It'd be helpful if everyone could peruse the current /validation folder and add comments here regarding:

for Nick and David's work this summer. I'll discuss with them and come up with a workplan.

cc @shiqin-liu @carlhiggs @duelran @nicholas-david @Dmoctezuma80

carlhiggs commented 4 years ago

@gboeing @shiqin-liu @duelran @nicholas-david @Dmoctezuma80

Great work preparing the validation resources and analysis, Thu, Nicholas and David.

Here are my preliminary thoughts:

1) Ensure we make a fair comparison with regard to study region extent: we should include the urban study region boundary as a reference area in the maps we display.

For Belfast:

  1. Belfast: Figure 2 shows all the fresh food related POIs in Olomouc from OSM (in yellow) and from the official dataset (in red). The official layer is overlaid on the OSM layer.

Figure 2: Fresh food related POIs in Olomouc

Consideration of our study regions is key in all of this. The boundary also helps with visually making sense of the area (dots on maps are quite overwhelming without other reference points!). So I think we should, as a rule, include its outline in maps.

2) If possible, it would be interesting to run a comparative analysis of access for urban sample points using the OSM and official datasets, as far as this would be representative. For example, if a particular official dataset doesn't have full study region coverage, we would only compare access for those sample points located 500 m internal to the extent of the official dataset's coverage, while noting the coverage limitation of the official data. Any conceptual mismatch would have to be noted too (i.e. our destination categories are quite specific, and official data may not be a perfect representation of what we are trying to measure; this limitation should be noted as a strength of, and rationale for, sourcing OSM data as a standardised, albeit imperfect, set of data across all cities).

Having said this, the approach using buffers is interesting and somewhat accounts for what I describe above; if we ensure the official data are delimited to the appropriate region, I think comparisons will be fairer.

3) The discussion of street markets is good; in addition to noting that OSM may not capture these if they are temporally contingent or periodic, it is worth noting that impermanent markets may be beyond the scope of our analysis, which assumes the amenities are accessible daily. This may be a limitation we need to discuss: farmers' markets are a valid place to shop, but they are beyond scope and not captured in our analysis. A number of collaborators, and we can assume reviewers, questioned their omission, so we have to defend this; I think we can, both conceptually and data-pragmatically.

4) Accessible colours and legends for maps. In addition to including the study region boundary overlay on each map, it may be good to use a grayscale basemap and ensure overlay points are colourblind-safe.

Also, in the image below, it is not clear whether the comment on restaurants relates to the image above it or below it; a legend on the maps indicating the type of destination represented would resolve this. I also think that sentence may be meant to come under the heading which follows it?


It would also be good to distinguish which colour refers to OSM-derived data and which to official data in the map legend, as well as in the text.

5) It could be good not to refer to the OSM-derived data as "OSM data" or "OSM points", as technically it is a derived extract using a combination of key-value pairs.
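For illustration, a minimal sketch of what such a derived extract looks like with a recent OSMnx version; the tag set and place query here are hypothetical, as the study's actual definitions are curated per city:

```python
import osmnx as ox

# Hypothetical tag set: the derived layer is whatever matched these
# key-value pairs at retrieval time, not "OSM" itself.
tags = {
    "shop": ["supermarket", "greengrocer", "convenience"],
    "amenity": "marketplace",
}
pois = ox.features_from_place("Olomouc, Czechia", tags=tags)
print(len(pois), "features matched the tag definition at retrieval time")
```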

A reason why this could be problematic: take, for example, the sentence "We looked into the visible red dots to figure out what they are and why OSM is missing them." OSM itself may not be missing those points; rather, our OSM-derived dataset is missing them. This could be due to a range of reasons, including a lack of any information in OSM at the point in time the dump was retrieved, but also the way we interpret the dump and seek to glean clues from the OSM data in it.

The text does explain that differences relate to the specific definition we have curated for our destinations, but I think we just need to be consistently clear terminology-wise that what we are comparing is not OSM per se, but a dataset of POIs derived from OSM for the purposes of our study.

Incidentally, the discussion of the fresh food datasets in Olomouc (Figure 8) is a bit confusing, in that it is explained that

Figure 8 shows all the freshfood related pois in Olomouc from OSM (in green) and from official dataset (in red)

but then clarified that

Most of the POIs from the official dataset which are not available in the OSM one are not directly related to fresh food access as defined in our study. There are barber stores, department stores, studios, safe and vault shops, furniture stores, and some restaurants among other things.

If the latter point is true, I think we need to clarify that we are not comparing our fresh food points with official fresh food points; rather, in the absence of such a dataset, we are making a comparison with a retail dataset. The subsequent explanation would then make more sense. (Reading it as written makes me think: if this is what is included, why would we expect this to be a valid comparison? In the absence of a fresh food dataset, however, we could say we've used a retail dataset as a proxy, and then argue that it is an imperfect proxy and hence the use of an OSM-derived fresh food layer is preferable.)


carlhiggs commented 4 years ago

Regarding the edge validation, I think the comparisons made are good. My thoughts on extending this are:

1) Comments from above re: grayscale basemaps, accessible colours, and legends distinguishing comparison groups (OSM-derived and official network) apply (e.g. in the case study snapshots).

2) Can we confirm that the case studies are conducted with the all roads dataset, not the pedestrian dataset?

3) Related to the above point, it is great to point out the exceptions with private areas and industrial areas; however, I think we need to note that such network segments are purposively omitted from the pedestrian network used in our main analysis, as we have aimed to curate an OSM-derived network of publicly accessible paths and our query explicitly excluded private and inaccessible paths. This is important because these are not merely 'gaps of less importance to our study'; these gaps matter to our study, and such paths are intended to be deliberately excluded.

carlhiggs commented 4 years ago

Regarding the representation of private access ways, I believe these may be systematically excluded even in the OSM all roads network type, which would explain why the private service roads and industrial roads are not represented in the Belfast case studies.

I was a bit curious, so I found the 'castle' example on OSM:

In the above example it was noted that the green paths in the official data were not present in our OSM-derived dataset (which I expect is the all roads dataset; I confirmed they are present in neither that nor the pedestrian dataset).

In the image below, however, we see that some paths in the grounds are actually represented on OSM as service roads, and these are included in neither our pedestrian nor our all roads dataset (of course, for our analysis purposes we want these excluded, so this is not a problem!). In fact, it appears the representation of paths is more detailed on OSM than in the official data (which is not surprising for such informal paths).

https://www.openstreetmap.org/query?lat=54.6011&lon=-5.8503

I believe this behaviour, which makes sense from an access-analytical perspective, may relate to the default access settings within OSMnx: https://github.com/gboeing/osmnx/blob/e05ccb77860643a806e5c512b83ac440abad7dda/osmnx/settings.py#L72

I think this makes sense both as a default setting and an explanation for why these service roads are missing in both the all roads and pedestrian edges datasets.
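A quick way to check this, sketched here assuming a recent OSMnx version (the place query is illustrative):

```python
import osmnx as ox

# OSMnx's default access filter excludes access=private ways, even for
# the "all" network type (see the linked settings.py line):
print(ox.settings.default_access)  # '["access"!~"private"]'

# The "all_private" network type retains private ways, so the edge-count
# difference gives a rough sense of how much is excluded by default:
G_all = ox.graph_from_place("Belfast, UK", network_type="all")
G_priv = ox.graph_from_place("Belfast, UK", network_type="all_private")
print(len(G_priv.edges) - len(G_all.edges), "extra edges when private ways are kept")
```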

If so, this will be an important point to note, as it not only justifies our use of the OSMnx-derived network for accessibility analysis, but also explains why the match with the official network is not (and should not be) 1:1.

It also highlights the importance of us maintaining a consistent distinction between what is on OSM, and our use of a derived and distinct dataset constructed using OSMnx.

carlhiggs commented 4 years ago

Just to provide more confirmation of the above, the service roads in the image above are tagged 'access=private', so it does make sense that they were excluded as per the OSMnx default behaviour (even with the all roads network type): https://www.openstreetmap.org/way/35905654

gboeing commented 4 years ago

@carlhiggs I agree that this is the right approach. FWIW, there's another predefined OSMnx network_type, "all_private", that includes private ways too. But I believe they should be excluded in any analysis of public accessibility because they are definitionally not public.

shiqin-liu commented 4 years ago

Thanks for all the great work! I think Carl's comment is already really comprehensive. I do not have much to add on the edge validation. For POI validation, I agree with Carl's point on "making fair comparison" and "running comparative analysis" between official and OSM derived datasets.

As Thu pointed out in her summary, one major problem with the POI validation is that we are comparing different definitions of fresh food destinations between OSM and the official dataset. One obvious observation is the restaurant categories in the official dataset, which are not included in our definition of fresh food destinations in OSM. So perhaps, while making the comparison, we could clean up the official data a bit to include only the fresh food categories we use, to make the datasets more "comparable" to a certain extent. But this may be too labour-intensive and not worth doing; I would agree with Carl's point about noting this as a limitation of the official dataset and a reason OSM is preferred.

At this stage, I think we know that both the OSM POI datasets and the official datasets are quite "messy" and "problematic" at some points, especially when it comes to point-level comparison. But ultimately, for our project, we are concerned with how much of the messiness or mismatch is going to affect the accessibility analysis. This brings to mind our early discussion of the accessibility methodology: we are not counting the number of destinations or the diversity of destinations to produce access indicators. What we currently count is whether a sample point neighborhood has at least one fresh food destination; if so, we code it as accessible. My assumption is that this method might reduce some inherent problems of OSM POIs for accessibility work (as we noted visually, the coverage of OSM POIs around urban centers is pretty good). To test this, one thing we could do is reproduce the sample-point or hex-level accessibility indicators using the official dataset, and then compare them with the indicators produced using the OSM POI data. This is the same idea as the comparative analysis Carl has pointed out. I have not implemented this myself, but I think it should be achievable by replacing the OSM destinations input with the official one and running our current workflow.

duelran commented 4 years ago

Following on from others' feedback (and thanks @carlhiggs for going the extra mile with this), I'll add three points that I think are worth considering.

1. As comparable as possible

Wherever possible, we should aim to ensure that we are conducting an "apples with apples" comparison. Whenever we display a map, there is an implicit assumption that this has been done already by the map creators, so we must make every effort to do so. For POIs, this means trying to find attributes that align as closely as possible between the represented datasets. When we display the eventual, inevitable differences, we need to highlight to what extent these arise from (i) differences in the real world, and (ii) differences in the attribution. We are more interested in the former and less interested in the latter, but we can't get to (i) without quantifying (ii). Here spatial heterogeneity is our friend: a homogeneous mismatch suggests that we are dealing with (ii); a heterogeneous mismatch suggests we are dealing with (i). Thus I would strongly suggest that we implement spatial summary statistics that characterise the extent to which we are seeing (i), (ii), or both (see the sketch at the end of this point).

On the point of comparing polygonal data with point data, we need to be very careful about the former in two senses. First, if it isn't attributed, that strongly suggests we are looking at presence/absence data only, which means we need to compare an area of presence/absence with point occurrence data; if it is attributed, we need to compare a count within an area with point occurrence data. Both are possible; we just need to know (or guess) which method is appropriate. The second point is that large areas are probably less interesting but are visually dominant on a map. Taking Figure 3 (POIs) as an example, it looks like we have large areas where the data is misaligned, but by virtue of being large and on the fringe (and with no other data or analysis) I would guess that these are the least densely populated areas and probably the least likely to have POIs. I'm more concerned with smaller areas where there aren't POIs, but by their very nature these are harder to see on the map. Smaller areas mean denser population, which means a bigger sample and better insights; we just need to make sure we focus in on these rather than lose them when we scale to the city as a whole.
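On the spatial summary statistics suggestion, a minimal sketch, assuming a hypothetical layer in which a per-hex mismatch rate (e.g. the share of official POIs with no OSM-derived counterpart nearby) has already been computed: a significantly clustered rate (high Moran's I) points towards place-specific, real-world differences (i), while a spatially uniform rate points towards a systematic attribution difference (ii).

```python
import geopandas as gpd
import libpysal
from esda.moran import Moran

# Hypothetical per-hex layer with a precomputed "mismatch_rate" column
hexes = gpd.read_file("hex_mismatch.gpkg")

# Queen-contiguity weights; Moran's I tests for spatial clustering of mismatch
w = libpysal.weights.Queen.from_dataframe(hexes)
mi = Moran(hexes["mismatch_rate"], w)
print(f"Moran's I = {mi.I:.3f}, p = {mi.p_sim:.3f}")
```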

2. Care with scale

I think Table 2 (POIs) and the corresponding figures are interesting, but I'm not sure of the utility of looking at such a broad range of nearest neighbour distances. My starting point would be to try to match paired locations in both datasets, rather than just nearest neighbours. For example, "Costco Smithstown" might be a supermarket in both official and OSM data (where it appears as, say, "Smithstown Megamall Costco"). The differences between all the Costcos in both datasets range between 5 m and 25 m; that suggests there are some positional accuracy issues and/or discrepancies in how people have chosen to represent what is in practice a polygon as a point (do I pick the centroid of the store or the entrance?). However, if we pick something like a convenience store, what are the chances that someone would record it 100 m from where it is actually located? I suggest close to 0%. What are the chances it is near another POI within 100 m? I'd suggest close to 100%. My approach therefore would be to draw from the data to inform a best choice of scales (noting that different types of POIs may need different scales) and then analyse against these.
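A sketch of how the paired-location matching might start, assuming projected layers with a "name" column; the file names, CRS, and "category" field are hypothetical:

```python
import geopandas as gpd

official = gpd.read_file("official_pois.gpkg").to_crs(epsg=32629)
derived = gpd.read_file("osm_derived_pois.gpkg").to_crs(epsg=32629)

# Normalise names into token sets to handle reorderings such as
# "Costco Smithstown" vs "Smithstown Megamall Costco"
for gdf in (official, derived):
    gdf["tokens"] = gdf["name"].fillna("").str.lower().str.split()

# Match each official POI to its nearest derived POI within a generous
# radius, then keep only pairs whose names share at least one token
pairs = gpd.sjoin_nearest(official, derived, max_distance=200,
                          distance_col="offset_m")
pairs = pairs[pairs.apply(
    lambda r: bool(set(r["tokens_left"]) & set(r["tokens_right"])), axis=1)]

# Per-category offset distributions then inform the choice of scales
print(pairs.groupby("category")["offset_m"].describe())
```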

3. Considering points and networks together

In practice almost all of our indicators combine point data and network data together. By separating these in our validation analysis we are potentially losing something. In practice, our population weighting means that (roughly speaking):

Indicator quality = population density × network quality × POI quality

We could spend a lot of time chasing quality issues in areas that won’t skew our overall results. Coming back to an earlier comment, I worry about the smaller areas more as the population density is generally so high in these (multiplying any errors in either network or PoIs). In general, I think all three of the variables on the right tend to be positively correlated (especially in crowdsourced data); the more people live in an area, the more ability and incentive there is to get the data right. Our job is to find the scenarios and circumstances under which this relationship breaks down (e.g. no active OSM contributors in a densely populated but more socioeconomically disadvantaged area of a city) and identify/mitigate the impact this has on indicator calculation.

gboeing commented 4 years ago

Thanks @duelran @shiqin-liu @carlhiggs. Your thoughtful comments are much appreciated and line up with a lot of my own suggestions for moving forward here. I'm going to compile a short summary of bullet points and ideas here for validation work over the coming month:

  1. Clean up OSM/official data terminology. I propose something like "our destination dataset" to mean "a dataset of points of interest downloaded from OpenStreetMap for this study" and "the official destinations" to mean "a dataset of official points of interest provided by a local city partner."

  2. For all validation work, all data to be validated between the two datasets must always be truncated to the study site first. We are not interested in anything beyond our study area's boundaries.

  3. Alternative validation may use a -500 meter buffer into the official dataset's coverage to examine correspondence with our destination dataset within the heart of the official dataset's coverage area. This could serve as a robustness check (see the sketch after this list).

  4. For better apples-to-apples comparison, we should compare similar destinations in the analysis stage (e.g., remove "restaurants" from the official data if they, by design, are not present in our destination dataset) rather than just writing up at the end that this could be an explanation of the wide discrepancies.

  5. All mapping and visuals should be simplified, use color theory best practices, and (for final versions) include essential map elements such as a legend, scale bar, and compass rose. We do not need basemap layers. Instead, we should simply plot, e.g., a black background, the study area polygon (white), and possibly the network edges (gray) to visualize internal structure. Points can be scatter-plotted on top using accessible color palettes.
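Regarding points 2 and 3 above, a minimal sketch of the truncation and the -500 meter robustness check; file names and CRS are hypothetical, and the official coverage extent is crudely proxied here by a convex hull:

```python
import geopandas as gpd

region = gpd.read_file("urban_study_region.gpkg").to_crs(epsg=32629)
ours = gpd.read_file("our_destinations.gpkg").to_crs(epsg=32629)
official = gpd.read_file("official_destinations.gpkg").to_crs(epsg=32629)

# 2. Truncate both datasets to the study site first
ours = gpd.clip(ours, region)
official = gpd.clip(official, region)

# 3. Buffer -500 m into the official coverage area as a robustness check
coverage = official.unary_union.convex_hull  # crude proxy for coverage
core = coverage.buffer(-500)
ours_core = ours[ours.within(core)]
official_core = official[official.within(core)]
```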

Finally, many of the point/polygon and nearest-neighbor comments revolve around a fundamental issue: we are trying to validate presence/absence of what may inconsistently be a single point or a cluster of points within inconsistent proximity of each other. For our accessibility indicators, we don't care if you can reach just 1 fresh food destination or 1,000 fresh food destinations. Similarly, the validation work shouldn't care if there's a single point representing a market hall or 100 points representing vendor stalls within 100 meters of that point.

Accordingly, the validation should use some form of clustering or spatial aggregation to smooth this analysis. I'd propose two methods as robustness checks against each other:

  1. Take our accessibility analysis sample points and draw buffers around them to produce reference polygons. Then do point-in-polygon intersection with both our destination dataset and the official destinations. Then compare the resulting True/False vectors from each to see how they align.
  2. Do the same thing, but using hexagons (a sketch of method 1 follows below).

This lets us see if our destinations' spatial distribution aligns with that of the official destinations, but only in terms of the presence/absence of a single such point in each spatial bin. We should additionally weight these vectors by population and by intersection density to get at the "dense areas" bias @duelran raised above.
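A minimal sketch of method 1, assuming projected layers and hypothetical file names; method 2 simply swaps the buffers for hexagons:

```python
import geopandas as gpd

samples = gpd.read_file("sample_points.gpkg").to_crs(epsg=32629)
ours = gpd.read_file("our_destinations.gpkg").to_crs(epsg=32629)
official = gpd.read_file("official_destinations.gpkg").to_crs(epsg=32629)

# Reference polygons: buffers drawn around the accessibility sample points
buffers = samples.copy()
buffers["geometry"] = samples.buffer(500)

def present(polys, points):
    """True where a polygon contains at least one destination point."""
    hits = gpd.sjoin(polys, points, predicate="contains", how="left")
    return hits.groupby(hits.index)["index_right"].count() > 0

ours_vec = present(buffers, ours)
official_vec = present(buffers, official)

# Presence/absence agreement; weighting by population or intersection
# density per polygon would address the "dense areas" bias noted above.
print(f"Agreement: {(ours_vec == official_vec).mean():.1%}")
```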

DRosasMoctezuma commented 4 years ago

Investigating the Completeness and Omission Roads of OpenStreetMap Data in Hubei, China by Comparing with Street Map and Street View https://arxiv.org/ftp/arxiv/papers/1909/1909.04323.pdf This study investigates the completeness of OpenStreetMap ("OSM") road datasets in Hubei, China. Using both Street Map and Street View, the authors focus on determining OSM road completeness and omission roads. An omitted road is classified into one of three types: public roads; private roads; and roads for non-motorized vehicles. The study employs an approach proposed by Zhou and Tian, which uses geometric indicators to estimate the quantitative completeness of street blocks in OSM. The authors analyze the completeness of street blocks in an OSM dataset by comparing them with a reference map: the method extracts OSM road datasets and converts them into street blocks that are then visually compared with the Baidu Street Map. Omission roads are then analyzed by randomly selecting 60 incomplete street blocks from the OSM road dataset, overlapping them with the corresponding Baidu Street Map, and manually digitizing all the omission roads in each block. Most of the omitted roads were private roads or single-lane public roads of lower importance within the urban road network. For 13 of the 16 prefecture-level divisions, street block completeness values were lower than 40%, and the maximum value was only 55%. However, for roads with traffic conditions, completeness values were higher than 80% in 14 of the 16 prefectures, indicating that major roads have been properly mapped. The results also indicate that, in terms of road length, approximately 90% of omission roads were either public roads or one-lane private roads, and no more than 10% were for non-motorized vehicles.

Extending Processing Toolbox for Assessing the Logical Consistency of OpenStreetMap Data https://onlinelibrary.wiley.com/doi/epdf/10.1111/tgis.12587 This study extends the Quantum GIS processing toolbox for spatial data assessment, addressing the research gap of insufficient established methods for assessing OSM data quality. Two representations of road networks are used: a primal representation, a two-dimensional graph where edges intersect only at nodes, and a dual representation, where the dual graph represents roads as nodes and intersections as edges. The model analyses topological errors and applies corrections using the following steps: re-project the layer to UTM; remove micro-segments using a threshold of 1 m, pruning vertices where topology is maintained; remove dangles using a threshold of 3 m; snap line features to a vertex using a threshold of 3 m; remove duplicate geometry features; remove line features of zero length; and, for intersections, validate closed holes and fix node ordering. The research adds functionality to convert shapefile road network data into a multidigraph representation. The results conclude that even proprietary road datasets are not free from logical inconsistencies, and that data contributed by the general public is credible. The study developed models and scripts to assess logical consistency based on three components: geographical topological consistency; semantic information (tags); and structural topological or morphological consistency, yielding easy-to-use workflow models for assessing OSM data.

Quality Assessment of the French OpenStreetMap Dataset https://onlinelibrary.wiley.com/doi/epdf/10.1111/j.1467-9671.2010.01203.x The article studies the quality of French OSM data. Extending previous work (Haklay), it provides a larger set of spatial data quality element assessments and uses different methods of quality control. Comparisons were made between the OSM data and the BD TOPO Large Scale Referential (RGE) data (reference datasets with metric resolution) to assess geometric accuracy, attribute accuracy, completeness, logical consistency, semantic accuracy, temporal accuracy, lineage, and usage. The results raise questions such as the heterogeneity of processes, scales of production, and compliance with standardized and accepted specifications limiting the possible applications. Finding a balance between specifications and contributor freedom is raised, and new research is proposed, such as assisting contributors with automatic checking of contributions.

The World's User-generated Road Map is More Than 80% Complete https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0180698&type=printable The study uses two complementary, independent methods to assess the completeness of OSM road data in every country: visual comparison with aerial imagery, and fitting parametric models to the historical growth of the OSM street network. After obtaining estimates of completeness, the total road network length in each country is obtained by dividing the existing length of mapped roads in OSM by the estimated fraction complete. The visual assessment is based on a stratified, probability-weighted sample of 45 points in each country. The sampling algorithm, implemented in QGIS, selects a random point and overlays streets from the OSM database against aerial or satellite imagery provided by Google through the OpenLayers plugin, at a scale of 1:5000. The model also provides estimates of the number of road edges, which are used to weight each grid cell when aggregating grid-cell fraction-complete predictions to the country level. Globally, OSM is 83% complete, and more than 40% of countries have fully mapped street networks. The most notable finding is that completeness has a U-shaped relationship with density: inter-urban roads that traverse areas with minimal population are largely present in OSM, as are high-density areas with many contributors, while the communities most likely to have missing streets are smaller towns and villages.

A Comprehensive Framework for Intrinsic OpenStreetMap Quality Analysis https://onlinelibrary.wiley.com/doi/epdf/10.1111/tgis.12073 The investigation presents a framework containing more than 25 methods and indicators, allowing OSM quality assessments based solely on the data's history (OSM-Full-History-Dump). In lieu of a ground-truth reference dataset, approximate statements on OSM data quality are possible: an intrinsic analysis approach is applied to specified areas within OSM, which can be evaluated by investigating the data's historical development and comparing features' characteristics at different timestamps. To evaluate the OSM data, the calculated results of the iOSMAnalyzer are divided into the following categories: fitness for purpose; general information on the study area; routing and navigation; geocoding; points-of-interest search; map applications; and user information and behavior. The calculated results give a compact quality overview of a freely selectable area. Quality depends on the individual use case, and the OSM data is evaluated in terms of fitness for purpose; however, absolute statements on data quality are only possible with a high-quality reference dataset as a basis for comparison. The study revealed that the interpretation of quality indicators is facilitated and supported by means of contributor activity.

DRosasMoctezuma commented 4 years ago

Imported comment by @carlhiggs from #47: There's a book on OpenStreetMap in GIScience from 2015 which contains a number of articles which may be relevant, e.g. re data quality, inference of land use, and use in network routing: https://link.springer.com/book/10.1007%2F978-3-319-14280-7 https://doi.org/10.1007/978-3-319-14280-7

This article looks at OSM validity with regard to neighbourhood socio-economic clustering (poorer neighbourhoods surrounded by other poor neighbourhoods had poorer representation of licensed liquor outlets in OSM than poorer neighbourhoods surrounded by wealthier neighbourhoods): Bright-2017-Geodemographic biases in crowdsour.pdf

Bright J, De Sabbata S, Lee S. 2017. Geodemographic biases in crowdsourced knowledge websites: Do neighbours fill in the blanks? GeoJournal 83:427-440.

However, it's worth noting that the above article considered counts, which may present a more pessimistic view of the validity of OSM data when used in distance-to-closest analysis. That is, even if POIs that cluster together in the real world are under-counted in OSM, you only need at least one point in the approximate area to return an approximately accurate distance-to-closest estimate. This hypothesis is linked to the nature of spatial data: like things cluster together. So if, for example, we have one supermarket/restaurant/PT stop in our OSM data, we might actually expect more than one in the real world, as such things often cluster (if you were to bet where you would find some kind of amenity, you'd probably pick the place where other like amenities are located, not a place where you know of none; a follow-on implication of Tobler's first law of geography: if near things are more related than distant things, you'd expect to find more related things close to the things you are aware of, even if your data doesn't represent them, particularly if you suspect your data to be indicative but incomplete, like OSM).

This is why I think it would be worth estimating the empirical difference between using the OSM and official datasets, in addition to comparing counts: since we aren't measuring counts within a distance but only distance to closest, we only need one valid point, not multiple co-located ones, to get an approximately right estimate of distance to closest. (I suggested this somewhat earlier in the project, but I'm not sure we've looked into it to date.)
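A sketch of that empirical comparison, using straight-line rather than network distances and hypothetical file names:

```python
import geopandas as gpd

samples = gpd.read_file("sample_points.gpkg").to_crs(epsg=32629)
ours = gpd.read_file("our_destinations.gpkg").to_crs(epsg=32629)
official = gpd.read_file("official_destinations.gpkg").to_crs(epsg=32629)

def dist_to_closest(points, destinations):
    """Straight-line distance from each point to its nearest destination."""
    joined = gpd.sjoin_nearest(points, destinations, distance_col="d")
    return joined.groupby(level=0)["d"].min()  # guard against ties

# Even where counts differ, distance-to-closest may agree closely
diff = dist_to_closest(samples, ours) - dist_to_closest(samples, official)
print(diff.describe())
```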

There are also chapters on the use and quality assessment of volunteered geographic information in the book 'Mapping and the Citizen Sensor', downloadable from Ubiquity Press: https://www.ubiquitypress.com/site/books/10.5334/bbf/

DRosasMoctezuma commented 4 years ago

How good is volunteered geographical information? A comparative study of OpenStreetMap and Ordnance Survey datasets https://journals.sagepub.com/doi/pdf/10.1068/b35097 The analysis focuses on London and England, since OSM started in London in August 2004, and assesses OSM quality through a comparison with Ordnance Survey (OS) datasets. An evaluation of positional and attribute accuracy, completeness, and consistency provided an early indication of the quality of VGI. For this study, two elements of the possible range of quality were measured: positional accuracy and completeness. Only streets and roads are used for the comparison, as these are the main features collected by OSM volunteers. Junctions are collapsed to single nodes and multi-carriageways to single links, and the OS reference is high-resolution mapping (1:1250 in urban areas, 1:2500 in rural areas, 1:10,000 in moorland), avoiding minor roads and cul-de-sacs. A grid at a resolution of 1 km was created across England, and the rest of the analysis was carried out through SQL queries that summed the length of lines contained in or intersecting the grid cells. The results indicate that OSM information is fairly accurate: on average within about 6 m of the position recorded by the OS, with approximately 80% overlap of motorway objects. The analysis presented several aspects of the OSM dataset; an impressive feat was the speed at which the dataset was collected. At the same time, it exposed inconsistency in the quality of VGI: differences in digitization mean some areas are consistently and carefully registered while others are poorly collected. It also exposes the implications of the digital and social divide for VGI, with a lack of coverage in rural and poorer areas.

Assessing OSM Road Positional Quality with Authoritative Data https://re.public.polimi.it/retrieve/handle/11311/985111/70571/Antunes_Fonte_Brovelli_Minghini_Molinari_Mooney_2015.pdf The aim of this study is to assess the positional differences between the road network available in OSM for some regions of the Coimbra Municipality, Portugal, and the data provided by the Coimbra City Hall (REF). Both OSM and REF are prepared and several measures are computed. A subset of the original OSM is extracted so that its line features have a direct correspondent in REF; discrepancies are then removed, both by applying a buffer around REF and by comparing the angular coefficients of REF and OSM line features. Finally, the workflow returns, per cell, the length and length percentage of OSM having a deviation smaller than a user-specified threshold, and the maximum deviation between the OSM and REF datasets. The first workflow, based on distances between OSM and REF within a pre-defined radius, delivers a simple and fast comparison with an authoritative dataset; the second, based on OSM road lengths included in a pre-defined buffer, is more complete, robust, and customizable, although it requires more computational time.

Quality Evaluation of VGI Using Authoritative Data—A Comparison with Land Use Data in Southern Germany https://doi.org/10.3390/ijgi4031657 The study investigates the accuracy of VGI derived from the OSM dataset, focusing on two spatial data quality elements: thematic accuracy and completeness, addressed by comparing the OSM data with an authoritative German reference dataset (DLM). The study area is the Rhine-Neckar region in southern Germany. The comparison is executed through semantic harmonization and polygon preprocessing, leading to an area-related map comparison with a confusion matrix; inconsistencies were resolved beforehand to allow comparisons using kappa statistics after merging all polygons. The kappa value indicates substantial agreement between the datasets. The DLM data show a large area covered by farmland and forest, with clear variations between locations: the forest class presents the highest completeness (97.6%) and correctness (95.1%), while farmland shows low completeness (45.9%) but high correctness (94.8%). The western part of the study area is more urbanized and therefore well mapped, which may explain why the eastern section still lacks completeness. The quality of OSM land use and land cover features varies between the investigated classes.

nicholas-david commented 4 years ago

@carlhiggs @shiqin-liu I am currently paring the data down to make it more of an apples-to-apples comparison. I wanted to check, however: what are the specific destinations we are looking for? For example, are we looking for supermarkets, convenience stores, open-air markets, etc.?

shiqin-liu commented 4 years ago

For food destinations in this project, we specifically look at two major categories: fresh food markets and convenience stores. So you can count supermarkets, convenience stores, grocers, open-air markets, etc. The destination category names in the official data are not fully aligned with what we used to retrieve the OSM destinations dataset (using the key-value tag system). So maybe you could conceptually/subjectively pair official data that you think may fit into the fresh food and convenience store categories. Or, if you are uncertain about any specific one, we could discuss it together.

I have found a document that Carl shared before about the definitions used to search the OSM destinations dataset for each city; I think that's a good reference to look at: Carl_citiesdestination_20191018_no_formulas.xlsx (I will send it to you via email).

@carlhiggs maybe you could share the city-specific reports that you created to send out to the collaborators? I remember those contain the destination OSM tags that you defined for each city, as well as the maps showing those destinations. I cannot find the shared access to those files...

carlhiggs commented 4 years ago

Hi both, @nicholas-david @shiqin-liu, I just shared the link via an e-mail you should receive shortly.

shiqin-liu commented 4 years ago

Hi all, as I am checking on GTFS and OSM public transport POIs, I thought of validating OSM transport points against GTFS datasets by applying your current destination validation method. But later I realized GTFS may not be viewed as an official dataset (?). I hesitate to use GTFS data to "validate" OSM public transport data (as part of the phase II validation plan), since our initial plan is to use GTFS frequent-stop data whenever available, and OSM public transport data for the rest.

It would still be interesting to compare the % of sample points / population estimated to have access to public transport stops using datasets derived from OpenStreetMap and from GTFS. I think this is what the USC team has been working on lately for validating destinations at the sample point/hex level? Do you think I could apply your validation method, once developed, to compare OSM and GTFS public transport POIs? But perhaps we would not view this as part of the validation work? I am not sure; I would love to hear your thoughts!

gboeing commented 4 years ago

Do you think I could apply your validation method once developed to compare OSM and GTFS public transport POIs?

@shiqin-liu yes, I think so. The code should be relatively easy to adapt. You can find it here.
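For instance, loading GTFS stops as a point layer for that comparison might look like this (the feed path is hypothetical; stop_lat/stop_lon are standard GTFS stops.txt columns):

```python
import pandas as pd
import geopandas as gpd

stops = pd.read_csv("gtfs_feed/stops.txt")
gtfs_stops = gpd.GeoDataFrame(
    stops,
    geometry=gpd.points_from_xy(stops["stop_lon"], stops["stop_lat"]),
    crs="EPSG:4326",  # GTFS coordinates are WGS84
)
# gtfs_stops can now stand in for the "official" layer in the buffer/hex
# presence-absence comparison described earlier in this thread.
```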

gboeing commented 4 years ago

I'm going to hold this issue open until the end of this week. Coming out of our call this morning, there are two final action items prior to closing this:

  1. @nicholas-david and @DRosasMoctezuma will generate simple "validation results metadata" tables in the readme files to document the validation results fields and their interpretation.
  2. @nicholas-david and @DRosasMoctezuma will each test and inspect the validation script that the other led to make final improvements for script validity, efficiency, and clarity.

I believe this issue will be ready to close when that's done!

gboeing commented 4 years ago

I'm marking this as "complete" now. Thanks team!