ngageoint / hootenanny

Hootenanny conflates multiple maps into a single seamless map.
GNU General Public License v3.0
354 stars 74 forks source link

Add option to diff conflate to not treat reviews as matches #3387

Closed bwitham closed 4 years ago

bwitham commented 5 years ago

https://github.com/DigitalGlobe/VGI-team-repo/issues/1955 - Differential w/ tags and separate changeset outputs

Several POIs are being marked for review that are fairly far apart for an urban environment, have the same base restaurant type but do not have similar names or cuisines. We may be able to adjust POI to POI to prevent these reviews, which will lead to all or most of them passing through to the conflated output.

bwitham commented 5 years ago

@curranMapper This is happening because there is a single restaurant in the NOME layer which is triggering reviews against every POI in that area of the OSM dataset. Since all matches get dropped by diff conflate, we lose all of those OSM restaurants.

POI conflation is pretty lenient on reviews in general and the match distance fairly large. So, Monday I'll look into making the POI matching more strict for this use case, as the leniency is causing features to get dropped from the output.

curranMapper commented 5 years ago

In thinking about other ways user could work through this, advanced options has some "distance thresholding" settings that can be manipulated. This seems to be for POI to polygon situations only, and not POI to POI.

bwitham commented 5 years ago

1) Conflate distance thresholds aren't user editable for POI to POI due to the way it was implemented. Under the covers each POI type has its own maximum match/review distance. Its possible we could expose that in the UI, but that could lead to added confusion...I don't know.

2) Another option for this particular use case would be to add an option to diff conflate to not treat reviews as matches and thus, not throw them out of the diff. If that was done, then in this case all of the restaurants in question would pass through to output.

3) POI to POI conflation, like most of hoot, is set up to minimize the chance of missing a match (high negative business cost) so it errs on the side of generating a lot of reviews when it unsure. I think in this example the reviews aren't great. Basically, all of the POIs in question are restaurants with dissimilar names and mostly with different cuisine. I think at most maybe a couple of them should be reviewed. So, it may be we need to rework POI to POI conflation a bit to be more strict in matching...or at least have an option for it to behave like that.

From what I mentioned, 1) is time consuming to implement, 3) is very time consuming to implement, but 2) wouldn't take too long. Not sure which of the options (or maybe all of them?) are the best idea at this point.

bwitham commented 5 years ago

@curranMapper What do you think?

curranMapper commented 5 years ago

Hey @bwitham, my input below.

  1. agree, this could cause more confusion
  2. in this case clean-up would be needed for potential duplicates getting through that ordinarily would have been reviewed.
  3. this seems like it may have long term improvement/benefits to poi to poi

I do see utility to no 2 as option, however, would this behavior be unique to POIs only? It would pass along POIs normally flagged as review but wouldn't pass through roads flagged for review as example? Also considering benefit of keeping behavior consistent across different features so users can clearly understand "how it works".

I would vote for seeing more use cases highlighting this as "issue" to justify benefit of adding no 2 to the mix and keep with 3 as long term goal in order to keep behavior consistent across features.

Will also bring this up to users as applicable to see if need develops to have no 2 implemented.

Does that make sense to you? others thoughts on this?

bwitham commented 5 years ago

For 2), it could be done either way. Either just pass through POIs marked for review or pass through everything marked for review. Yeah, there's no point in adding 2) unless we're sure it provides utility, as it will just add to complexity in the app if it doesn't. So, we can think about it for awhile.

I think regarding 3) that I need to make POI to POI behave a little more like POI to Polygon in the strictness of name/type matching. If this example was POIs and Polys with the same attribution, I'm not sure we'd have these not so great reviews. 3) may not be terribly time consuming to make some adjustments to POI to POI to solve this problem. I'll work on that effort as part of this issue.

bwitham commented 4 years ago

I'm going to implement 2) as an option turned off by default that we can experiment with for this. Created #3579 for 3) as there's much more work required for it.