ushahidi / platform

Ushahidi Platform API version 3+
http://ushahidi.com

Capture Zoom Level (from map) for Reports #4935

Closed tuxpiper closed 6 months ago

tuxpiper commented 6 months ago

This task is to capture as much information as possible about the level of geographic granularity that was used to geocode a report. Ideally, reports that are geocoded using the map interface would capture the zoom level at the time the report was made. If reports were made from smartphones, the report would contain information about this; whether or not GPS was enabled; etc.

Why is this important? To understand this we need to briefly discuss “scale.” For geographers, “scale” can be most easily thought of as zoom level. In its broadest sense the term means the level of detail. The major problem with Ushahidi’s most common output - a table of reports with associated latitude and longitude - is that all reports are treated as “equal” even though they are not.

For example: If User X wants to make a report about the great BBQ restaurant near David’s office, they would zoom into the exact building, and place their marker on that building. On the other hand, User Y might want to make the same report but can’t quite remember where the exact building was, but still wants to report that there is a good BBQ restaurant in the general vicinity of David’s office. User Y may zoom into the general vicinity and place their marker at random.

The problem is that the act of placing a marker defines an exact location in geographic space (latitude/longitude) whether or not that marker is meant to identify that exact location or is simply marking a general area. As an end-consumer of Ushahidi data I have no way of differentiating these two reports, but the difference is crucial:

User X is providing immediately actionable information (e.g. I can go to that exact BBQ restaurant) while User Y is providing general information that requires additional action (“there’s a good BBQ restaurant around this area but you’ll need to ask people in the neighborhood or do further research to find the exact address”).

Anyone who is trying to respond to a report in a humanitarian context will want to know the difference between these two reports. Providing the zoom level of each report would be an important piece of information. I could reasonably assume that reports entered using a lower zoom level (e.g. zoom level 0, or the entire Earth) are very general in nature: often signifying something happening within the country. On the other hand, reports entered at a higher zoom level (e.g. zoom level 16) would be much more precise and refer to very specific events or places.
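To put rough numbers on the zoom levels mentioned above, the sketch below estimates how much ground one screen pixel covers at a given zoom. It assumes the common Web Mercator tiling scheme with 256-pixel tiles (the convention described on the OpenStreetMap zoom levels page); other basemaps or tile sizes would give different figures. The name `meters_per_pixel` is illustrative and not part of the platform.

```python
import math

# WGS84 equatorial radius, as used by Web Mercator tile schemes.
EARTH_CIRCUMFERENCE_M = 2 * math.pi * 6378137.0
# Assumption: 256-pixel tiles, so zoom 0 shows the whole world in 256 px.
TILE_SIZE_PX = 256


def meters_per_pixel(zoom: int, latitude_deg: float = 0.0) -> float:
    """Approximate ground distance covered by one pixel at this zoom."""
    world_width_px = TILE_SIZE_PX * 2 ** zoom
    # Mercator stretches the map away from the equator, so one pixel
    # covers less ground at higher latitudes.
    return EARTH_CIRCUMFERENCE_M * math.cos(math.radians(latitude_deg)) / world_width_px


for zoom in (0, 5, 10, 16):
    print(f"zoom {zoom:2d}: {meters_per_pixel(zoom):12.2f} m per pixel at the equator")
```

Under these assumptions a pixel covers roughly 156 km at zoom 0 and about 2.4 m at zoom 16 (at the equator), which is why a marker dropped at a low zoom level cannot reasonably be read as a building-level location.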

Ultimately, the challenge is that a specific point (latitude/longitude) should only be used to represent other specific points in space (e.g. a well, a building, an intersection) and is not an appropriate way to signify general areas. Area is typically represented using a polygon in most geographic data. However, the simplicity of point data means that they are pressed into this service very often.

Capturing the zoom level employed at the time the report was made would, at least, help GIS analysts sort reports based on their granularity. Other datasets have dealt with ambiguous point data by adopting a “precision code” that quantitatively or qualitatively attempts to describe the scale at which each point was created. See this example of precision codes as part of point data to describe the location of humanitarian and development projects.

Created by Shadrock on 2014-05-14 08:32:44.

_Imported from https://phabricator.ushahidi.com/T248_

Aha! Link: https://ushahiditeam.aha.io/features/PROD-736



ushbot 2015-08-23

Comment by Shadrock on 2014-11-26 17:53:27:

Folks: I'd really like to see this get higher priority. To summarize, very bluntly, what I explained in the description above: without this feature, the geographic information provided by an Ushahidi instance will remain largely unusable.

ushbot 2015-08-23

Comment by @rjmackay on 2014-11-27 09:02:33:

ping @anarghya @shadowhand if you want to include/prioritize this somewhere. I lowered priority because we're still trying to get a product out the door and I don't think we have bandwidth to cover all the minor details. That said.. maybe it makes more sense to just merge this into the posts create/edit work and make sure we get this right the first time?

ushbot 2015-08-23

Comment by Shadrock on 2014-11-27 16:34:09:

Thanks Robbie. I understand juggling priorities, but I really feel like this is more than a minor detail. Ushahidi prioritizes maps, but they remain largely for show. I'll be curious to see what the others think. I'm happy to contribute to this task however I can to help move it along. Thanks again for the quick response.

ushbot 2015-08-23

Comment by shadowhand on 2014-11-27 17:20:03:

@shadrock

Capturing the zoom level employed at the time the report was made would, at least, help GIS analysts sort reports based on their granularity.

i'm not at all convinced that zoom level is a proper indicator of granularity. fwiw, we currently have multiple kinds of "location" input: lat/lon points, as well as geometries. wouldn't a more accurate approach include a user friendly way to apply either a shape (area) or a point as the location identifier?

ushbot 2015-08-23

Comment by Shadrock on 2014-12-05 16:43:40:

I think I’m not explaining something very well here. Let me try again.

The problem is that zoom (scale) affects point and vector (geometry) data equally: they are both directly tied to the scale at which they are created. Allowing a user to differentiate between when it’s appropriate to use a point to identify something and when it’s appropriate to use geometry certainly is important, but it doesn’t resolve questions about precision. I’ll respond to each point separately:

Allowing user to choose point v. geometry

The discussion about including geometry in Ushahidi began back in 2010 when I started making noise about it during Haiti. At that time, I was working with Ros Sewell and Patrick Meier on the issue ([[ https://www.youtube.com/watch?v=eRdNUAqEiIU | I briefly mention it here ]]) and Ushahidi only allowed points, [[ http://resources.arcgis.com/en/help/getting-started/articles/026n0000000n000000.htm | which is only 1 of 3 primary ways to represent geography data visually ]]. The data I was working with really was most appropriately represented with polygons.

Eventually, geometry was added to the platform, which is great. But that still doesn’t address the issue of precision. I wasn’t the only person to notice this. The need to improve the accuracy of geographic precision and generally improve metadata came up in most internal and external reviews and case studies of that time… [[ http://www.ushahidi.com/2012/03/20/predicting-locations-of-emergency-damage-during-disaster-using-vgi-data/ | including on our own blog ]]. It’s an issue that continues to come up.

Zoom as indicator of precision

If zoom isn’t a proper indicator of granularity what would be? Zoom is, in fact, an accepted standard indicator of granularity in GIS workflows because you can’t map what you can’t see.

Geographers use it all the time as a standard part of quality control for both point and vector (geometry) data. Best practices in GIS metadata creation specifically ask that it be included, and [[ http://wiki.openstreetmap.org/wiki/Zoom_levels | OpenStreetMap explicitly recognizes that different levels of zoom correspond to different levels of geographic precision ]].

Simple test: Choose your favorite mapping platform and try to drop a pin on your house from zoom level 5 (no cheating!). Then from zoom level 10. Finally from zoom level 15. Which one is closest? Which one of those pins is more granular in its precision? Now do that same thing in Ushahidi (make a report about your house). The report at zoom level 15 is clearly more precise (i.e. it’s closest to your actual house), but anyone using the data can’t possibly know that, since each of the reports will result in a lat/long: there’s no way to distinguish which one was made at the greater level of zoom/granularity. I'm getting around this on the [[ http://rni.ushahidi.com/ | RNI deployment ]] right now by using a [[ http://iatistandard.org/codelists/GeographicalPrecision/ | standard (if somewhat subjective) indicator of precision ]] as part of the report form precisely //because the platform doesn't already capture this information//. This is the voice of dogfood talking!

Note that I’m talking about granularity of precision (i.e. spatial precision), not the accuracy of identifying what something is (e.g. whether or not a building is a hospital) – you can find [[ http://www.colorado.edu/geography/gcraft/notes/error/error_f.html | more on this distinction here ]]. So if your concern is whether or not the report is accurate… you’re correct, zoom level will not help with that. However, zoom level will distinguish those reports that were placed with greater //intention to be precise// (e.g. the user actually zoomed in to try and find the right building versus staying zoomed out at a “city” level).

As both a GIS project manager and a GIS grunt, I’ve had to develop, or follow, rules about the level of zoom that production staff could use to ensure a) the appropriate level of precision and b) uniformity across the data set. This is well documented in GIS literature (academic and trade) under the “minimum mapping unit” and “scale.” So, just to be clear, GIS professionals already use zoom as an indicator of precision all the time. It is an accepted industry best practice (if not standard).

Moreover, as a deployer and volunteer on several Ushahidi instances over the years, I’ve had the opportunity to observe how users interact with the platform and craft instructions specifically addressing these issues. In the vast majority of cases, what I’m saying here holds true: users who are trying to be more precise zoom in farther than those who are trying to be general. Yes, there will be some who zoom in, get frustrated, and just drop a pin anyways, but in my experience this is far less common. This, actually, gets to another issue with Ushahidi: our reliance on other people's basemaps but that's another post...

We’ve tried different things on different deployments. On one (I think it was the Sudan referendum) we gave users a checkbox for granularity: they could choose if their report was at the “city” level; the “neighborhood level”; or “an exact location.” The primary problem is that all of those terms are subjective: there is no one definition of what a city is, much less what that would mean in terms of true, Euclidean, space. Same with “neighborhood.” For somebody in one city it might be a few blocks, while in NYC it might be a whole borough. The other problem was that nobody checked the boxes... just one extra thing to do.

Is this making sense? Sometimes it’s a lot easier to show in person. Perhaps a good argument for me to devote some time to a quick GIS talk at the team meeting… if it wouldn’t bore too many people to death!

ushbot 2015-08-23

Comment by @rjmackay on 2014-12-05 19:39:32:

Thanks for the super write up.

The problem is that zoom (scale) affects point and vector (geometry) data equally: they are both directly tied to the scale at which they are created

I get that. More so now you've explained. As I said.. I didn't drop priority because this isn't a thing we need.. but because we're still dealing with a platform that just doesn't even work yet. But given we're building the location editors again.. this should be easy enough to include. I'll add it as a blocker for the post create/edit forms.

If zoom isn’t a proper indicator of granularity what would be? Zoom is, in fact, an accepted standard indicator of granularity in GIS workflows because you can’t map what you can’t see.

I guess the follow up to this is.. given a deployment will contain a mix of data collected at various different zoom levels. How do we combine that in a sane way?

Perhaps a good argument for me to devote some time to a quick GIS talk at the team meeting… if it wouldn’t bore too many people to death!

I'll happily spend an entire day talking GIS :) There's only so much I can teach myself..

ushbot 2015-08-23

Comment by Shadrock on 2014-12-06 01:17:26:

I didn't drop priority because this isn't a thing we need..

I totally understand. And I appreciate using Phab as a place to have these discussions: it gives folks an opportunity to plead their case as the devs dig in and make hard choices. So that's what I'm doing here: making my case to y'all.

How do we combine [zoom levels] in a sane way?

Well, the short answer is, "we don't: that's the deployer's responsibility." If we can, indeed, include the zoom level capture for each report then end-users of a deployment's data can at least triage and aggregate reports by some indicator (or proxy) of precision. That goes a long way towards helping. After that, task T264 (associated with this one) would probably be the best answer: give deployers a functionality that allows them to set zoom limits that enforce certain levels of precision. There are already other companies out there who are acknowledging this as an important function and allow setting the zoom for viewing or editing ([[ https://www.mapbox.com/tilemill/docs/guides/advanced-map-design/ | Mapbox does some of the more comprehensive work I've seen ]]). @shadowhand has marked T264 as low priority, which I think is appropriate (even though it's associated) given the release schedule and, frankly, the fact that //eliminating// a user's functionality (zooming in or out) is not a very elegant solution given the constraints of the UI (as I've seen it). On a GIS production line, you don't limit the zoom of the analyst when they're creating point or vector data: they still need to zoom in and out to gain context for what they're doing (esp. important if tracing satellite imagery) but they have the discipline to actually create the data at a given scale. I honestly don't ever expect this to be the case for large Ushahidi deployments, but it's possible when only a few people are geocoding.
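As a rough illustration of the triage idea above, here is a minimal sketch of how an end-user of exported data might group reports once a zoom level is stored with each one. The field names (`id`, `zoom_level`) and the bucket thresholds are hypothetical, chosen only for the example; a real deployment would need to pick cut-offs that suit its basemap and context.

```python
from collections import defaultdict
from typing import Any, Dict, List, Optional


def bucket_for_zoom(zoom: Optional[int]) -> str:
    """Map a zoom level to a coarse precision bucket (illustrative thresholds)."""
    if zoom is None:
        return "unknown"   # e.g. imported data with no zoom recorded
    if zoom < 8:
        return "regional"
    if zoom < 14:
        return "city"
    return "street"


def triage(reports: List[Dict[str, Any]]) -> Dict[str, List[int]]:
    """Group report ids by the precision bucket of their zoom level."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for report in reports:
        groups[bucket_for_zoom(report.get("zoom_level"))].append(report["id"])
    return dict(groups)


reports = [
    {"id": 1, "zoom_level": 16},
    {"id": 2, "zoom_level": 5},
    {"id": 3, "zoom_level": 11},
    {"id": 4},  # no zoom level captured
]
print(triage(reports))
```

The point of the sketch is only that the grouping becomes a one-liner for data consumers once the zoom level is captured; without it, all four reports are indistinguishable lat/long pairs.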

ushbot 2015-08-23

Comment by @rjmackay on 2014-12-08 11:38:33:

"we don't: that's the deployers responsibility."

That's what worries me. Most deployers don't have much or any GIS knowledge. As much as possible, we need to try and provide sane suggestions and defaults. Obviously collecting zoom level is the first step to be able to do sensible things with it.

I can imagine a few things we could do :

Obviously those still don't work for everyone and could still be hugely confusing :/

rjmackay 2015-08-26

@benstoltz this needs some work on both the API and frontend.. if you're keen to have a go at it let me know.

There's a lot of comments but the short version is:

benstoltz 2015-08-26

@rjmackay I'll definitely take it on!

Shadrock 2016-06-29

Did anything ever happen with this?

rjmackay 2016-07-03

Nope. We probably need to be saving a label for locations too .. Assigning to Jess for triage

Erioldoesdesign 2019-04-16

Via @rowasc

Basic series of tasks:

rowasc 2019-04-16

@Erioldoesdesign it might be important to spec the UX for this a bit before we get a dev on it

rowasc 2019-04-16

also also can you add "display the map position with the correct zoom level in the post card" or something like that in the basic series of tasks? thanks! @Erioldoesdesign

tuxpiper 2019-04-17

👍 about this. For a heavily geolocation oriented product, we sometimes make pretty light-hearted decisions around this data.

There is so much more that we can do, to enrich and make the geographical data that we manage more useful. I'm just going to suggest this now:

In the implementation, instead of just adding a column for the zoom level in the database, could we add a column for collection metadata (i.e. in JSON)?

We should try to save the method by which the coordinates were provided. I don't think it's always a click on the map. Sometimes it's GPS, or Geocoding. If it's a click on the map, then, yes, let's capture the zoom level.

{ "source": "map" , "sourceData": { "zoomLevel": 2 } }

and this would work for this issue.

For GPS and Geocoding we could leave "sourceData" empty for now, and leave capturing and displaying that metadata (sooo much sweet metadata is available) as future work.

# this would be for browser or device location tracking function (GPS on paper, but not always)
{ "source": "locationTracking", "sourceData": { } }

# looking up an address in a geocoding database
{ "source": "geocoding", "sourceData": { } } 

Also, sometimes, we won't know the source, as it may be coming from an API client that doesn't collect this info (i.e. importing a data set from somewhere else)
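A minimal sketch of how the metadata objects above could be assembled, assuming the three `source` values shown in this comment. The function name `collection_metadata` and the choice to return `None` when the source is unknown are illustrative assumptions, not an existing platform API.

```python
import json
from typing import Any, Dict, Optional

# The three source values proposed in the comment above.
KNOWN_SOURCES = {"map", "locationTracking", "geocoding"}


def collection_metadata(
    source: Optional[str], zoom_level: Optional[int] = None
) -> Optional[Dict[str, Any]]:
    """Build the proposed metadata object, or None when the source is unknown."""
    if source is None:
        # Assumption: API clients that don't report a source store no metadata.
        return None
    if source not in KNOWN_SOURCES:
        raise ValueError(f"unsupported source: {source!r}")
    source_data: Dict[str, Any] = {}
    # Zoom level only makes sense when the point was placed on the map.
    if source == "map" and zoom_level is not None:
        source_data["zoomLevel"] = zoom_level
    return {"source": source, "sourceData": source_data}


print(json.dumps(collection_metadata("map", zoom_level=2)))
print(json.dumps(collection_metadata("geocoding")))
print(json.dumps(collection_metadata(None)))
```

Keeping `sourceData` as a free-form object leaves room for the GPS and geocoding details mentioned as future work without changing the outer shape.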

tuxpiper 2019-04-17

We should hide this metadata in surveys with anonymization enabled, when the user viewing the data doesn't have enough privileges.
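One way this rule could look in code, as a sketch only: the key `collectionMetadata` and the two boolean flags are hypothetical stand-ins for whatever the platform's survey settings and permission checks actually expose.

```python
from typing import Any, Dict


def location_for_viewer(
    location: Dict[str, Any],
    survey_anonymized: bool,
    viewer_is_privileged: bool,
) -> Dict[str, Any]:
    """Return a copy of the location, dropping collection metadata when
    the survey is anonymized and the viewer lacks sufficient privileges."""
    if survey_anonymized and not viewer_is_privileged:
        return {k: v for k, v in location.items() if k != "collectionMetadata"}
    return dict(location)


location = {
    "lat": 18.54,
    "lon": -72.34,
    "collectionMetadata": {"source": "map", "sourceData": {"zoomLevel": 16}},
}
print(location_for_viewer(location, survey_anonymized=True, viewer_is_privileged=False))
```

Returning a copy rather than mutating the stored value keeps the full record available to privileged viewers on later requests.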

Erioldoesdesign 2019-05-14

Reading through this...

For example: If User X wants to make a report about the great BBQ restaurant near David’s office, they would zoom into the exact building, and place their marker on that building. On the other hand, User Y might want to make the same report but can’t quite remember where the exact building was, but still wants to report that there is a good BBQ restaurant in the general vicinity of David’s office. User Y may zoom into the general vicinity and place their marker at random.

Would asking users with a radio button choice on the location capture field something to the effect of:

Go some way to helping at least, differentiate the location data between 'general' and 'exact' and when the general is selected, the reports get a 'lower priority' in an export or at least an automatic tag that reads something like 'Location Vague'

Exact reports could have a zoom level that is closer and 'vague' or 'general' could have a further out zoom level.

It kind of draws upon this:

We’ve tried different things on different deployments. On one (I think it was the Sudan referendum) we gave users a checkbox for granularity: they could choose if their report was at the “city” level; the “neighborhood level”; or “an exact location.” The primary problem is that all of those terms are subjective:

But doesn't make an assumption that the user needs to know the definition of city/town/neighbourhood etc. What the new radio button/check box would indicate is the user's degree of comfort in self-declaring how accurate they believe they are. 'Vague' ticked would be a good indicator for any deployment owners to know what they need to follow up on because a user has self-described that they are not sure.

Also this:

Eventually, geometry was added to the platform, which is great.

Actually doesn't exist right? We can 'draw' a boundary layer for location that captures a 'bounded area'

This should be in a separate ticket somewhere and should be something of a priority.

I don't quite parse what @tuxpiper is saying in as far as the metadata/JSON stuff but in general, capturing and displaying what the deployment is doing and allowing for deployment owners to learn how to work with that is A++ in my UX books.

Also 100% agree this should be hidden for anon settings

Also geez, like why so much questioning re. zoom level as a measurement of accuracy. It's better than the zilch-o we have now.

tuxpiper 2019-05-14

I don't quite parse what @tuxpiper is saying in as far as the metadata/JSON stuff but in general, capturing and displaying what the deployment is doing and allowing for deployment owners to learn how to work with that is A++ in my UX books.

In retrospect, my comment probably amounted to a lot of blabber for saying a very simple thing: let's make database storage of geographic data as unconstrained as possible (store JSON)... so that any minute detail about geo data that may come up in the future can be stored there.

Erioldoesdesign 2019-05-15

I love and appreciate the blabber though! I learn more developer knowledge through that kind of osmosis

rowasc 2019-06-09

Related https://github.com/ushahidi/platform/issues/608

renujain31 2020-10-14

Hey @rowasc, I would like to take up this issue! Can you assign this to me!

rowasc 2020-10-16

@renujain31 hey I assigned you another one a minute ago, but we can come back to this once that's done!

renujain31 2020-10-21

Hey @rowasc I almost completed that one and sent a PR. Can I now work on this one?

renujain31 2020-10-28

Hey @rowasc, I researched a bit about this issue and it appears to me that first of all I need to make sure that the platform-client passes the zoomlevel in case when there's a click on the map to the API (Would need to add some code here: modify/location.directive.js) Also I would need to create a field where the admin can set the maximum and minimum zoom levels permissible?(This part is related to #608?)

I would then need to ensure that API stores the data as JSON of {latitude, longitude , zoomlevel} (Would involve changes in Platform API in files where posts are involved. Like this: Listener/CreatePostFromMessage.php)

Then we need to abstract out some details according to user's permissions and need to sort them in some order(which order is what I don't know at the present? But maybe we can sort it on the amount of precision/zoom level?) and then pass it to the user? (Changes in platform-client)

tuxpiper 2020-11-03

Hi @renujain31 , thank you for your points!

A few observations:

I would then need to ensure that API stores the data as JSON of {latitude, longitude , zoomlevel}

I don't think it's so good to bundle together all that information in a single JSON object.

Presently latitude and longitude are stored in a MySQL POINT column, which is good , because it is a native datatype and allows meaningful indexing. We would lose that by moving them into an arbitrary JSON object.

Also I would need to create a field where the admin can set the maximum and minimum zoom levels permissible?(This part is related to #608?)

I think #608 should be tackled separately for simplicity, but ... yes, I think this would be additional fields in the general settings. I am not sure what is the most user friendly way of doing this though.

Then we need to abstract out some details according to user's permissions and need to sort them in some order(which order is what I don't know at the present? But maybe we can sort it on the amount of precision/zoom level?) and then pass it to the user? (Changes in platform-client)

I am not entirely sure what this is about, what is the intended behavior / user experience with this?

renujain31 2020-11-06

Hi @tuxpiper Thanks for the help. Few concerns that I have:

I don't think it's so good to bundle together all that information in a single JSON object.

Presently latitude and longitude are stored in a MySQL POINT column, which is good , because it is a native datatype and allows meaningful indexing. We would lose that by moving them into an arbitrary JSON object.

But then we would have to change the schema of the SQL table used every time we need to add new values in zoom level? If this is something which can be done without breaking anything, then I think it would be feasible to add a new column for zoom level. It would be great if you could point me to a pull request or a commit where a new column was added to a table, if possible, as it would help me understand what we have to take care of when adding a new field.

An alternate way where we can consider adding JSON can be that we can fix the JSON structure while adding it in the table and ensure that details are being added in that order only. By this way we can also get the meaningful indexing in the JSON.

I think #608 should be tackled separately for simplicity, but ... yes, I think this would be additional fields in the general settings. I am not sure what is the most user friendly way of doing this though.

Oh okay! Would first work on this and then would move on to the other issue.

I am not entirely sure what this is about, what is the intended behavior / user experience with this?

I think you mentioned here that we need to abstract out the details. Also @Erioldoesdesign added this (here)

Also 100% agree this should be hidden for anon settings

Please correct me if I am going in the wrong direction.

tuxpiper 2020-11-09

@renujain31

An alternate way where we can consider adding JSON can be that we can fix the JSON structure while adding it in the table and ensure that details are being added in that order only. By this way we can also get the meaningful indexing in the JSON.

nifty thinking 😁 But it's not really the same sort of indexing. With geographic data types you get spatial indexing, which means that it's really fast to do things like finding which rows have points contained within a given spatial rectangle.

With text-storage based values, we get something like lexicographical indexing, which only understands ordering of letters and numbers
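A small sketch of the difference described above, using made-up coordinate values. The first part shows how text ordering misplaces numbers; the second shows the kind of bounding-box question a spatial index is built to answer quickly (here as a plain linear scan, since this is only an illustration and not how MySQL implements it).

```python
from typing import List, Tuple

# Latitudes stored as text sort by character, not by value.
latitudes = ["9.5", "10.2", "-3.1"]
text_order = sorted(latitudes)               # character-by-character comparison
numeric_order = sorted(latitudes, key=float)  # comparison by numeric value
print("text order:   ", text_order)
print("numeric order:", numeric_order)

Point = Tuple[float, float]  # (latitude, longitude)


def points_in_box(
    points: List[Point], south: float, west: float, north: float, east: float
) -> List[Point]:
    """Linear scan for points inside a rectangle; a spatial index exists
    to answer this without checking every row."""
    return [
        (lat, lon)
        for lat, lon in points
        if south <= lat <= north and west <= lon <= east
    ]


points = [(9.5, 40.0), (10.2, 41.5), (-3.1, 36.8)]
print(points_in_box(points, south=9.0, west=39.0, north=11.0, east=42.0))
```

In text order "10.2" sorts before "9.5", so a lexicographic index cannot support range or rectangle queries on coordinates in any meaningful way.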

But then we would have to change the schema of the SQL table used every time we need to add new values in zoom level?

Mmm no, I don't see why... I just mean to keep the current value column definition in storage, with its current mysql data type.

I'm hoping that the storing of zoom level could be done separately from that column, in a way that doesn't require constant schema changes ... even as other sorts of metadata are considered (i.e. adding source information: specifying whether the coordinates came from GPS, or were selected by the user, or were the result of a geo coding query)

linear[bot] commented 6 months ago

PLAT-5047 Capture Zoom Level (from map) for Reports

Shadrock commented 6 months ago

Woah, did this actually get completed? It would be great to confirm!