streetcomplete / StreetComplete

Easy to use OpenStreetMap editor for Android
https://streetcomplete.app
GNU General Public License v3.0

OSMOSE OpenData merge quests #5481

Open XioNoX opened 9 months ago

XioNoX commented 9 months ago

Hi,

In addition to running consistency checks on the OSM database, OSMOSE handles 3rd party OpenData sources for integration.

For example, as implemented in https://github.com/osm-fr/osmose-backend/pull/2143, here is a map of all the bicycle parkings that are present in my city's OpenData but not in OSM, as well as the ones that are present but could be improved (e.g. missing tags, like capacity): https://osmose.openstreetmap.fr/en/map/#source=448196&zoom=14&lat=48.39667&lon=-4.47183&item=xxxx&level=3

The same data as a table: https://osmose.openstreetmap.fr/en/issues/open?source=448196

Note that this URL filters on source=448196; filtering only on item=8150 would show the data for all sources, and thus all cities.

Additionally, APIs are available (and support bbox filtering). For example, for all the bicycle parking related "issues" in an area: https://osmose.openstreetmap.fr/api/0.3/issues?item=8150&bbox=-4.493634,48.382699,-4.483377,48.390152

A given "issue": https://osmose.openstreetmap.fr/api/0.3/issue/86fe9024-0549-7f06-63e6-ecbd824896cb matching: https://osmose.openstreetmap.fr/en/issue/86fe9024-0549-7f06-63e6-ecbd824896cb

Note that some translations are already in place. Full API doc: http://osmose.openstreetmap.fr/api/docs
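As a minimal sketch of the bbox query above, the request URL can be assembled like this. The helper names `issues_url` and `issue_url` are made up for illustration, the bbox order (min lon, min lat, max lon, max lat) is inferred from the example URL, and the response format is deliberately not assumed here (see the API doc).

```python
from urllib.parse import urlencode

# Base path taken from the example URLs in this issue.
OSMOSE_API = "https://osmose.openstreetmap.fr/api/0.3"


def issues_url(item, bbox):
    """Build the URL listing Osmose issues of one item inside a bounding box.

    bbox is (min_lon, min_lat, max_lon, max_lat); this order is an assumption
    read off the example URL, not something checked against the API doc.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if min_lon > max_lon or min_lat > max_lat:
        raise ValueError("bbox must be (min_lon, min_lat, max_lon, max_lat)")
    # Keep the commas of the bbox value unescaped, as in the example URL.
    query = urlencode(
        {"item": item, "bbox": ",".join(str(v) for v in bbox)}, safe=","
    )
    return f"{OSMOSE_API}/issues?{query}"


def issue_url(uuid):
    """Build the URL of a single Osmose issue."""
    return f"{OSMOSE_API}/issue/{uuid}"


print(issues_url(8150, (-4.493634, 48.382699, -4.483377, 48.390152)))
```

A mobile client would presumably call this with the bbox of the area it already downloads, so only nearby issues are fetched.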

Following up on this use case, it would be extremely convenient to offer StreetComplete users a "yes/no" quest such as "Is there a bicycle parking at this location?". A "yes" would add the node and mark the issue as "resolved" on OSMOSE; a "no" would mark it as "false positive".
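A rough sketch of how such a yes/no answer could map to the two outcomes described above. All names (`OsmoseStatus`, `QuestOutcome`, `answer_quest`) are hypothetical and are not StreetComplete or OSMOSE code; the only tag set is `amenity=bicycle_parking`, and nothing is uploaded anywhere.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OsmoseStatus(Enum):
    """The two outcomes named in the proposal (hypothetical enum)."""
    RESOLVED = "resolved"
    FALSE_POSITIVE = "false positive"


@dataclass
class QuestOutcome:
    issue_id: str
    osmose_status: OsmoseStatus
    new_node: Optional[dict]  # node to create in OSM, or None


def answer_quest(issue_id, lat, lon, is_present):
    """Translate a yes/no answer into a planned OSM edit and Osmose status."""
    if is_present:
        # "Yes": create the node and flag the Osmose issue as resolved.
        node = {"lat": lat, "lon": lon, "tags": {"amenity": "bicycle_parking"}}
        return QuestOutcome(issue_id, OsmoseStatus.RESOLVED, node)
    # "No": nothing is added to OSM, the issue is flagged as a false positive.
    return QuestOutcome(issue_id, OsmoseStatus.FALSE_POSITIVE, None)


outcome = answer_quest("86fe9024-0549-7f06-63e6-ecbd824896cb", 48.39, -4.48, True)
print(outcome.osmose_status.value, outcome.new_node)
```

Keeping the decision separate from any upload code would make it easier to review each item/category independently before enabling it.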

Multiple scenarios on which extra tags to add to the node:

Of course the 3rd option makes it more complex to generically support OSMOSE mergers while keeping it user-friendly. This also raises the question of how to handle the cases where the node is already in OSM but is missing some tags (or has invalid ones). We can filter those with class=4: https://osmose.openstreetmap.fr/fr/issues/open?source=448196&class=4

I'm using bicycle parking as an example as they seem to be easy to work with, and are not just in France (thanks Madrid!): https://osmose.openstreetmap.fr/en/issues/open?item=8150

If StreetComplete were to handle more of those OSMOSE OpenData "merge", each new category (item) would need to be reviewed independently, even though many parts of the automation/UI could be re-used. For a more extensive view, you can filter on the tag "merge": https://osmose.openstreetmap.fr/fr/issues/open?level=1,2,3&source=&class=3&tags=merge&username=&bbox=&limit=500 or with &item=8xxx. Unfortunately, quality varies between the "mergers/integrations".
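To make the filters used in the links above easier to compare, here is a small sketch that rebuilds those listing URLs from a dict of filters. The function name `open_issues_url` is invented for this example, and only parameters that already appear in the URLs of this issue (source, class, item, level, tags) are used.

```python
from urllib.parse import urlencode


def open_issues_url(filters, lang="en"):
    """Build an Osmose "open issues" listing URL from a dict of filters.

    Empty filters are dropped, so {"source": ""} does not end up in the URL.
    """
    params = {k: v for k, v in filters.items() if v not in (None, "")}
    # Keep commas readable, e.g. level=1,2,3 as in the original link.
    query = urlencode(params, safe=",")
    return f"https://osmose.openstreetmap.fr/{lang}/issues/open?{query}"


# Already in OSM but missing tags, for one source:
print(open_issues_url({"source": 448196, "class": 4}, lang="fr"))
# All bicycle parking issues, all sources:
print(open_issues_url({"item": 8150}))
# Everything tagged "merge":
print(open_issues_url({"level": "1,2,3", "class": 3, "tags": "merge"}, lang="fr"))
```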

In France, postboxes could be another good candidate : https://osmose.openstreetmap.fr/en/issue/c50fb129-d484-17ce-49c7-b49d73b4051e

What do you think ?

tordans commented 9 months ago

FYI, I think MapRoulette is a great candidate to become our shared source and tool to check lists of OpenData sources. https://github.com/maproulette/maproulette3/issues/1737 tracks the integration in mobile editors.

matkoniecz commented 9 months ago

It is likely that I will integrate something like that, but based on ATP data (I am awaiting the final decision on whether such a project will be funded).

I am not a fan of the idea of working with Osmose, due to some false positives being unfixed for a long time there and some low-quality tagging advice being pushed there.

matkoniecz commented 9 months ago

OSMOSE handles 3rd party OpenData sources for integration.

What is their process for ensuring that only data under a license compatible with OSM ends up there? Where is it documented?

Can you link to the relevant documentation required as documented at https://wiki.openstreetmap.org/wiki/Import/Guidelines ? (I am not entirely sure whether it is needed in this case, but probably yes, and I plan to do it for SC if I end up implementing this feature)

matkoniecz commented 9 months ago

MapRoulette

Note that MapRoulette is unsuitable for cases requiring on-the-ground verification. There is a huge risk that one of the mass clickers will join and mark all entries as verified without doing any verification whatsoever.

Though for bicycle parkings, some areas may have some that are verifiable from aerial imagery (high-quality aerial, no trees or other cover).

matkoniecz commented 9 months ago

If StreetComplete were to handle more of those OSMOSE OpenData "merge", each new category (item) would need to be reviewed independently.

Is this data being reviewed already for quality? If yes, where is it happening and where is this process described?

If not, how are sources with garbage data quality avoided?

tordans commented 9 months ago

Note that MapRoulette is unsuitable for cases requiring on-the-ground verification.

I disagree. MapRoulette is a technical system that allows one to work through a list of tasks. It is the responsibility of the person who creates those tasks to build and word them in a way that works for the intended use case. It is absolutely possible to use it in a mobile editor and a mobile context to work on a hyperlocal dataset with ground verification.

XioNoX commented 9 months ago

Thanks for your quick replies! I don't know enough about MapRoulette to comment; however, after exploring OSMOSE it seemed like a great match for the reasons listed previously. It might also make sense to keep this ticket focused on OSMOSE.

but based on ATP data

What is that ?

I am not a fan of the idea of working with Osmose, due to some false positives being unfixed for a long time there and some low-quality tagging advice being pushed there.

I think we need to decouple OSMOSE the platform from the various QA rules and merge data. I'm absolutely not advocating for displaying all the OSMOSE "issues" in StreetComplete, only a few, after a thorough review.

What is their process for ensuring that only data under a license compatible with OSM ends up there? Where is it documented?

My understanding is that it's done during the code review when ingesting a new data source, see for example the doc on https://github.com/osm-fr/osmose-backend/blob/main/doc/4-Merge.md#opendata-set-source On one side, it uses the attribution field; on the other, the source's license is reviewed manually. I'd assume that it's a solved issue as OSMOSE is exclusively used for OSM.

Is this data being reviewed already for quality? If yes, where is it happening and where is this process described? If not, how are sources with garbage data quality avoided?

From my current (limited) experience, during the code review, and before the code is merged, a GeoJSON of the "issues" is generated and manually reviewed to make sure the output is correct and no issues are present in the code. So they're really added on a case-by-case basis. I'm of the opinion that it would be important to re-check the data before adding it to StreetComplete on an "item/category" basis to make sure the garbage levels are at a minimum.

frodrigo commented 9 months ago

FYI, I think MapRoulette is a great candidate to become our shared source and tool to check lists of OpenData sources. https://github.com/maproulette/maproulette3/issues/1737 tracks the integration in mobile editors.

Osmose is not just about the list of objects to be checked, but also about the process of downloading/updating the OpenData set, mapping the properties and values to OSM tags, running the conflation every day and showing the results.

Osmose also provides an export to MapRoulette. Challenges can be created from Osmose. The two tools are complementary.

I am not a fan of the idea of working with Osmose, due to some false positives being unfixed for a long time there and some low-quality tagging advice being pushed there.

As I already told you, please report issues. Other Osmose contributors and I really pay attention to the tags suggested to contributors. Even if contributors are responsible for what they push to OSM, Osmose may introduce errors or bias, and we try to avoid this as much as possible.

What is their process for ensuring that only data under a license compatible with OSM ends up there? Where is it documented?

Sure, each dataset used is compliant with the OSM licence.

Is this data being reviewed already for quality? If yes, where is it happening and where is this process described?

Yes, the data configured in Osmose is always checked for quality.

Can you link to the relevant documentation required as documented at https://wiki.openstreetmap.org/wiki/Import/Guidelines ? (I am not entirely sure whether it is needed in this case, but probably yes, and I plan to do it for SC if I end up implementing this feature)

That is a real question: some local communities say it is OK, others the opposite. It also depends on whether the data was already imported, with the Osmose OpenData merge used just for gardening.

matkoniecz commented 9 months ago

ATP

What is that ?

https://github.com/alltheplaces/alltheplaces

Sure, each dataset used is compliant with the OSM licence.

Yes, the data configured in Osmose is always checked for quality.

Do you know where this review is happening? Is the outcome/record publicly accessible somewhere?

If not, do you know where the list of resources being imported is published?

I would be really happy to use such datasets, but I would not use them blindly and would at least verify that they are safe to use. I feel responsible for how the code I write or deploy will be used (disclaimer: if a user is malicious or careless then damage is possible, and I would not feel responsible for it at all, and would refer such a case to the DWG for blocking).

So I would want to be sure that for example there is no CC-BY-SA data without waiver there and so on.

As I already told you, please report issues. Other Osmose contributors and I really pay attention to the tags suggested to contributors. Even if contributors are responsible for what they push to OSM, Osmose may introduce errors or bias, and we try to avoid this as much as possible.

I reported some (and many were fixed, thanks!). Though https://github.com/osm-fr/osmose-backend/issues/381 https://github.com/osm-fr/osmose-backend/issues/1094 https://github.com/osm-fr/osmose-backend/issues/1152 have been waiting for quite a long time now, and for example https://github.com/osm-fr/osmose-backend/issues/1159 got wontfixed.

I want to note that it is still one of the better track records for software as far as my bug reporting goes, so maybe I have overly high expectations.

But as it is now, I would definitely not treat Osmose advice as a good idea by default. I worry that the same can apply to the datasets being suggested.

Osmose has some outright wrong advice, and applying some of the changes it suggests is a monumental waste of human time. And if some reports are not expected to be fixed manually, then they should be in a separate category; at least that is what I do with https://matkoniecz.github.io/OSM-wikipedia-tag-validator-reports/Deutschland.html / https://matkoniecz.github.io/OSM-wikipedia-tag-validator-reports/Deutschland%20-%20obvious.html where humans are shown only cases where human review is worth doing. Meanwhile, a large part of Osmose looks like a request to manually do a bot edit.

That is a real question: some local communities say it is OK, others the opposite. It also depends on whether the data was already imported, with the Osmose OpenData merge used just for gardening.

If deployed with StreetComplete on a global scale, it almost certainly would need to be done, probably per dataset. Or at least per group.

I hope that with ATP it can be done in general due to the same methodology, but maybe there, too, the community will expect a separate import process for each dataset.

frodrigo commented 9 months ago

There are more than 100 OpenData datasets configured in Osmose. But except for a few (maybe fewer than 5), they are all in France. In France there is no incompatible opendata dataset license with OSM (by law).

In Spain there is CC-BY-SA, but with explicit agreement for OSM (and I do not have others in mind).

The only global dataset is Mapillary imagery detection.

All OpenData sources are listed in configuration/Python files (analysers/analyser_merge_*.py).
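For anyone wanting to review that list, a minimal sketch of enumerating those files in a local checkout of osmose-backend could look like this. The function name `list_merge_analysers` and the checkout path are placeholders, and the `analyser_merge_*.py` naming is an assumption about how the merge analysers are named in the `analysers/` directory.

```python
from pathlib import Path


def list_merge_analysers(checkout_dir, pattern="analyser_merge_*.py"):
    """Return the sorted file names of merge analysers in a local checkout.

    checkout_dir is the root of a local clone; the default pattern is an
    assumption about the file naming and can be overridden.
    """
    analysers = Path(checkout_dir) / "analysers"
    return sorted(p.name for p in analysers.glob(pattern))


if __name__ == "__main__":
    # "osmose-backend" is a placeholder path to a local clone.
    for name in list_merge_analysers("osmose-backend"):
        print(name)
```

Each file found this way would still need to be opened to check its declared source and license, which is the per-dataset review asked about above.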

mnalis commented 9 months ago

For those interested, the SCEE ("Expert Edition") fork of StreetComplete has a quest for OSMOSE which might be useful:

[Screenshot: small_Screenshot_20240211_230413_SCEE]

IIRC, it only displays the OSMOSE quests and allows the user to mark them as false positive, or edit raw OSM tags manually (or of course use other SCEE functionality, like Add POI to add missing nodes).

u6aab commented 7 months ago

If open data merge requests are used, it should be done in a way where mappers can set the location themselves (like with the POI overlay). I've seen plenty of Osmose bugs for missing fire hydrants in Vespucci, and they were never in the correct place.

frodrigo commented 7 months ago

If open data merge requests are used, it should be done in a way where mappers can set the location themselves (like with the POI overlay). I've seen plenty of Osmose bugs for missing fire hydrants in Vespucci, and they were never in the correct place.

Yes. In Osmose we include OpenData from good sources only. But the idea in Osmose is to require contributors to review the location and tags (in contrast, an initial import is more effective).

matkoniecz commented 3 months ago

In France there is no incompatible opendata dataset license with OSM (by law).

Do you mean that open data releases by the government have requirements that make them always usable in OSM as far as the license goes?

But still, even that does not mean that all releases describing themselves as open data are usable in OSM. Some have (typically accidental) traps of various kinds, even if all involved are well-meaning.

matkoniecz commented 3 months ago

My understanding is that it's done during the code review when ingesting a new data source, see for example the doc on https://github.com/osm-fr/osmose-backend/blob/main/doc/4-Merge.md#opendata-set-source

@XioNoX do you know where this file is now residing?

matkoniecz commented 3 months ago

Is this data being reviewed already for quality? If yes, where is it happening and where is this process described?

Yes, the data configured in Osmose is always checked for quality.

I went looking at https://github.com/osm-fr/osmose-backend/tree/dev/analysers for a random recent one and found https://github.com/osm-fr/osmose-backend/commit/c0c9c7806506289dd69528d3b8fd6475ebc46d83 linking back to https://forum.openstreetmap.fr/t/la-communaute-dagglo-niort-agglo-vient-de-publier-les-emplacements-des-parkings-a-velos/10765

Do you know whether the LWG ever reviewed Licence Ouverte for ODbL compatibility? Note that seemingly compatible and open licences like CC BY 4.0 have subtle incompatibilities and require a waiver.

XioNoX commented 2 months ago

@XioNoX do you know where this file is now residing?

Looks like it's over there now: https://github.com/osm-fr/osmose-backend/blob/dev/doc/4-Merge.md

Seems like the "Licence Ouverte" is compatible with OSM as long as you set the source in the changeset: https://wiki.openstreetmap.org/wiki/FR:OpenData

matkoniecz commented 1 month ago

@frodrigo can you reply to https://github.com/streetcomplete/StreetComplete/issues/5481#issuecomment-2308898329 ( and maybe also https://github.com/streetcomplete/StreetComplete/issues/5481#issuecomment-2308894219 ?)

Has anyone asked the LWG to review Licence Ouverte? Note that seemingly compatible and open licences like CC BY 4.0 have subtle incompatibilities and require a waiver.

frodrigo commented 1 month ago

@matkoniecz I will not argue about that here myself. Report to the wiki, the French community mailing list and the forum.