Open XioNoX opened 9 months ago
FYI, I think MapRoulette is a great candidate to become our shared source and tool to check off lists of OpenData sources. https://github.com/maproulette/maproulette3/issues/1737 tracks the integration in mobile editors.
It is likely that I will integrate something like that, but based on ATP data (I am awaiting a final decision on whether such a project will be funded).
I am not a fan of the idea of working with Osmose, because some false positives have stayed unfixed there for a long time and some low-quality tagging advice is being pushed there.
OSMOSE handles 3rd party OpenData sources for integration.
What is their process for ensuring that only data under a license compatible with OSM ends up there? Where is it documented?
Can you link to the relevant documentation required by https://wiki.openstreetmap.org/wiki/Import/Guidelines ? (I am not entirely sure whether it is needed in this case, but probably yes, and I plan to do it for SC if I end up implementing this feature.)
MapRoulette
Note that MapRoulette is unsuitable for cases requiring on-the-ground verification. There is a huge risk that one of the mass clickers will join and mark all entries as verified without doing any verification whatsoever.
Though for bicycle parking, some areas may have entries that are verifiable from aerial imagery (high-quality imagery, no trees or other cover).
If StreetComplete were to handle more of those OSMOSE OpenData "merge", each new category (item) would need to be reviewed independently.
Is this data already being reviewed for quality? If yes, where is it happening and where is this process described?
If not, how are sources with garbage data quality avoided?
Note that MapRoulette is unsuitable for cases requiring on-the-ground verification.
I disagree. MapRoulette is a technical system that allows one to work through a list of tasks. It is the responsibility of the person who creates those tasks to build and word them in a way that works for the intended use case. It is absolutely possible to use it in a mobile editor and a mobile context to work on a hyper-local dataset with ground verification.
Thanks for your quick replies! I don't know enough about MapRoulette to comment; however, after exploring OSMOSE it seemed like a great match for the reasons listed previously. It might also make sense to keep this ticket focused on OSMOSE.
but based on ATP data
What is that ?
I am not a fan of the idea of working with Osmose, because some false positives have stayed unfixed there for a long time and some low-quality tagging advice is being pushed there.
I think we need to decouple OSMOSE the platform from the various QA rules and merge data. I'm absolutely not advocating for displaying all the OSMOSE "issues" in StreetComplete, only a few, after a thorough review.
What is their process for ensuring that only data under a license compatible with OSM ends up there? Where is it documented?
My understanding is that it's done during the code review when ingesting a new data source, see for example the doc on https://github.com/osm-fr/osmose-backend/blob/main/doc/4-Merge.md#opendata-set-source
On one side using the `attribution` field, and on the other manually reviewing the source's license.
I'd assume that it's a solved issue, as OSMOSE is exclusively used for OSM.
Is this data already being reviewed for quality? If yes, where is it happening and where is this process described? If not, how are sources with garbage data quality avoided?
From my current (limited) experience, during the code review and before the code is merged, a GeoJSON of the "issues" is generated and manually reviewed to make sure the output is correct and there are no problems in the code. So they're really added on a case-by-case basis. I'm of the opinion that it would be important to re-check the data before adding it to StreetComplete, on a per "item/category" basis, to make sure the garbage levels are at a minimum.
FYI, I think MapRoulette is a great candidate to become our shared source and tool to check off lists of OpenData sources. https://github.com/maproulette/maproulette3/issues/1737 tracks the integration in mobile editors.
Osmose is not just about the list of objects to be checked, but also the process of downloading/updating the open data set, mapping the properties and values to OSM tags, running the conflation every day and showing the results.
Osmose also provides an export to MapRoulette. Challenges can be created from Osmose. The two tools are complementary.
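To make the "map properties to OSM tags, then conflate" steps above concrete, here is a minimal sketch. It is not Osmose's actual implementation: the source field name `places`, the 30 m matching radius and the record layout are all assumptions chosen for illustration.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two WGS84 points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dl = math.radians(lon2 - lon1)
    a = math.sin((p2 - p1) / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def map_properties(record):
    """Map (hypothetical) open data fields to OSM tags."""
    tags = {"amenity": "bicycle_parking"}
    if record.get("places") is not None:
        tags["capacity"] = str(record["places"])
    return tags


def conflate(source_records, osm_nodes, max_distance_m=30.0):
    """Split source records into 'missing in OSM' and 'present but improvable'."""
    candidates = [n for n in osm_nodes if n["tags"].get("amenity") == "bicycle_parking"]
    missing, improvable = [], []
    for rec in source_records:
        wanted = map_properties(rec)

        def dist(node):
            return haversine_m(rec["lat"], rec["lon"], node["lat"], node["lon"])

        nearest = min(candidates, key=dist, default=None)
        if nearest is None or dist(nearest) > max_distance_m:
            # No OSM object close enough: report as a missing object.
            missing.append({"lat": rec["lat"], "lon": rec["lon"], "tags": wanted})
            continue
        # An OSM object exists: report only the tags it does not have yet.
        add_tags = {k: v for k, v in wanted.items() if k not in nearest["tags"]}
        if add_tags:
            improvable.append({"osm_id": nearest["id"], "add_tags": add_tags})
    return missing, improvable
```

A real conflation would also need to handle ways and relations, one-to-one matching and conflicting tag values; the sketch only shows why the output naturally falls into "missing" and "could be improved" groups.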
I am not a fan of the idea of working with Osmose, because some false positives have stayed unfixed there for a long time and some low-quality tagging advice is being pushed there.
As I already told you, please report issues. Other Osmose contributors and I really pay attention to the tags that are suggested to contributors. Even if contributors are responsible for what they push to OSM, Osmose may introduce errors or bias, and we try to avoid this as much as possible.
What is their process for ensuring that only data under a license compatible with OSM ends up there? Where is it documented?
Sure, each dataset used is compliant with the OSM licence.
Is this data already being reviewed for quality? If yes, where is it happening and where is this process described?
Yes, the data configured in Osmose is always checked for quality.
Can you link to the relevant documentation required by https://wiki.openstreetmap.org/wiki/Import/Guidelines ? (I am not entirely sure whether it is needed in this case, but probably yes, and I plan to do it for SC if I end up implementing this feature.)
That is a real question: some local communities say it is OK, others the opposite. It also depends on whether the data was already imported or not, with the Osmose OpenData merge used just for gardening.
ATP
What is that ?
https://github.com/alltheplaces/alltheplaces
Sure, each dataset used is compliant with the OSM licence.
Yes, the data configured in Osmose is always checked for quality.
Do you know where this review is happening? Is the outcome/record publicly accessible somewhere?
If not, do you know where the resources being imported are listed?
I would be really happy to use such datasets, but I would not use them blindly and would at least verify that they are safe to use. I feel responsible for how code I write or deploy will be used (disclaimer: if a user is malicious or careless then damage is possible and I would not feel responsible at all for it, and would refer such a case to the DWG for blocking).
So I would want to be sure that, for example, there is no CC-BY-SA data without a waiver there, and so on.
As I already told you, please report issues. Other Osmose contributors and I really pay attention to the tags that are suggested to contributors. Even if contributors are responsible for what they push to OSM, Osmose may introduce errors or bias, and we try to avoid this as much as possible.
I reported some (and many were fixed, thanks!). Though https://github.com/osm-fr/osmose-backend/issues/381 https://github.com/osm-fr/osmose-backend/issues/1094 https://github.com/osm-fr/osmose-backend/issues/1152 have been waiting for quite a long time now, and for example https://github.com/osm-fr/osmose-backend/issues/1159 got wontfixed.
I want to note that it is still one of the better track records for software as far as my bug reporting goes, so maybe I have overly high expectations.
But as it is now I would definitely not treat Osmose advice as a good idea by default. I worry that the same can apply to the datasets being suggested.
Osmose has some outright wrong advice, and applying some of the changes it suggests is a monumental waste of human time. And if some reports are not expected to be fixed manually then they should be in a separate category; at least I do that with https://matkoniecz.github.io/OSM-wikipedia-tag-validator-reports/Deutschland.html / https://matkoniecz.github.io/OSM-wikipedia-tag-validator-reports/Deutschland%20-%20obvious.html where humans are shown only cases where human review is worth doing. Meanwhile, a large part of Osmose looks like a request to manually do a bot edit.
That is a real question: some local communities say it is OK, others the opposite. It also depends on whether the data was already imported or not, with the Osmose OpenData merge used just for gardening.
If deployed with StreetComplete on a global scale, it almost certainly would need to be done, probably per dataset. Or at least per group.
I hope that with ATP it can be done in general thanks to the shared methodology, but maybe there too the community will expect a separate import process for each dataset.
There are more than 100 opendata datasets configured in Osmose. But except for a few (maybe fewer than 5), they are all in France. In France there is no incompatible opendata dataset license with OSM (by law).
In Spain there is CC-BY-SA, but with an explicit agreement for OSM (and I do not have others in mind).
The only global dataset is Mapillary imagery detection.
All open data set sources are listed in the configuration/Python files (`analysers/analyser_merge_*.py`).
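For anyone who wants to audit that list locally, a small sketch along these lines could enumerate the merge analysers in a checkout of osmose-backend. It assumes the files follow an `analyser_merge_*.py` naming pattern under `analysers/`; `list_merge_analysers` is a hypothetical helper, not part of Osmose.

```python
from pathlib import Path


def list_merge_analysers(repo_root):
    """Return the sorted file names of merge analysers in a local checkout.

    Assumes the naming pattern analysers/analyser_merge_*.py.
    """
    analysers_dir = Path(repo_root, "analysers")
    return sorted(p.name for p in analysers_dir.glob("analyser_merge_*.py"))
```

Each returned file would then still need to be opened by hand to check the declared source and its license, which is the review step discussed in this thread.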
For those interested, SCEE "Expert Edition" fork of StreetComplete does have a quest for OSMOSE which might be useful:
IIRC, it only displays the OSMOSE quest and allows the user to mark issues as false positives or edit raw OSM tags manually (or of course use other SCEE functionality, like `Add POI` to add missing nodes).
if open data merge requests are used, it should be done in a way where mappers can set the location themselves (like with the poi overlay), ive seen plenty of osmose bugs for missing fire hydrants in vespucci, and they were never in the correct place.
if open data merge requests are used, it should be done in a way where mappers can set the location themselves (like with the poi overlay), ive seen plenty of osmose bugs for missing fire hydrants in vespucci, and they were never in the correct place.
Yes. In Osmose we include open data from good sources only. But the idea in Osmose is to require contributors to review the location and tags (as opposed to an initial import, which is more effective).
In France there is no incompatible opendata dataset license with OSM (by law).
Do you mean that open data releases by the government have requirements that make them always usable in OSM as far as the license goes?
But still, even that does not mean that all releases describing themselves as open data are usable in OSM. Some have (typically accidental) traps of various kinds, even if all involved are well-meaning.
My understanding is that it's done during the code review when ingesting a new data source, see for example the doc on https://github.com/osm-fr/osmose-backend/blob/main/doc/4-Merge.md#opendata-set-source
@XioNoX do you know where this file is now residing?
Is this data already being reviewed for quality? If yes, where is it happening and where is this process described?
Yes, the data configured in Osmose is always checked for quality.
I went looking at https://github.com/osm-fr/osmose-backend/tree/dev/analysers for a random recent one and found https://github.com/osm-fr/osmose-backend/commit/c0c9c7806506289dd69528d3b8fd6475ebc46d83 linking back to https://forum.openstreetmap.fr/t/la-communaute-dagglo-niort-agglo-vient-de-publier-les-emplacements-des-parkings-a-velos/10765
Do you know whether the LWG ever reviewed Licence Ouverte for ODbL compatibility? Note that seemingly compatible and open licences like CC BY 4.0 have subtle incompatibilities and require a waiver.
@XioNoX do you know where this file is now residing?
Looks like it's over there now: https://github.com/osm-fr/osmose-backend/blob/dev/doc/4-Merge.md
Seems like the "License Ouverte" is compatible with OSM as long as you set the source in the changeset: https://wiki.openstreetmap.org/wiki/FR:OpenData
@frodrigo can you reply to https://github.com/streetcomplete/StreetComplete/issues/5481#issuecomment-2308898329 ( and maybe also https://github.com/streetcomplete/StreetComplete/issues/5481#issuecomment-2308894219 ?)
Has anyone asked the LWG to review License Ouverte? Note that seemingly compatible and open licences like CC BY 4.0 have subtle incompatibilities and require a waiver.
@matkoniecz I will not argue about that here myself. Please report it to the wiki, the French community mailing list and the forum.
Hi,
In addition to running consistency checks on the OSM database, OSMOSE handles 3rd party OpenData sources for integration.
For example, implemented in https://github.com/osm-fr/osmose-backend/pull/2143
Here is a map of all the bicycle parking that is present in my city's OpenData but not in OSM, as well as the entries that are present but could be improved (e.g. missing tags, like `capacity`): https://osmose.openstreetmap.fr/en/map/#source=448196&zoom=14&lat=48.39667&lon=-4.47183&item=xxxx&level=3
The same data as a table: https://osmose.openstreetmap.fr/en/issues/open?source=448196
Note that this URL filters on `source=448196`, but filtering only on `item=8150` would show the data for all sources and thus all cities.
Additionally, APIs are available (and support bbox filtering), for example: https://osmose.openstreetmap.fr/api/0.3/issues?item=8150&bbox=-4.493634,48.382699,-4.483377,48.390152 for all the bicycle parking related "issues" in an area.
A given "issue": https://osmose.openstreetmap.fr/api/0.3/issue/86fe9024-0549-7f06-63e6-ecbd824896cb matching: https://osmose.openstreetmap.fr/en/issue/86fe9024-0549-7f06-63e6-ecbd824896cb
Note that there are already some translations in.
Full API doc: http://osmose.openstreetmap.fr/api/docs
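As a rough illustration of how a client could build those API URLs, here is a small sketch. The helper names are hypothetical, and the bbox order (min lon, min lat, max lon, max lat) is only inferred from the example URL above, so check it against the API doc before relying on it.

```python
from urllib.parse import urlencode

OSMOSE_API = "https://osmose.openstreetmap.fr/api/0.3"


def issues_url(item, bbox):
    """Build the 'issues' URL for one item within a bounding box.

    bbox is assumed to be (min_lon, min_lat, max_lon, max_lat).
    """
    query = urlencode(
        {"item": item, "bbox": ",".join(str(v) for v in bbox)},
        safe=",",  # keep the commas in the bbox readable
    )
    return f"{OSMOSE_API}/issues?{query}"


def issue_url(uuid):
    """Build the URL of a single issue from its UUID."""
    return f"{OSMOSE_API}/issue/{uuid}"
```

For instance, `issues_url(8150, (-4.493634, 48.382699, -4.483377, 48.390152))` reproduces the bicycle parking example URL quoted above.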
Following up on this use case, it would be extremely convenient to offer StreetComplete users a "yes/no" quest such as "Is there a bicycle parking at this location?" A `yes` would add the node and mark it as "resolved" on OSMOSE; a `no` would mark it as "false positive".
Multiple scenarios on which extra tags to add to the node:
`amenity=bicycle_parking` and leaves it to the following quests to add the capacity or if it's covered
Of course the 3rd option makes it more complex to generically support OSMOSE mergers while keeping it user-friendly.
This also raises the question of how to handle the cases where the node is already in OSM but is missing some tags (or they're invalid). We can filter those with `class=4`: https://osmose.openstreetmap.fr/fr/issues/open?source=448196&class=4
I'm using bicycle parking as an example as it seems easy to work with, and is not just in France (thanks Madrid!): https://osmose.openstreetmap.fr/en/issues/open?item=8150
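The yes/no flow proposed above could be sketched as a pure decision function. This is only an illustration in Python rather than StreetComplete's actual Kotlin quest code; `plan_quest_answer` and the returned field names are hypothetical, and how the status is actually reported back to OSMOSE is left out.

```python
def plan_quest_answer(answer, issue_uuid):
    """Translate a quest answer into an OSM edit plus an OSMOSE status.

    Only the minimal tag is added on "yes"; capacity, covered, etc. are
    left to follow-up quests, as in the minimal scenario described above.
    """
    if answer == "yes":
        return {
            "issue": issue_uuid,
            "osm_action": "create_node",
            "tags": {"amenity": "bicycle_parking"},
            "osmose_status": "resolved",
        }
    if answer == "no":
        return {
            "issue": issue_uuid,
            "osm_action": None,  # nothing is written to OSM
            "tags": {},
            "osmose_status": "false positive",
        }
    raise ValueError(f"unsupported answer: {answer!r}")
```

Keeping the decision separate from the network calls would make it easy to review each new OSMOSE item independently before enabling it.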
If StreetComplete were to handle more of those OSMOSE OpenData "merge", each new category (item) would need to be reviewed independently. Even though many parts of the automation/UI could be re-used. To have a more extensive view, you can filter on tag "merge" https://osmose.openstreetmap.fr/fr/issues/open?level=1,2,3&source=&class=3&tags=merge&username=&bbox=&limit=500 or with `&item=8xxx`; unfortunately quality varies between the "mergers/integrations".
In France, postboxes could be another good candidate: https://osmose.openstreetmap.fr/en/issue/c50fb129-d484-17ce-49c7-b49d73b4051e
What do you think ?