openstreetmap / operations

OSMF Operations Working Group issue tracking
https://operations.osmfoundation.org/

Remove Wikibase extension from all OSM wikis #764

Open Firefishy opened 1 year ago

Firefishy commented 1 year ago

The Wikibase extension causes us endless compatibility issues with Mediawiki version.

It is overly complex to install and requires specific knowledge to manage.

I propose we remove the extension and restore the wiki back to standard functionality as best as possible.

not-my-profile commented 1 year ago

I think the two main data consumers of the OSM wiki wikibase data are the iD editor and Sophox. The iD editor displays descriptions and images from the data items as follows (when you click on the i):

image

and the :pen: links to the respective data item (in this case Q98). Sophox on the other hand has apparently not been importing updates to the data items in nearly 1.5 years, see https://github.com/Sophox/sophox/issues/27. Note that taginfo does not use the Wikibase data at all (and was not planning on supporting it either), see https://github.com/taginfo/taginfo/issues/248.

While I am a big fan of structured data and think that we should embrace it as much as possible for the documentation of our tagging conventions, I do think that Wikibase is a poor fit for our use case because it requires data items to have numeric IDs ... which does make sense for Wikidata but not for tags because keys and values are already unique identifiers. Unfortunately Wikibase really seems to be primarily targeted at Wikidata. Another example being that we cannot disable the label fields for data items and they're often mistakenly translated as well.

Lastly the current state of the data items is quite a mess because there are absolutely no mechanisms in place to synchronize template data with the data items ... it's all done manually. And when you create a new page on the wiki there's no data item created for the page so many tag pages don't even have data items.

So with these caveats in mind I guess it makes sense to abandon Wikibase if it is a maintenance burden. I think we'd definitely need to migrate the tag description translations somewhere else ... IMHO regular wiki pages are definitely not an option ... we'd need some service with a similarly userfriendly translation UI.

I think ideally we could switch to some Wikibase alternative that better fit our use case but I don't think there is any.

mmd-osm commented 1 year ago

there are absolutely no mechanisms in place to synchronize template data with the data items

IIRC, @nyurik used a bot called Yurikbot to do this kind of synchronization, but for some reason it's no longer active: https://wiki.openstreetmap.org/wiki/User_talk:Yurik#Yurikbot_2

AndrewHain commented 1 year ago

I would strongly be opposed to losing the knowledge base built up in the data items. Despite being hampered by opposition from one prolific and vocal wiki contributor and a loss of interest from Yuri, they are in better shape than the difficult-to-maintain arguments to the wiki templates. I also used data items effectively to stop the wiki renderer from choking.

If Wikibase is too hard to manage the most sensible alternative would be to remove tag documentation from the wiki entirely and move it to a new custom website.

SomeoneElseOSM commented 1 year ago

I would strongly be opposed to losing the knowledge base built up in the data items.

@AndrewHain Can you explain what this knowledge base is and how people can use it, to add to what @not-my-profile wrote above? This probably isn't best done here, but an OSM diary entry might be a good place for it.

Firefishy commented 1 year ago

I appreciate the extra detail being provided.

Firefishy commented 1 year ago

This ticket should not be considered a 100% decision point... It is more a cry of pain needing a solution 😜

1ec5 commented 1 year ago

The Wikibase extension causes us endless compatibility issues with Mediawiki version.

For the benefit of those coming into this discussion, can you elaborate on the specific issues you anticipate going forward if we keep Wikibase around? Are these compatibility issues with MediaWiki itself or with other extensions we have installed? Do we have a staging environment to test our MediaWiki configuration with Wikibase before deployment, to catch issues before they disrupt ordinary wiki usage?

I think the two main data consumers of the OSM wiki wikibase data are the iD editor and Sophox.

In addition, the wiki itself depends on data items in various ways. For example, when you search for a key that doesn’t have an article yet or land on a 404 page for that key, the wiki displays an infobox synthesized from that key’s data item. If it’s a compound key, the 404 page also includes a breakdown of the key based on the data item of each component.

This functionality was added in response to concern about having to maintain articles in each language about arbitrary combinations of key components. An alternative would be to populate these articles using a bot, but the descriptions would have to be maintained somewhere more machine-readable than infoboxes in wikitext, essentially reinventing data items.
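The compound-key breakdown described above can be pictured with a small sketch. It assumes a compound key is split on `:` and each component is looked up separately; the function name and the sample descriptions are hypothetical, and the wiki's actual implementation is not shown in this thread.

```python
def break_down_key(key: str, descriptions: dict) -> list:
    """Split a compound key on ':' and pair each component with its description.

    Components without a known description are paired with None.
    """
    return [(part, descriptions.get(part)) for part in key.split(":")]


# Hypothetical descriptions standing in for per-component data items.
component_descriptions = {
    "parking": "facility for parking vehicles",
    "lane": "a lane of a road",
}

for part, text in break_down_key("parking:lane:both", component_descriptions):
    print(f"{part}: {text or '(no data item)'}")
```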

iD consumes data items through the MediaWiki API. It’s a public API, so we don’t know for sure what other software (QA tools?) rely on it for descriptions, images, or statements about which element types are valid for a given tag. A breaking change of this magnitude needs to be discussed broadly, in a similar manner as if taginfo or Nominatim were to introduce a breaking change in its API.
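To make this kind of consumption concrete, here is a minimal sketch of reading a data item's description with the standard Wikibase `wbgetentities` API action. The endpoint path, the helper names, and the sample response are assumptions for illustration; this is not iD's actual code.

```python
from urllib.parse import urlencode

# Assumed endpoint; adjust if the wiki's API lives elsewhere.
API_URL = "https://wiki.openstreetmap.org/w/api.php"


def build_entity_url(item_id: str, language: str = "en") -> str:
    """Build a wbgetentities request URL for one data item."""
    params = {
        "action": "wbgetentities",
        "ids": item_id,
        "props": "labels|descriptions",
        "languages": language,
        "format": "json",
    }
    return f"{API_URL}?{urlencode(params)}"


def extract_description(response: dict, item_id: str, language: str = "en"):
    """Return the description text for item_id, or None if it is absent."""
    entity = response.get("entities", {}).get(item_id, {})
    description = entity.get("descriptions", {}).get(language)
    return description["value"] if description else None


# Hand-written sample shaped like a wbgetentities response (not real wiki data).
sample = {
    "entities": {
        "Q98": {
            "id": "Q98",
            "descriptions": {"en": {"language": "en", "value": "example description"}},
        }
    }
}

print(build_entity_url("Q98"))
print(extract_description(sample, "Q98"))
```

Any tool that parses responses of this shape would break if the data items moved or disappeared, which is why the change counts as a breaking API change.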

Sophox on the other hand has apparently not been importing updates to the data items in nearly 1.5 years, see https://github.com/Sophox/sophox/issues/27.

This is inaccurate. Sophox/sophox#27 is tracking a bug that Sophox has been omitting many OSM elements, mostly nodes. It is also behind on ingesting OSM planet data overall. However, it’s up-to-date with respect to data items, modulo a potential problem with dropping older data items: Sophox/sophox#31. I don’t think this seeming regression should be a major factor in taking an irreversible step like discontinuing data items.

I think ideally we could switch to some Wikibase alternative that better fit our use case but I don't think there is any.

The main alternative is Semantic MediaWiki, but introducing it would make Wikibase feel like a walk in the park, not only for @Firefishy but also for wiki contributors and data consumers.

Can you explain what this knowledge base is and how people can use it

https://wiki.openstreetmap.org/wiki/Data_items is a good starting point. If you find anything missing there, please ask on the talk page.

flacombe commented 1 year ago

Dear all,

I second comments against losing the knowledge and practices that came with wikibase. All we need is to improve it, not abandon it.

Lastly the current state of the data items is quite a mess because there are absolutely no mechanisms in place to synchronize template data with the data items ... it's all done manually

And there shouldn't be any. Tags structured data should be put once in data items with no redundancy in wiki. Redundancy is a temporary bridge, waiting for a more robust integration. It seems this has lasted long enough to become an argument for abandoning the whole architecture, which is not good. Let's finish the work before challenging it in the middle of the journey.

The UX of the Wikibase editor has been criticized for years, with little involvement to make it better. This is huge work, but: How can we achieve moving the whole documentation on a brand new system (we don't know yet) OSM community will manage on its own, if we're unable to improve existing and common tools?

tomhughes commented 1 year ago

The problem with wikibase is basically as Firefishy said, that it requires significant configuration to make it work and that configuration frequently changes and is poorly documented.

As far as I know very few people (other than wikidata obviously) actually run wikibase and as such it's not well tested unless you happen to be running the exact same bleeding edge version as wikidata - also they run in a separate mediawiki instance while we try and run it all in the one instance.

The result is that it frequently breaks when we upgrade mediawiki and I have to ask Yuri for help and he has to ask the wikidata people and they mostly shrug and say they have no idea if the release version we're using will work and it's all just a shit show.

not-my-profile commented 1 year ago

https://github.com/Sophox/sophox/issues/27 is tracking a bug that Sophox has been omitting many OSM elements

Ah yeah my bad ... I got confused by the different Sophox issues.

However, it’s up-to-date with respect to data items, modulo a potential problem with dropping older data items:

I do not think that it's up to date with regards to data items. E.g. this query yields nothing despite the data item existing for 11 days.

The main alternative is Semantic MediaWiki,

I do have experience with SMW and am certain that it's not an improvement over Wikibase.

All we need is to improve it, not abandon it.

I don't think that we can address the limitations of the Wikibase software (such as numeric ids being required).

Tags structured data should be put once in data items with no redundancy in wiki.

This does not work because taginfo does not support it and the taginfo maintainer is not interested in supporting it. Hence the redundancy. And I can understand their point because the current data item situation really is a mess.

How can we achieve moving the whole documentation on a brand new system (we don't know yet) OSM community will manage on its own, if we're unable to improve existing and common tools?

Migrating data to a compatible system would be easy ... it's just that there is no such suitable system afaik. Some system written specifically with OSM in mind could work much better for us. You cannot adapt a system if your use case is not in the scope of the project ... unless you fork the project but Wikibase is way too complicated to be forked.

Even if we do not see a solution to the situation right now, I think it's good for us to have this discussion.

flacombe commented 1 year ago

I got the point about maintenance and updates. I'll try to forward it to Wikimedia people we know at OSM France chapter.

As far as I know very few people (other than wikidata obviously) actually run wikibase

Wikidata + OSM is still twice as many instances as OSM alone on custom software.

I don't think that we can address the limitations of the Wikibase software (such as numeric ids being required).

How is the numeric id a limitation?

Can't the items be reached through P19 https://wiki.openstreetmap.org/wiki/Property:P19 ?

This does not work because taginfo does not support it and the taginfo maintainer is not interested in supporting it. Hence the redundancy. And I can understand their point because the current data item situation really is a mess.

How could that change if we don't manage to change ourselves?

Some system written specifically with OSM in mind could work much better for us

Shouldn't we address needs on core osm.org website or API instead? No one else could provide our API or core software, but it would be clever to take advantage of Wikipedia's experience.

You cannot adapt a system if your use case is not in the scope of the project ...

Come on, can't we talk to be part of this scope?

not-my-profile commented 1 year ago

How is the numeric id a limitation?

We do not need numeric ids but we have to use them ... that's a limitation.

Can't the items be reached through P19?

Yes of course but it's very much an unnecessary layer of indirection / source of confusion/inconvenience.
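The indirection in question can be sketched as follows. A consumer that starts from a tag string first has to resolve it to a numeric item ID through the P19 statement before it can read anything else. The Q-numbers and P19 values below are made up for illustration, and the flat `items` structure is a simplification, not the real entity format.

```python
# Made-up items: the Q-numbers and P19 values below are illustrative only.
items = {
    "Q1001": {"P19": "amenity=bench"},
    "Q1002": {"P19": "highway=residential"},
}


def index_by_property(entities: dict, property_id: str) -> dict:
    """Map each item's value for property_id back to its item ID."""
    return {
        claims[property_id]: item_id
        for item_id, claims in entities.items()
        if property_id in claims
    }


tag_to_item = index_by_property(items, "P19")

# The tag is already a unique identifier, yet it must be resolved to a Q-number.
print(tag_to_item["amenity=bench"])
```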

Shouldn't we address needs on core osm.org website or API instead?

This is not an either or scenario. I do consider documentation to be a very important topic.

Come on, can't we talk to be part of this scope?

Have a look at T202676, where a comparatively very small (but useful) change was rejected. Numeric ids on the other hand are fundamental to Wikibase, this is not something that can be easily changed and I am very certain that this won't change ... so this is not really a talking matter.

fititnt commented 1 year ago

@tomhughes @Firefishy Is there any technical reason why the central Wikibase could not live in a dedicated place, let's say, https://base.openstreetmap.org ? I'm not attached to the name; the point is mostly to simplify infra, because it is possible for the two not to be the same instance.

This doesn't solve 100% of the obvious challenge (which I'm already aware of from the issues with upgrading the OpenStreetMap Wiki; others please read at least this https://github.com/openstreetmap/operations/issues/760#issuecomment-1280802385) however even Wikipedia has a dedicated Wikibase (https://www.wikidata.org/) installation to isolate the chaos from all the wikis. This somewhat means that the OSM use of Wikibase is actually likely to be harder to maintain (including backup size, scalability issues, etc) than Wikipedia's approach.

And like I wrote on the Wiki:Talk, I agree that this issue should be raised as affecting the smoother upgrades on the Wiki.

not-my-profile commented 1 year ago

Wikipedia has a dedicated Wikibase instance because there are many different Wikipedia instances (e.g. en.wikipedia.org, de.wikipedia.org, es.wikipedia.org etc.). I am pretty sure that moving the Wikibase instance of the OSM to a different domain would not solve or improve anything. The wikibase client extension would still need to be kept in sync with the wikibase repository extension on the different instance. Even worse this would mean you have different authentication sessions etc ... so definitely no.

1ec5 commented 1 year ago

The result is that it frequently breaks when we upgrade mediawiki and I have to ask Yuri for help and he has to ask the wikidata people and they mostly shrug and say they have no idea if the release version we're using will work and it's all just a shit show.

There is an uphill climb for third-party installations of MediaWiki, but for what it’s worth, Wikimedia’s own sister projects have often faced the same hurdles when it comes to maintaining extensions that Wikipedia doesn’t use.[^sister] Local chapters such as Wikimedia Deutschland and Wikimedia Italia have been instrumental to supporting non-Wikipedia use cases, so I applaud @flacombe’s outreach to Wikimedia France.

Perhaps the OSMF should explore joining the Wikibase Stakeholder Group, which sponsors improvements to Wikibase to make it more reusable independently of WMF infrastructure. Judging from the member list, we aren’t alone in needing this support structure. In parallel, the Wikibase Community User Group advocates for increased reusability and is open to individual participation.

however even Wikipedia has a dedicated Wikibase (https://www.wikidata.org/) installation to isolate the chaos from all the wikis

The reason Wikidata exists is to centralize data from myriad Wikimedia wikis, not to isolate Wikibase from those wikis. Quite the contrary: all the other wikis have Wikibase Client installed, and Wikimedia Commons has even overlaid another Wikibase instance to track structured data about media files.

Separating Wikibase from the main wiki wouldn’t by itself reduce maintenance overhead. There are hosted solutions like wikibase.cloud, but even if we manage to outsource Wikibase, I think the OSM Wiki would still need to install Wikibase Client and get existing data consumers to transition over to the new instance. Being a client of a separate Wikimedia Commons media repository has been its own headache…

[^sister]: For example, Proofread Page is what keeps the lights on at Wikisource, but at one point it fell into such disrepair that I and some other Wikisource contributors had to rush patches to fix parts of the site. Fortunately, things are much better with this extension these days, but I think the lesson is to consider MediaWiki like any other open-source project depending on volunteers to keep the software operational.

frodrigo commented 1 year ago

Structured, semantic documentation is a next step for the OSM project. But losing data items without a replacement looks to me like a step backward.

There are probably not a lot of users of data items yet. But I personally have many projects that would use them (professional and hobby, like Osmose-QA). What is missing, from my point of view, is a working and up-to-date Sophox instance.

I am already in touch with @nyurik to “reboot” the Sophox instance. It is already on the way.

Removing data items would be a big loss for the future.

Firefishy commented 1 year ago

I'd feel a lot more comfortable if we were to split off wikibase to a separate instance of mediawiki, splitting it from wiki.openstreetmap.org.

nyurik commented 1 year ago

@Firefishy there is very little point in splitting wikibase because it consists of two parts -- server and client, and the client must live in the osm wiki to be of any reasonable use -- which means you would actually increase the number of problems (cross-site integration) rather than keeping it relatively simple. This is exactly how it is done with all other wiki installations that chose to use wikibase. It was also mentioned by a few other people above.

Firefishy commented 1 year ago

@Firefishy there is very little point in splitting wikibase because it consists of two parts -- server and client, and the client must live in the osm wiki to be of any reasonable use -- which means you would actually increase the number of problems (cross-site integration) rather than keeping it relatively simple. This is exactly how it is done with all other wiki installations that chose to use wikibase. It was also mentioned by a few other people above.

I mean, run as a completely separate unconnected instance of mediawiki. No connection between base.osm.org (eg) <-> wiki.osm.org.

lectrician1 commented 1 year ago

@Firefishy did you try upgrading the wiki to 1.39, and did Wikibase produce errors? Is this why this issue exists?

tomhughes commented 1 year ago

We haven't even gone to 1.38 yet (when I looked a week or so back 1.39 wasn't out yet) because I got burnt so badly by the last upgrade that I can't face trying to do it again. We're still on 1.37 for the main wiki.

tomhughes commented 1 year ago

In fact 1.39 is still not out, but we do want to move to it once it is, for PHP 8 support.

matkoniecz commented 1 year ago

And there shouldn't be any. Tags structured data should be put once in data items with no redundancy in wiki.

Note that it should be proposed, discussed and agreed on OSM Wiki before actually doing this.

when you search for a key that doesn’t have an article yet or land on a 404 page for that key, the wiki displays an infobox synthesized from that key’s data item. If it’s a compound key, the 404 page also includes a breakdown of the key based on the data item of each component.

That is quite useful! Though none of the other mentioned uses actually requires data items or Wikibase.

nyurik commented 1 year ago

@tomhughes did the previous migration issue come about because the upgrade was done in production without testing it on the staging servers first? As far as I know, that was the only incident with Wikibase. Or is that a different issue?

tomhughes commented 1 year ago

We don't have a staging server...

The recent incident that happened with the 1.37 upgrade led to https://github.com/openstreetmap/chef/commit/ec0bc46d1275fc39a116736fc40372b0f6784fd0 as my first attempt to fix it and then a few days later after I got you involved https://github.com/openstreetmap/chef/commit/586ac89854fe37ae4d9a0b8dcda6797535cafb49 which was what finally got it working again.

I think there was at least one previous occasion where I had to get your help after an upgrade but I can't recall the details.

matkoniecz commented 1 year ago

If Wikibase is too hard to manage the most sensible alternative would be to remove tag documentation from the wiki entirely and move it to a new custom website.

Wiki worked before data items were added and will continue working (with some very minor functionality missing and some other functionality replaced by other solutions) in case of Wikibase being disabled.

I get that some people invested a lot of effort into data items but they are not irreplaceable. Please do not present OSM Wiki shutdown as a possible consequence. That is simply misleading.

I think we'd definitely need to migrate the tag description translations somewhere else ... IMHO regular wiki pages are definitely not an option ...

Regular wiki pages and preset translations fulfil this need well; data item translations are of very dubious use.

Quality is dubious, existing uses can be easily replaced by extracting data from infoboxes and serving that as an API (maybe there would be a greater delay in updates, which would be worth it for the lower risk of definitions being changed without any explanation or oversight)
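As a rough illustration of what "extracting data from infoboxes" could involve, here is a naive sketch that pulls `|name=value` parameters out of a `KeyDescription` template. The sample wikitext and parameter names are illustrative assumptions, and the regex approach does not handle nested templates or pipes inside links, so a real extractor would need a proper wikitext parser.

```python
import re


def parse_infobox(wikitext: str, template: str = "KeyDescription") -> dict:
    """Naively extract |name=value parameters from the first matching template.

    Does not handle nested templates or pipes inside links.
    """
    pattern = r"\{\{\s*" + re.escape(template) + r"\s*\|(.*?)\}\}"
    match = re.search(pattern, wikitext, re.DOTALL)
    if not match:
        return {}
    params = {}
    for field in match.group(1).split("|"):
        name, sep, value = field.partition("=")
        if sep:  # skip positional fields that have no '='
            params[name.strip()] = value.strip()
    return params


# Illustrative wikitext, not copied from a real wiki page.
sample_wikitext = """{{KeyDescription
|key=example
|description=An illustrative description.
|status=approved
}}
Body text of the page."""

print(parse_infobox(sample_wikitext))
```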

iD consumes data items through the MediaWiki API. It’s a public API, so we don’t know for sure what other software (QA tools?) rely on it for descriptions, images, or statements about which element types are valid for a given tag. A breaking change of this magnitude needs to be discussed broadly, in a similar manner as if taginfo or Nominatim were to introduce a breaking change in its API.

Definitely. I posted info at OSM Wiki to notify interested people ( https://wiki.openstreetmap.org/wiki/Talk:Wiki#Proposal_to_remove_data_item_for_technical_reasons ).

flacombe commented 1 year ago

Note that it should be proposed, discussed and agreed on OSM Wiki before actually doing this.

This was said from a state of the art IT perspective. Every change will be discussed prior to being released.

existing uses can be easily replaced by extracting data from infoboxes and serving that as an API

This doesn't solve the inconsistency in translations issue the wikibase is intended to solve. This is the only place I know of where crawling inconsistent text is preferred over a structured database (Wikibase or another one). The current taginfo already offers this API but fails to restore consistency (which should actually be restored by people with appropriate tools). This is not said to blame the maintainers, who achieve great things, but the messy architecture we collectively encourage.

1ec5 commented 1 year ago

We don't have a staging server...

Upgrading in production without staging first sounds like a deeper problem. Today the obstacles are Wikibase and MultiMaps; tomorrow it may be something else in our configuration unrelated to these extensions. A staging server won’t magically fix extension incompatibilities, but without this important workflow, any issue becomes a fire drill, and the mere threat of such fire drills leads to deletionist proposals.

Regular wiki pages and preset translations fulfil this need well; data item translations are of very dubious use.

How can we be confident that this is not a minority opinion?

Definitely. I posted info at OSM Wiki to notify interested people.

You’re more optimistic than I am that every consumer of the OSM Wiki’s MediaWiki API instance follows the Talk:Wiki page or this repository. Given Hyrum’s law (relevant xkcd), it’s inevitable that something would be broken in any transition of data items off the wiki, no matter whom we contact; it’s only a question of the extent to which we care. But as a courtesy, there probably should be a more visible heads-up on one of the mailing lists, ideally focused on the immediate technical issues – @Firefishy’s plea for help – rather than running a victory lap around data items.

tomhughes commented 1 year ago

Well are we expected to run a second staging server for every service we run?

I don't really think it's practical to get rid of wikidata at this point. I do wish I'd never let Yuri persuade me to add it though.

matkoniecz commented 1 year ago

This doesn't solve the inconsistency in translations issue the wikibase is intended to solve.

Maybe wikibase was intended to solve it, but it is not solving it at all and instead makes it worse (no edit descriptions and ineffective watchlisting make the problem even worse)

structured database

infoboxes are also structured and wikibase is inferior in many ways, not only in how much it is annoying for sysadmins but also in interface quality and broken watchlisting

How can we be confident that this is not a minority opinion?

Not very sure. But in general data items are doomed to have low quality translations due to inferior editing interface (yes, it manages to be worse for data quality than even editing parameters of infobox)

flacombe commented 1 year ago

wikibase is inferior in many ways, not only in how much it is annoying for sysadmins but also in interface quality and broken watchlisting

Glad we didn't get rid of any annoying OSM interface this way before improving it. Rome wasn't built in one day, they're waiting for our pull requests.

fititnt commented 1 year ago

Fact: the Wiki people and the Wikibase people have near irreconcilable points of view about the structure of the same information. Something similar happens on Wikidata vs Wikipedia. Likely no consensus will be achieved on a regular basis.

So another (and maybe actually more relevant) reason to split is not just that the infra would scale better, but that it would allow separate self-governance. What is called "data items" actually are used not just by the wiki, but by others, such as developers who would use it for translations of terms in their interfaces. Also, the OSM Wikibase currently is highly underutilized.

I don't think people agree to give up. So this is less the issue. However, for the Wikibase people here: why would a dedicated place under some OSM domain be bad?

The Wiki can have the client enabled (and in the worst case nginx/Apache can redirect so the URLs still work, but that is the Wiki's decision). However, this change would keep decisions that affect the semantics from being bound to the Wiki's policy. Not even Wikipedia does this. Also, OpenStreetMap Ops have signalled that splitting into different domains could already be an option.

1ec5 commented 1 year ago

I don't think people agree to give up. So this is less the issue. However, for the Wikibase people here: why would a dedicated place under some OSM domain be bad?

I appreciate your effort at finding middle ground. However, Wikibase Client is not guaranteed to be any less of a maintenance burden when upgrading, as it’s designed to be compatible with Wikibase. @Firefishy has floated the idea of not even installing Wikibase Client on the OSM Wiki: https://github.com/openstreetmap/operations/issues/764#issuecomment-1297253138. This is more extreme than Wikipedia, which you cite as precedent. It doesn’t answer the question of how the wiki will continue to source information that it’s currently getting from data items (such as in infoboxes and Map Features).

Jettisoning extensions is not a straightforward long-term strategy for streamlining MediaWiki upgrades, because there are tradeoffs and it won’t stop at Wikibase. (Remember when we accidentally took down the wiki by installing Scribunto?) “Wikibase people” have been willing to acknowledge that Wikibase comes with maintenance overhead, but conversely there should be acknowledgement that there is a legitimate need for what the status quo has been providing, for all its warts.

What we need to move forward is a clear explanation of what will break if we upgrade to MediaWiki 1.38 or 1.39 for #760. This will enable us to identify solutions or seek help from experts. Ideally we’d be running a staging server, but even a local dry run should be able to turn up whatever is causing us lost sleep. Then we can open issues about it in the relevant bug trackers. The energy currently devoted to this issue would be much more effective if directed at the Wikibase developers, once we have something concrete to point to. Our problems aren’t really on their radar yet. Let’s save the blue-sky ideas about a novel replacement until we’ve exhausted our options.

nyurik commented 1 year ago

Thanks @1ec5, I have been thinking along the same lines.

After reading through all the comments, it sounds like the main problem is not the wikibase itself, but devops scaling.

The devops demand has grown -- more software, with some becoming unmaintained -- but the (amazing!) current devops team really needs help, yet doesn't have an organized way to scale up.

I would love to help. I have been in devops with the "keys to the castle" in a 3000+ ppl company for many years, so I do have some experience, but I currently don't see any path for me to help - there is no clear way to on-board new devops volunteers, there is no process to try to set up staging systems so that the upgrade becomes more predictable and peaceful, there is no way to try new approaches -- e.g. dockers / kubernetes / maybe even cloud-based deployments with terraform+helm -- all those things enjoyed by many startups and fortune500s alike.

I am not saying these technologies are what's needed. What's needed is a way to reliably onboard new people, and allow them to be productive. Otherwise something will continue crashing, and relying on just a few dedicated individuals is not what any reasonable long-term effort should do.

dpriskorn commented 1 year ago

wikibase is inferior in many ways, not only in how much it is annoying for sysadmins but also in interface quality and broken watchlisting

Glad we didn't get rid of any annoying OSM interface this way before improving it.

Rome wasn't built in one day, they're waiting for our pull requests.

I agree the Wikibase UI is still lacking now ten years into the project. I would very much encourage this community to clearly report any UX problems we experience. To my knowledge a core problem with Wikibase development is a total lack of usability testing (I have asked them to do it but have not seen anything done yet). We could do a couple of recordings with user testing and report back our findings to WMDE.

GreenReaper commented 1 year ago

Not all third-party installs use a 'combined' Wikibase. It might not be a bad idea to split it into a separate wiki, as at least Repository-specific failures would not directly break the main wiki, and there are ways to reduce the pain (shared tables or CentralAuth); however it might also increase the maintenance workload, and moreover I am not sure it is even practical at this point.

Not only would certain tables need to be moved, the actual entities are stored as JSON documents in the main database tables (with secondary storage in the aforementioned tables for Wikibase's needs). They are at least in their own namespaces, but still - it'd involve slicing up tables, and I don't know that anyone has done it before. It might be easier to just export and import relevant namespaces, with a new WDQS as well. The concept URI would also change, although the external impact of this may be limited if you do not have many external users of such data.

If you do split it, I'd advise staying within the same database cluster, otherwise integration options such as client editing will be limited to options like UnlinkedWikibase - though improving this has been a desire for a while.


As for the question of staging, a separate environment might be a first start, either on the same server (but separate domains and PHP-FPM pools) or perhaps somewhere in the cloud. The free instances on offer from Oracle or Google may be plenty, since you don't need to have it process serious traffic.

angoca commented 1 year ago

My point of view on removing the Data items part from the wiki could affect many people who had worked on this part, correcting many things, improving texts, etc. They will see this removal as if their long working hours were for nothing, and they will probably be discouraged from continuing contributing to the Wiki or even to OSM. The wiki is our knowledge base, and it has grown in different dimensions thanks to the collaboration of many volunteers; we cannot cut one part just because of some technical issues.

As a database expert, I see the values in the KeyDescription and ValueDescription templates in regular wiki pages for the same key/value in different languages as redundant, with many inconsistencies among them. And we know redundancy without synchronization is a bad practice that leads to incorrect values; as a result, mappers cannot know which value is the correct one. I have modified thousands of regular wiki pages (literally) trying to standardize just one element of the tags, the "status," which is different in some languages for many entries. And this is a never-ending task because there is nothing centralized. In this case, what is the purpose of being able to set a "status" value different from the one in English? The status is unique for a key or value, no matter which language the wiki page is written in. And this problem applies to many parameters in KeyDescription and ValueDescription.


I perceive the data items like a normalized database with a single possible value for a specific element. Instead of removing the data item functionality, we should do the opposite and encourage its usage. I know the learning curve is complex, and there are not too many people working on it, but we can write more articles about how to contribute and maintain this part to welcome new volunteers in this area. We also must remember that with the increasing number of new tags and translations, maintaining the regular wiki articles will be even more complicated, and they will be even more disorganized. Instead, if we start to migrate and standardize values in different languages to a unique value in the data item, the maintenance will be easier by keeping up to date with just the English version and the data item.

I say all of this from my perspective as a Spanish speaker who always finds inconsistencies in articles in Spanish. I can read English and French, but this is not the case for all Spanish speakers. Thus, if the documentation is wrong in Spanish, the wiki is transmitting incorrect information to the mappers, and then the map will be affected. Data items could be the solution to keep the documentation updated without too much work.

valeriewmde commented 1 year ago

Hey everyone,

Some days ago, a community member shed light on the current state of OSM and the conversation here on GitHub. I forwarded the message to our team saying that this was something we should give immediate attention to.

We've gathered different perspectives and voices on how to best approach this and came up with some suggestions and ideas.

First of all, we hear you. The upgrade process you went through is indeed frustrating and overly complex. We really appreciate the time you took to voice and discuss your concerns. Your feedback is vital to our development process. As we improve the installation and upgrade process for Wikibase, we’ll definitely keep your observations in mind.

When we look at your situation more closely, we think you would meet with a lot more success and a lot less frustration if you began using Wikibase Suite. Suite is a new product of ours (Wikimedia Deutschland) that makes installing and upgrading Wikibase easier. We do this by providing a package of Wikibase components that play well together. Particularly relevant to OSM’s instance, one can simply grab the extension to use on its own. Unfortunately, when you went to upgrade to Mediawiki 1.37, there was no Suite-equivalent 1.37 version, so you had to do it the hard way. We feel your pain.

Other than that, we’d like to give you some insight into what we’re currently working on and how that relates to your current problems.

Syncing release cycles with Mediawiki:

We’re in the process of syncing our Wikibase Suite release cycle with Mediawiki’s. We recently released a version compatible with 1.37, and we’re in the middle of releasing 1.38, which is planned for Q1 2023.

What’s included in Wikibase Suite:

Wikibase Suite releases contain extra testing, cross-component testing, release notes, upgrade notes, packaging and documentation changes, all of which can be found on mediawiki.org etc. For example, here’s the 1.37 compatible release of Suite announced earlier this month: https://lists.wikimedia.org/hyperkitty/list/wikibaseug@lists.wikimedia.org/thread/RWS6EV7SHFNOD6KKQ6JA7RUW4TEXSXN5/

In particular, the release notes and documentation for Suite releases might fit your current situation: https://gerrit.wikimedia.org/r/plugins/gitiles/mediawiki/extensions/Wikibase/+/refs/heads/REL1_37/RELEASE-NOTES-1.37 (We think this will address most issues identified in https://github.com/openstreetmap/operations/issues/611, but more feedback is very welcome.)

We’re also working on providing more sensible default settings. For example, the tar files created as part of a Suite release also come with submodules and dependencies packaged up. For a Wikibase with default settings, we are almost at the point where you just load the code, add one line to LocalSettings.php and run update.php to have the basic system working.

Beyond that, we’d like to follow up on some questions and comments that came up in this conversation.

@not-my-profile

Lastly the current state of the data items is quite a mess because there are absolutely no mechanisms in place to synchronize template data with the data items ... it's all done manually

We’d love to hear more about this. What exactly is being synced with what here?

@flacombe

broken watchlisting

Please expand on this! We’d like to know more. What exactly is broken?

Our engineering team has already started looking into some of your questions.

At this point, I’d like to hand over to @addshore here, who came up with some solutions:

@tomhughes

configuration frequently changes and is poorly documented

@addshore: Indeed, there are often configuration changes in a release. These are, however, noted in the release notes that are written as part of Suite, so you can trivially grep to see whether you use removed ones, etc. All configuration options should be documented, and if anything is missing or unclear we are happy to improve it: see https://doc.wikimedia.org/Wikibase/master/php/docs_topics_options.html or, for a specific release, https://doc.wikimedia.org/Wikibase/REL1_37/php/md_docs_topics_options.html for example.
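The "grep for removed options" check mentioned above could be sketched roughly as follows. This is only an illustration: the setting names in `REMOVED_SETTINGS` and in the sample are hypothetical placeholders, not taken from any real release notes, and the function `find_removed_settings` is made up for this example. It assumes settings are written as quoted array keys, as in `$wgWBRepoSettings['name']`.

```python
import re

# Hypothetical setting names, for illustration only. The real list would be
# copied by hand from the release notes of the version being upgraded to.
REMOVED_SETTINGS = {"exampleRemovedSetting", "anotherRemovedSetting"}


def find_removed_settings(config_text, removed):
    """Return (line_number, setting_name) pairs for removed settings still in use."""
    hits = []
    for number, line in enumerate(config_text.splitlines(), start=1):
        if line.lstrip().startswith(("#", "//")):
            continue  # skip whole-line PHP comments
        for name in sorted(removed):
            # Match the name only as a quoted array key: ['name'] or ["name"].
            pattern = r"\[\s*['\"]" + re.escape(name) + r"['\"]\s*\]"
            if re.search(pattern, line):
                hits.append((number, name))
    return hits


if __name__ == "__main__":
    # A made-up LocalSettings.php fragment; 'someCurrentSetting' is also hypothetical.
    sample = """<?php
$wgWBRepoSettings['exampleRemovedSetting'] = true;
# $wgWBRepoSettings['anotherRemovedSetting'] = false;
$wgWBRepoSettings['someCurrentSetting'] = 1;
"""
    for number, name in find_removed_settings(sample, REMOVED_SETTINGS):
        print(f"line {number}: {name} is listed as removed")
```

A check like this could run in CI or in a chef pre-deploy step, so that a removed option is flagged before update.php is run rather than after.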

@tomhughes

it's not well tested unless you happen to be running the exact same bleeding edge version as wikidata

@addshore: There are a multitude of tests for common use cases. Suite builds on this with cross component testing.

@tomhughes

also they run in a separate mediawiki instance while we try and run it all in the one instance.

@addshore: I'd recommend against OSM running a separate Wikibase instance and also trying to have it connected to the main OSM wiki. I think other folks in this thread already highlighted some of the reasons this could be a less ideal solution.

We hope that this response clarifies some of your concerns. We’re very much looking forward to continuing the conversation with you. Cheers!

The Wikibase Suite Team

tomhughes commented 1 year ago

We don't currently install from tar files, partly because that is a pain to automate, and partly because we were trying to track point releases automatically without needing a human to go in and bump the version number in chef.

So currently we install from git, but that causes us a lot of additional pain on top of any Wikibase issues, as it's not really supported, so it probably is time to change the way we handle those things.

The plan however, after discussing things with @nyurik on our ops call last week, was to try looking at dockerising things.

I find those release notes interesting for sure, and I can see how they might have helped, though I'm not sure I would have understood their importance. They read as if the PHP script is offering optimisation suggestions, rather than saying that there are significant changes to how configuration works and that you need to run it to translate your current configuration to the new format. Or at least I think that is what that script does?

Also it's useful to tell me federated properties won't work, except that I have no idea what they are or if we use them.

1ec5 commented 1 year ago

Also it's useful to tell me federated properties won't work, except that I have no idea what they are or if we use them.

We aren’t using federated properties. The feature sounds like it would be useful for some sites out there, but it’s mutually exclusive of local properties, which is the whole point of the data items on the OSM Wiki.

GreenReaper commented 1 year ago

Federated Properties are described in detail here.

For what it's worth, significant work was done on a Federated Properties v2 which lets you add properties accessed via the API, rather than having one wiki's properties replace yours (internally, instead of just P1, it has the full URI to the remote entity). In this way you could have your local cake and eat it remotely, too.

There is work to be done on it and I think there was insufficient development bandwidth to complete this and roll out Wikibase.cloud at the same time.

Per comments accompanying the 1.37 release, "a new experimental Federated Properties version will be included in the next Wikibase Suite release, compatible with Mediawiki 1.38."

matkoniecz commented 1 year ago

@valeriewmde

We’d love to hear more about this. What exactly is being synced with what here?

In short: this is an organisational issue caused by interface deficiencies and a dislike of Wikibase (at least among some people), including a critically poor watchlisting UX that results in lower quality.

There is also a split between people who want to migrate to Wikibase (and spend massive effort on it) and people who prefer to avoid it.

Let's take the https://wiki.openstreetmap.org/wiki/Tag:natural%3Dbirds_nest page.

It has properties specified as infobox parameters on the OSM Wiki:

{{ValueDescription
|key=natural
|value=birds_nest
|image=File:Rufous_hornero_(Red_ovenbird)(Furnarius_rufus)_and_nest_(2).JPG
|description= A bird's nest.
|group=natural
|onNode=yes
|onWay=no
|onArea=yes
|onRelation=no
|seeAlso=
* {{Tag|man_made|nesting_site}}
|status=in use
}}

It also has the Q6219 data item on the OSM Wiki (not to be confused with Q6219 on Wikidata).

There were proposals to eliminate the infobox parameters and import all of them from Wikibase, but these were rejected by some because the Wikibase interface is highly dysfunctional in many ways, watchlisting changes is basically impossible, and it increases editing complexity in many ways (yes, it also reduces it in other ways, and wiki editing has some interface issues too). Some were also sceptical about this new thing (as this issue shows, that was at least partially justified).
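To make the "what is being synced with what" question concrete, here is a minimal sketch of the comparison that is currently done by hand. It parses the infobox shown above into a dictionary and compares a few fields with a data item. The flat dictionary used for the data item, its values, and the function names are assumptions for illustration; real data items store statements under property IDs, and the mapping to infobox parameter names is not shown here.

```python
INFOBOX = """{{ValueDescription
|key=natural
|value=birds_nest
|image=File:Rufous_hornero_(Red_ovenbird)(Furnarius_rufus)_and_nest_(2).JPG
|description= A bird's nest.
|group=natural
|onNode=yes
|onWay=no
|onArea=yes
|onRelation=no
|seeAlso=
* {{Tag|man_made|nesting_site}}
|status=in use
}}"""


def parse_infobox(wikitext):
    """Parse a one-parameter-per-line infobox into a dict (a sketch, not a full wikitext parser)."""
    params = {}
    current = None
    for line in wikitext.strip().splitlines():
        line = line.strip()
        if line.startswith("{{ValueDescription") or line == "}}":
            continue  # opening and closing lines of the template
        if line.startswith("|") and "=" in line:
            key, _, value = line[1:].partition("=")
            current = key.strip()
            params[current] = value.strip()
        elif current is not None:
            # Continuation line, e.g. the bullet list under seeAlso.
            params[current] = (params[current] + "\n" + line).strip()
    return params


def find_mismatches(infobox, item, fields):
    """Return {field: (infobox_value, item_value)} for fields that disagree."""
    return {
        field: (infobox.get(field), item.get(field))
        for field in fields
        if infobox.get(field) != item.get(field)
    }


if __name__ == "__main__":
    # Hypothetical data item content, flattened to infobox parameter names.
    data_item = {"onNode": "yes", "onWay": "no", "onArea": "no",
                 "onRelation": "no", "status": "in use"}
    fields = ["onNode", "onWay", "onArea", "onRelation", "status"]
    print(find_mismatches(parse_infobox(INFOBOX), data_item, fields))
```

With the made-up data item above, only `onArea` is reported, which is the kind of silent divergence that nothing on the wiki currently detects.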

matkoniecz commented 1 year ago

@valeriewmde

broken watchlisting

Please expand on this! We’d like to know more. What exactly is broken?

In the case of wiki text, I can watchlist https://wiki.openstreetmap.org/wiki/Tag:natural%3Dbirds_nest without watchlisting https://wiki.openstreetmap.org/wiki/Uk:Tag:natural%3Dbirds_nest

When I watchlist a data item, every label added in every single language appears in my watchlist. That results in useless entries that I cannot verify: "Added [uk] description: Гнізда птахів" may be correct, have a slight mistake, or be a slur. I lack the language knowledge to judge it, so its appearance in my watchlist is useless.

I tried using data items and it made my watchlist unusable. It became 90% filled with edits in languages where I lack any ability to judge edit quality.

As a result, data quality in data item descriptions is lower, as watchlisting them is much harder and basically impossible on a larger scale.

matkoniecz commented 1 year ago

Thus, if the documentation is wrong in Spanish, the wiki is transmitting some incorrect information to the mappers, and then the map will be affected. Data items could be the solution to keep the documentation updated without too much work.

Not really. Data items can help at most with some infobox properties like "is it expected to be used on nodes" and "tag status (in use/de facto/deprecated)".

It is actively worse for translating tag descriptions, where content differs between languages, and it makes longer descriptions harder to handle because editing is split between the data item and the wiki page.

Instead, if we start to migrate and standardize values in different languages to a unique value in the data item, the maintenance will be easier by keeping up to date with just the English version and the data item.

So supporting data items means encouraging a ban on translating wiki articles? That does not make me more supportive of Wikibase.

Data items cannot be used to handle article text, and they do not even try to.

values in the KeyDescription, and ValueDescription templates in regular wiki pages for the same key/value in different languages as redundant, with many inconsistencies among them.

This can also be synchronised with bot edits.
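As a rough sketch of what the first step of such a bot might look like, the snippet below takes the "status" value found in each language version of a tag page and reports the versions that disagree with the English one. The input dictionary, its values, and the function name are hypothetical; fetching the pages and writing the corrections back are deliberately left out.

```python
def divergent_languages(status_by_language, reference="en"):
    """Return {language: status} for pages whose status differs from the reference page."""
    expected = status_by_language[reference]
    return {
        lang: status
        for lang, status in status_by_language.items()
        if status != expected
    }


if __name__ == "__main__":
    # Made-up example values, keyed by language code.
    statuses = {"en": "in use", "es": "de facto", "fr": "in use", "uk": "in use"}
    print(divergent_languages(statuses))
```

Treating English as the reference is a design assumption taken from the discussion above; a bot could just as well flag every disagreement for human review instead of overwriting.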

GreenReaper commented 1 year ago

Is there a way to get client editing working on OSM? That might help with the "editing in two places".

Wikidata users have made copious JavaScript tools to get around shortcomings in the official UI. Some of these might be adaptable to OSM Wiki. Stuff like WikidataWatchlistLabels and DiffLists might help.

AndrewHain commented 1 year ago

My comment to anyone who prefers the time before data items: this would take us back to the Verdy p era, with wiki work on the items starting just a week after he was blocked permanently. Over the past few years there has been a lot of effort to stop the bloat that he introduced from choking the wiki; some of that work uses data items and would be lost.

matkoniecz commented 1 year ago

I am confused why removing data items would require bringing back Verdy p or their inventions.

I removed or undid some of what they did, and in no case that I encountered were data items useful or helpful. All data item use on the OSM Wiki that I know about is unrelated to their editing activities and the cleanup of them.

with wiki work on the items starting just a week after he was blocked permanently

As far as I know that was utterly unrelated - is it some attempt to credit data items for their ban? (even if their reaction to data items was a final straw, that is utterly unrelated to usefulness of data items)

1ec5 commented 1 year ago

Is there a way to get client editing working on OSM? That might help with the "editing in two places".

Wikidata Bridge sounds like a good fit for our use case. Does it depend on Wikibase Client, or does it also allow editing items directly from within articles on wikis that integrate Wikibase directly? Back in 2019, @nyurik installed WE-Framework as an optional gadget on the wiki, but it was still a work in progress and apparently no longer works. If Wikidata Bridge is in better shape, I’d like to know what’s involved in setting it up. As an administrator, I can install any necessary changes to site scripts and protected templates, but extension-level changes would need the operations team’s involvement. Is it currently installed somewhere where we could try it out?

with wiki work on the items starting just a week after he was blocked permanently

As far as I know that was utterly unrelated - is it some attempt to credit data items for their ban? (even if their reaction to data items was a final straw, that is utterly unrelated to usefulness of data items)

Verdy_p was blocked for reasons unrelated to data items. I think @AndrewHain’s point is that data items have been an important tool for keeping the wiki’s templates manageable. The timing with Verdy_p was coincidental. (That said, Scribunto probably had a more direct impact on page performance. I sure hope we never uninstall Scribunto on the same rationale of upgrade complexity, because that really would take us back to the Stone Age.)

dpriskorn commented 1 year ago

@valeriewmde

broken watchlisting

Please expand on this! We’d like to know more. What exactly is broken?

In the case of wiki text, I can watchlist https://wiki.openstreetmap.org/wiki/Tag:natural%3Dbirds_nest without watchlisting https://wiki.openstreetmap.org/wiki/Uk:Tag:natural%3Dbirds_nest

When I watchlist a data item, every label added in every single language appears in my watchlist. That results in useless entries that I cannot verify: "Added [uk] description: Гнізда птахів" may be correct, have a slight mistake, or be a slur. I lack the language knowledge to judge it, so its appearance in my watchlist is useless.

I tried using data items and it made my watchlist unusable. It became 90% filled with edits in languages where I lack any ability to judge edit quality.

As a result, data quality in data item descriptions is lower, as watchlisting them is much harder and basically impossible on a larger scale.

FWIW I have the same issue in Wikidata. It would be nice to be able to filter based on language to avoid notification fatigue.
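A language filter of that kind could be approximated on the client side. The sketch below assumes that watchlist entries are available as plain summary strings in the displayed form quoted above ("Added [uk] description: ..."), and that the language code is the first bracketed token. Both are assumptions for illustration, and the function names are made up; this is not an existing gadget or API.

```python
import re

# Matches a bracketed language code such as [uk] or [pt-br].
LANG_TAG = re.compile(r"\[([A-Za-z][A-Za-z0-9-]*)\]")


def summary_language(summary):
    """Return the lowercased language code in a summary, or None if there is none."""
    match = LANG_TAG.search(summary)
    return match.group(1).lower() if match else None


def filter_watchlist(summaries, known_languages):
    """Keep summaries that are language-neutral or in a language the reader knows."""
    kept = []
    for summary in summaries:
        lang = summary_language(summary)
        if lang is None or lang in known_languages:
            kept.append(summary)
    return kept


if __name__ == "__main__":
    entries = [
        "Added [uk] description: Гнізда птахів",
        "Added [en] description: A bird's nest.",
        "Changed claim: status",  # illustrative summary without a language tag
    ]
    for entry in filter_watchlist(entries, {"en", "pl"}):
        print(entry)
```

Entries without a language tag are kept on purpose, since changes to statements such as the tag status matter to every reader regardless of language.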

dpriskorn commented 1 year ago

WikidataWatchlistLabels

Source