LecCory closed this issue 1 year ago
MediaWiki can be installed via the command line or via an import; it wouldn't need to be installed via the web installer for our purposes.
Semantic MediaWiki is a set of extensions that builds on the MediaWiki base, preserving all its features (edit history, markup syntax, extensions, transparency, etc).
I think we have to consider that MediaWiki is a lot more than "wiki software." Among other things, preserving the fully visible edit history and the intention behind the syntax are not things that should be dropped casually; they are requirements. MediaWiki is used for very large sites. I don't know if DokuWiki would be up to the task since it just uses flat files, and porting the site would lose a lot of meaning.
Using open source is one of the digital standards. Mediawiki has a large community behind it with very good support, and there are a number of companies that provide commercial support if required. Postgres is a very widely used database, with corporate support options. It is not a first choice for use with Mediawiki, but presumably this will only be a problem for exotic extensions. Semantic Mediawiki supports Postgresql.
OIDC is supported by a Mediawiki extension, so Azure AD (or any other OIDC provider) could be used for auth there too, as well as all applications of GCConnex going forward.
Does installing via the command line also require that SSL communication be disabled on the database in order to establish a connection? That has been my experience getting it to work with Postgres. The application runs in a web app, and as far as I know the command line can't be reached outside of the web portal; I've tried. I'm also aware that the digital standards recommend using open source, where appropriate. They also recommend addressing security and privacy risks. So, in keeping with the digital standards, I raised a security risk: data in transit would not be secured and would therefore be vulnerable. I'm not suggesting moving away from Postgres if we are to self-host some wiki-like application. All I'm suggesting is that we explore options that align with the absolute minimum security baseline recommended by Microsoft, as these baselines are what get taken into account when the GC does its own security assessments.
Yes, MediaWiki has a large community behind it, but as indicated in their own documentation, which I referenced:
Support for Postgres is maintained by volunteers. See the discussion linked below, in which the OP attempts to use the same Postgres Azure PaaS we're attempting to use and needs SSL to be enabled. This was almost 5 years ago, and there's still no option to enable SSL communication for Postgres in the MediaWiki install.
https://www.mediawiki.org/wiki/Topic:Uiicm84vyzrtmhzl
It seems to me that the community only supports core functionality for Postgres and nothing more. That is why I opened the discussion to explore our options. So we want to follow the Digital Standards; that's great, let's find something open source, but be conscious of the security and privacy risks by following the baseline for Postgres security.
I'm not suggesting that we ignore any baseline for security (-: The proposed patch for this issue, for MediaWiki 1.30, changes only a few lines in a few files, so the first step would be to explore where it's at. There are other options than patching, for example tunnelling and proxying. But I will start by seeing where the patch is at.
Past that, however, moving from MediaWiki to another "wiki" would be a very different task, since it's not just pages with HTML that are important, but also changes and contributors. [edit: patch applies to a few files]
Furthermore, the preferred and commonly used database for MediaWiki is MySQL which we are not authorized to use.
I think pushing back to IMTD on getting the managed MySQL would make more sense than re-platforming everything because they say they don't support it.
We landed on Postgres after many back-and-forth conversations with IMTD to try to get their buy-in to support MySQL. The options were that our DBs get hosted in their MSSQL cluster or that we manage our own Postgres instance. Trying to stick with open source while aligning with IMTD, Postgres was the only option. Trust me, there was no lack of trying to push for MySQL. The issue now is that I cannot get MediaWiki to work with Postgres without disabling SSL communication. The solutions I was able to find to add support for SSL encryption would require modifying a few core PHP files. These changes aren't officially part of the MediaWiki build, so if we were ever to update MediaWiki we'd have to ensure our modifications aren't overwritten. This is like the situation with Elgg: if we start modifying core components, will we again end up unable to update or upgrade?
I'm in conversation with @hexmode to see how we could get this into Mediawiki core.
From @bawolff, @hexmode
So it looks like part of that patch was upstreamed in 2e5d114a99cf1 [1], starting with MediaWiki 1.31, where setting $wgDBssl makes the PostgreSQL sslmode "require" (vs the default of "prefer" if you do not set $wgDBssl). "prefer" means to opportunistically use unauthenticated encryption; "require" means to force encryption, which is still unauthenticated [2]. From a security perspective both values are pretty useless, but I would expect either would be enough to make MediaWiki "work". The secure value would be "verify-full" (actually validate the certificate).
So is the requirement here that mediawiki sets the SSL mode to "verify-full" when $wgDBssl = true; ? [I also filed https://phabricator.wikimedia.org/T335617 about this]
It seems like part of the linked patch was also a way to make the installer use a secure connection. Is that required? The way the patch does it is with an environment variable. That is probably not going to be able to be upstreamed, but adding a checkbox to the installer should be easy enough. Although I suspect in most cases the installer should complete even without it, as even if the SSL mode isn't set, it is supposed to opportunistically use SSL if available.
The patch also has some stuff removing the @domain part of the username. I guess that's related to how MS ties AD authentication into postgres. I don't know enough about the subject to know if that part of the change makes sense generally, but it can probably be worked around by telling mediawiki not to create a new user during the install step.
Anyways, I guess what I'm asking here is what are the actual requirements of the secure environment that mediawiki is not currently meeting?
[1] https://gerrit.wikimedia.org/r/c/mediawiki/core/+/451710 [2] https://www.postgresql.org/docs/current/libpq-ssl.html#LIBPQ-SSL-SSLMODE-STATEMENTS
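To make the sslmode distinction discussed above concrete, here is a minimal Python sketch (not MediaWiki code). It lists the libpq sslmode values and builds a connection URI of the kind psql accepts. The names `SslGuarantee`, `sslmode_for_wgdbssl` and `build_pg_uri` are made up for illustration, and the $wgDBssl mapping only restates what this comment describes for MediaWiki 1.31+; it is not taken from the MediaWiki source.

```python
from typing import NamedTuple
from urllib.parse import quote, urlencode


class SslGuarantee(NamedTuple):
    encrypted: bool     # the connection is always encrypted
    ca_checked: bool    # the server certificate chain is verified
    host_checked: bool  # the certificate host name is verified


# libpq sslmode values, weakest to strongest.
SSLMODES = {
    "disable": SslGuarantee(False, False, False),
    "allow": SslGuarantee(False, False, False),
    "prefer": SslGuarantee(False, False, False),  # libpq default, opportunistic
    "require": SslGuarantee(True, False, False),
    "verify-ca": SslGuarantee(True, True, False),
    "verify-full": SslGuarantee(True, True, True),
}


def sslmode_for_wgdbssl(wg_db_ssl: bool) -> str:
    """Assumed mapping, as described in this thread for MediaWiki 1.31+."""
    return "require" if wg_db_ssl else "prefer"


def build_pg_uri(user, host, dbname, sslmode, port=5432):
    """Build a libpq connection URI with an explicit sslmode."""
    if sslmode not in SSLMODES:
        raise ValueError(f"unknown sslmode: {sslmode!r}")
    query = urlencode({"sslmode": sslmode})
    # Percent-encode user and database so names such as "user@domain" stay valid.
    return (
        f"postgresql://{quote(user, safe='')}@{host}:{port}/"
        f"{quote(dbname, safe='')}?{query}"
    )
```

For example, `build_pg_uri("mwuser", "localhost", "mediawiki", "verify-full")` yields a URI that could be passed to psql to check whether a server's certificate validates, independently of MediaWiki.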
I don't think the installer has to have secure mode, we can create a bootstrap wiki locally. @LecCory would requiring the SSL mode to be "verify-full" when $wgDBssl = true; be adequate for our environment? Any comments on the domain removal?
If there's a way to have SSL mode on by default, that works. The issue that I have is that we need to stand up a fresh installation of MW using a PG backend that follows the baselines for security. Right now, as mentioned, the only way to stand up the app is to have data-in-transit encryption disabled on the whole PaaS instance. This poses an issue not just for GCPedia, but for any other database hosted off the PaaS. Turning it off allows me to get through the installation no problem. That's fine in a dev environment, but it will never pass a security assessment and will never see a production tenant. We're going from a Maria/MySQL backed application to PG, so no matter how we slice it, we have to bring up a fresh MW instance and run through the installer. As for the domain, I don't think we can get around it. You have to pass the
I do not understand why you need to run the installer in your production environment.
You should be able to use the installer in your dev environment to create a DB schema and LocalSettings.php file that you can then move to your production environment.
If $wgDBssl = true; works outside the installer, this would remove the need to run the installer.
So I did a little test of MediaWiki with a local version of postgres configured to require SSL (setting only hostssl in pg_hba.conf).
Not really an expert at postgres, but I tested with the command line to make sure I had correctly setup postgres so that SSL was required:
bawolff@bawolff:/var/www/html/w$ psql 'postgresql://mwuser@localhost:5432/mediawiki?sslmode=disable'
psql: error: FATAL: no pg_hba.conf entry for host "::1", user "mwuser", database "mediawiki", SSL off
I then tried to run the installer, and the installer still ran fine. So I don't understand what is going on here, although of course it is possible that the require SSL setting in Azure cloud offerings is doing something different than the same setting does in a locally hosted database. $wgDBssl only really controls whether SSL is required, not whether it's available. If the server only supports SSL, then SSL should be used regardless of that setting.
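The manual psql check above could be scripted along these lines. This is only a rough Python sketch, assuming psql is on the PATH. The function names are hypothetical, and the error match relies solely on the message text quoted above, so other PostgreSQL versions or the Azure service may word the rejection differently.

```python
import subprocess


def psql_probe_command(uri: str) -> list:
    """Build a psql invocation that runs a trivial query against `uri`."""
    return ["psql", "-c", "SELECT 1", uri]


def rejected_for_missing_ssl(stderr: str) -> bool:
    """True if stderr looks like the pg_hba.conf rejection quoted in this thread."""
    return "no pg_hba.conf entry" in stderr and "SSL off" in stderr


def server_refuses_plaintext(uri_with_sslmode_disable: str) -> bool:
    """Try a non-SSL connection and report whether the server turned it away.

    The URI is expected to end in ?sslmode=disable, as in the psql test above.
    """
    result = subprocess.run(
        psql_probe_command(uri_with_sslmode_disable),
        capture_output=True,
        text=True,
    )
    return result.returncode != 0 and rejected_for_missing_ssl(result.stderr)


# Hypothetical usage, mirroring the command shown above:
# server_refuses_plaintext(
#     "postgresql://mwuser@localhost:5432/mediawiki?sslmode=disable"
# )
```

A result of True would suggest the server enforces SSL at the pg_hba.conf level, which is the precondition this local test was trying to establish before running the installer.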
Can you confirm:
As an aside, for context, the biggest gotcha with Postgres support in MediaWiki is that it is tested less and is not as performance optimized. There is CI that runs the unit tests under Postgres, but most MediaWiki devs and major sites use MariaDB, so it doesn't get the same level of testing. Similarly, basically all performance optimization work is done against MariaDB, since it is mostly Wikipedia doing the performance optimization. Additionally, some extensions don't support Postgres.
My apologies for the confusion; we would of course build out everything in development before promoting it to production. In my ramblings above, I probably mixed two thoughts together and made it unclear. My concern is with infrastructure. I have to make sure that the applications we build follow, at minimum, the security baselines outlined by Microsoft, since we're hosting this in Azure, and adhere to any ITSG-33 controls our security team requires. That being said, anything we build out in development as a proof of concept has to be assessed and reviewed by these teams before we promote it. I have to present a design and solution that is as airtight as possible. If you're telling me that I can stand up a MW instance with PG as the DB without having to disable SSL settings on the DB instance, then great, I'll do that and this case can be closed.
Regardless, though, I think we should have a checkbox in the installer for DB SSL, so I filed https://phabricator.wikimedia.org/T335828
Thanks folks for your help with this! Sounds like we could present a valid option with the current Mediawiki core source. @Phanoix any comments while we have people's attention?
btw, Mainframe98 added the checkbox. It is currently scheduled to be included in MediaWiki 1.41.
It looks like the release timeline has 1.41 scheduled for sometime in November 2023. I will close this, as the intended purpose of looking into options may no longer be required. We will continue any additional development and experimenting with GCWiki/Pedia in other tasks.
While reading and researching security baselines for Azure Database for PostgreSQL, I noticed that in order to get the latest MediaWiki release installed with Postgres, I had to disable the Enforce SSL Connection option, which directly violates section 4.4: Encrypt all sensitive information in transit in the security baseline. There is no option in the install wizard to enable SSL when selecting Postgres.
Furthermore, the preferred and commonly used database for MediaWiki is MySQL, which we are not authorized to use. Support for Postgres is maintained by volunteers, so we would be at the mercy of volunteers to maintain any Postgres configurations.
With the security issue identified and the reliance on community volunteers to maintain support, I believe we should consider other alternatives to MediaWiki itself.
I believe semantic-mediawiki was one alternative identified in our sprint planning meeting
Other alternatives include DokuWiki. No database is required for it, it is a mature application, and there appears to be information on migrating from MediaWiki to DokuWiki.
Another alternative is Confluence. This is a cloud based licensed application ($$$) by Atlassian. IMTD uses Jira, which is another Atlassian product so there may be existing assessments done that can be leveraged. For GCPedia, we could use Azure AD for sign-in. There are also migration steps that can be followed to move to this product.
This is a very surface-level options analysis, and by no means are these the only options we should consider. This requires far more research and separate security assessments, and I open this up for further discussion. But I do strongly believe that, given the security implication of having to disable SSL communication and the reliance on volunteers to support Postgres configurations with MediaWiki, we should explore our options.