enterprisemediawiki / meza

Setup an enterprise MediaWiki server with simple commands

Shorten URL #336

Open darenwelsh opened 8 years ago

darenwelsh commented 8 years ago

Currently we have https://<domain>/<wiki_id>/index.php/Page_Name.

Try using a URL re-write rule to remove index.php/.

jamesmontalvo3 commented 8 years ago

This may work, or is at least a good starting point. .htaccess was:

RewriteRule ^(?!mediawiki(?:/|$))[^/]+(?:/(.*))?$ mediawiki/$1

Is:

RewriteRule ^(?!mediawiki(?:/|$))[^/]+(?:/(.*))?$ mediawiki/index.php/$1

See https://www.mediawiki.org/wiki/Manual:Short_URL
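To see what the old and new substitutions do to a request path, here is a small JavaScript sketch that applies the same pattern. The helper `rewrite` and the sample paths are made up for illustration, and it only imitates a single RewriteRule; it is not how Apache evaluates rules.

```javascript
// Pattern copied from the .htaccess rule quoted above.
const FARM_RULE = /^(?!mediawiki(?:\/|$))[^/]+(?:\/(.*))?$/;

// Hypothetical helper emulating one RewriteRule substitution.
// In .htaccess context the path is matched without its leading slash.
function rewrite(path, target) {
  const m = FARM_RULE.exec(path);
  if (m === null) {
    return path; // rule does not apply, path is left alone
  }
  // An unmatched optional group is treated as an empty string here.
  return target.replace('$1', () => m[1] ?? '');
}

const OLD_TARGET = 'mediawiki/$1';
const NEW_TARGET = 'mediawiki/index.php/$1';

for (const path of ['demo/Main_Page', 'demo/load.php', 'mediawiki/load.php']) {
  console.log(path, '->', rewrite(path, OLD_TARGET), '|', rewrite(path, NEW_TARGET));
}
```

Under this sketch, demo/load.php becomes mediawiki/index.php/load.php with the new target, so requests for real scripts would probably need their own rule. That fits the "starting point" caveat.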

freephile commented 6 years ago

I've implemented short URLs in the subdomain-support branch (freephile/subdomain-support). Basically, you need $wgScriptPath set to the directory that holds the actual code, e.g. 'mediawiki' or the more standard 'w', and you need $wgArticlePath set to wiki/$1. Then you use a very simple rewrite rule. You also need an Alias in httpd.conf, or else a rewrite (possibly in .htaccess, but httpd.conf is preferred), to convert the virtual path (e.g. /demo) to a symbolic link to 'w'.

But the whole switch from path-based to host-based configuration is still a bit fragile (e.g. the XHR requests from WikiBlender generate CORS errors).

Supporting both path-based and domain-based partitioning of the wiki farm is possible, but even more complicated in terms of the configuration and interdependency with other systems. For example, in my simple implementation, you can't have a 'foo' subdomain and a '/foo' path-based wiki at the same time. But with a complex config and router system, you possibly could.

I would like to make the Farm aspect even more robust, like the WMF configs (which I have studied in great detail), along with SimpleFarm and MediaWikiFarm. Those extensions can't be incorporated into Meza because each has its own idea of how things should be configured. But we can implement a 'roll your own' solution that starts off simple and eventually could become as torturous as WMF's.

Aside: Notes on fixing the XHR problem before my browser crashes... Something like this needs to go into LocalSettings.php

// allow cross-site XHR (AJAX) requests from these domains
$wgCrossSiteAJAXdomains = array( '*.qualitybox.us' );

And then the code in WikiBlender needs to add 'origin' info.

You can use this little bit of js in the console to test.

$.ajax( {
    'url': 'https://it.qualitybox.us/w/api.php',
    'data': {
        'action': 'query',
        'meta': 'userinfo',
        'format': 'json',
        'origin': 'https://demo.qualitybox.us'
        // or whichever domain you're on; must be correct!
    },
    'xhrFields': {
        'withCredentials': true
    },
    'success': function( data ) {
        alert( 'Foreign user ' + data.query.userinfo.name +
            ' (ID ' + data.query.userinfo.id + ')' );
    }
} );
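
For anyone testing without jQuery, the same request can be sketched with the standard URL, URLSearchParams and fetch APIs. The function names `buildUserinfoUrl` and `fetchForeignUser` are hypothetical; the parameters are the ones from the snippet above, and the `origin` value still has to match the domain you are actually on.

```javascript
// Build the API URL used by the jQuery test above, without jQuery.
function buildUserinfoUrl(apiUrl, origin) {
  const url = new URL(apiUrl);
  url.search = new URLSearchParams({
    action: 'query',
    meta: 'userinfo',
    format: 'json',
    origin: origin, // must match the calling page's own origin
  }).toString();
  return url.toString();
}

// Not invoked here; run it from the browser console of the calling wiki.
async function fetchForeignUser(apiUrl, origin) {
  const response = await fetch(buildUserinfoUrl(apiUrl, origin), {
    credentials: 'include', // same role as xhrFields.withCredentials
  });
  const data = await response.json();
  return data.query.userinfo; // carries the id and name used in the alert above
}

console.log(buildUserinfoUrl('https://it.qualitybox.us/w/api.php', 'https://demo.qualitybox.us'));
```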

revansx commented 6 years ago

Hexmode's recipe for shortURLs requires changes to 3 files:

1. src/roles/htdocs/templates/.htaccess.j2
2. src/roles/mediawiki/templates/LocalSettings.php.j2
3. src/roles/parsoid-settings/templates/localsettings.js.j2

as follows:

.htaccess.j2

diff --git a/src/roles/htdocs/templates/.htaccess.j2 b/src/roles/htdocs/templates/.htaccess.j2
index fcebf0d..4eaf935 100644
--- a/src/roles/htdocs/templates/.htaccess.j2
+++ b/src/roles/htdocs/templates/.htaccess.j2
@@ -30,10 +30,15 @@
     RewriteRule ^BackupDownload(?:/|$)(.*)$ - [L]
     {% endif %}

-    # Taken from MediaWiki.org [[Extension:Simple Farm]]
-    #
-    # Redirect virtual wiki path to physical wiki path. There
-    # can be no wiki accessible using this path.
-    RewriteRule ^(?!mediawiki(?:/|$))[^/]+(?:/(.*))?$ mediawiki/$1
+    RewriteCond "%{REQUEST_FILENAME}" !-f
+    RewriteCond "%{REQUEST_FILENAME}" !-d
+    RewriteRule ^([^/]+)$ /$1/ [R=301]

+    RewriteCond "%{REQUEST_FILENAME}" !-f
+    RewriteCond "%{REQUEST_FILENAME}" !-d
+    RewriteRule ^mediawiki/([a-zA-Z-]+)/(.*) mediawiki/$2?MEZAWIKI=$1 [PT,QSA,E=WIKI:$1]
+
+    RewriteCond "%{REQUEST_FILENAME}" !-f
+    RewriteCond "%{REQUEST_FILENAME}" !-d
+    RewriteRule ^([a-zA-Z-]+)/(?!mediawiki/)(.*) mediawiki/index.php/$3?MEZAWIKI=$1 [PT,QSA,E=WIKI:$1]
 </IfModule>

LocalSettings.php.j2

diff --git a/src/roles/mediawiki/templates/LocalSettings.php.j2 b/src/roles/mediawiki/templates/LocalSettings.php.j2
index 3066494..e925764 100644
--- a/src/roles/mediawiki/templates/LocalSettings.php.j2
+++ b/src/roles/mediawiki/templates/LocalSettings.php.j2
@@ -43,21 +43,10 @@ require "{{ m_deploy }}/samlLocalSettings.php";

 require '/opt/.deploy-meza/config.php';

-if( $wgCommandLineMode ) {
+$mezaWikiEnvVarName='WIKI';

-   $mezaWikiEnvVarName='WIKI';
-
-   // get $wikiId from environment variable
-   $wikiId = getenv( $mezaWikiEnvVarName );
-
-}
-else {
-
-   // get $wikiId from URI
-   $uriParts = explode( '/', $_SERVER['REQUEST_URI'] );
-   $wikiId = strtolower( $uriParts[1] ); // URI has leading slash, so $uriParts[0] is empty string
-
-}
+// get $wikiId from environment variable
+$wikiId = isset( $_GET['MEZAWIKI'] ) ? $_GET['MEZAWIKI'] : getenv( $mezaWikiEnvVarName );

 // get all directory names in /wikis, minus the first two: . and ..
 $wikis = array_slice( scandir( "$m_htdocs/wikis" ), 2 );
@@ -230,7 +219,9 @@ else {
 $wgServer = 'https://{{ wiki_app_fqdn }}';

 // https://www.mediawiki.org/wiki/Manual:$wgScriptPath
-$wgScriptPath = "/$wikiId";
+$wgScriptPath = "/mediawiki/$wikiId";
+$wgArticlePath = "/$wikiId/$1";
+$wgUsePathInfo = true;

 // https://www.mediawiki.org/wiki/Manual:$wgUploadPath
 $wgUploadPath = "$wgScriptPath/img_auth.php";

and src/roles/parsoid-settings/templates/localsettings.js.j2 (or config.yaml in MW 1.30) goes from:

        # {{ wiki }}
        - # uri = the URL of the MediaWiki API endpoint

          {% if groups['app-servers']|length|int == 1 and groups['parsoid-servers']|length|int == 1 and groups['app-servers'][0] == groups['parsoid-servers'][0] -%}

          uri: 'http://127.0.0.1:8080/{{ wiki }}/api.php'

to

        # {{ wiki }}
        - # uri = the URL of the MediaWiki API endpoint

          {% if groups['app-servers']|length|int == 1 and groups['parsoid-servers']|length|int == 1 and groups['app-servers'][0] == groups['parsoid-servers'][0] -%}

          uri: 'http://127.0.0.1:8080/mediawiki/{{ wiki }}/api.php'

That will implement short URLs using the wikiId.
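
As a rough summary of the layout these three changes aim for, the sketch below derives the per-wiki paths from a wiki ID. The helpers `wikiPaths` and `articleUrl` are hypothetical names; the path shapes are taken from the diffs above, and the Parsoid URI applies only to the single-server case covered by the template condition.

```javascript
// Derive the per-wiki paths implied by the LocalSettings.php.j2 and
// Parsoid template changes above (illustrative only).
function wikiPaths(wikiId) {
  const scriptPath = `/mediawiki/${wikiId}`; // $wgScriptPath
  return {
    scriptPath,
    articlePath: `/${wikiId}/$1`, // $wgArticlePath
    uploadPath: `${scriptPath}/img_auth.php`, // $wgUploadPath
    // Parsoid API endpoint when app server and Parsoid share one host
    parsoidUri: `http://127.0.0.1:8080${scriptPath}/api.php`,
  };
}

// Fill the $1 placeholder of the article path with a page title.
function articleUrl(wikiId, title) {
  return wikiPaths(wikiId).articlePath.replace('$1', () => title);
}

console.log(wikiPaths('demo'));
console.log(articleUrl('demo', 'Main_Page'));
```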

freephile commented 5 years ago

I have not succeeded in creating short URLs for Meza, whether using Mark's code, my subdomain code, or the manual. It would be appreciated if someone had the interest to take a look at this. (A WIP pull request was just submitted to see the build failures.)

I've tried the 'short URL' advice given in the manual:

# Short URL for wiki pages
# RewriteRule ^/?demo(/.*)?$ %{DOCUMENT_ROOT}/w/index.php [L]
# RewriteRule ^/*$ %{DOCUMENT_ROOT}/w/index.php [L]

I've tried the Alias directive mentioned in the Apache Short URL manual.

I've tried Mark's code:

    # Redirect virtual wiki path to physical wiki path. There
    # can be no wiki accessible using this path.
##### REPLACE THE FOLLOWING LINE
#    RewriteRule ^(?!mediawiki(?:/|$))[^/]+(?:/(.*))?$ mediawiki/$1

    RewriteCond "%{REQUEST_FILENAME}" !-f
    RewriteCond "%{REQUEST_FILENAME}" !-d
    RewriteRule ^([^/]+)$ /$1/ [R=301]

    RewriteCond "%{REQUEST_FILENAME}" !-f
    RewriteCond "%{REQUEST_FILENAME}" !-d
    RewriteRule ^mediawiki/([a-zA-Z-]+)/(.*) mediawiki/$2?MEZAWIKI=$1 [PT,QSA,E=WIKI:$1]

    RewriteCond "%{REQUEST_FILENAME}" !-f
    RewriteCond "%{REQUEST_FILENAME}" !-d
    RewriteRule ^([a-zA-Z-]+)/(?!mediawiki/)(.*) mediawiki/index.php/$3?MEZAWIKI=$1 [PT,QSA,E=WIKI:$1]

Note that in Mark's last rule, the $3 should actually be $2 because the look-ahead assertion is a non-capturing pattern.
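
The group count is easy to check outside Apache. This JavaScript sketch runs the same pattern and lists what it captures; the helper `captures` and the sample path are illustrative, and it assumes the look-ahead behaves the same way in PCRE as it does in JavaScript.

```javascript
// Pattern from the last rule: one capture for the wiki ID, a negative
// look-ahead (which captures nothing), and one capture for the rest.
const LAST_RULE = /^([a-zA-Z-]+)\/(?!mediawiki\/)(.*)/;

// Hypothetical helper: return the captured groups, or null if no match.
function captures(path) {
  const m = LAST_RULE.exec(path);
  return m === null ? null : m.slice(1);
}

const groups = captures('demo/Main_Page');
console.log(groups.length); // number of capture groups
console.log(groups[0], groups[1]); // what $1 and $2 would hold
console.log(groups[2]); // there is no third group to back $3
```

With only two groups available, the page title sits in $2, so a substitution written with $3 would not receive it.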

Mark's essential code (apart from the wikiId and Parsoid parts, which are also important) includes a change to LocalSettings.php too, namely defining $wgScriptPath as the physical installation path relative to the document root:

$wgScriptPath = "/mediawiki/$wikiId"; 
$wgArticlePath = "/$wikiId/$1";
$wgUsePathInfo = true;

freephile commented 5 years ago

I finally did get an example working. Will submit a patch for review.

revansx commented 5 years ago

What was the trick? I've been running Mark's code for a year and it has seemed fine. What was wrong with Mark's code?

jamesmontalvo3 commented 5 years ago

You’ve been running on a fork for a year? Why not submit the change?

revansx commented 5 years ago

I don't understand. I thought the change was submitted and we're all waiting for it to be accepted in meza. What am I missing? Is the pull request not there? If not, I'll submit it.

jamesmontalvo3 commented 5 years ago

I'm not aware of any PR that passes tests being submitted. Requirements for any PR:

  1. Must pass tests
  2. Must work for "path" style URLs, e.g. https://example.com/wiki_id. This was in work by @hexmode in #983 but it never passed tests.
  3. Ideally would work for "domain" style URLs, e.g. https://wiki_id.example.com. These types of URLs are not supported in Meza yet, but @freephile has made them work in his fork (though his fork breaks path style URLs, IIRC). Ref #1117 and #1118.
  4. Ideally would be backwards compatible with old links that include index.php. In other words, if someone has a bookmark to https://example.com/wiki_id/index.php/My_awesome_page it should still work even though the wiki would prefer https://example.com/wiki_id/My_awesome_page.
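
Requirement 4 can be stated as a path mapping. The sketch below is only an illustration of the expected behavior, not a proposed implementation (which would live in the Apache rewrite rules); `toShortPath` is a hypothetical name.

```javascript
// Map a legacy path containing index.php/ to its short form.
// Paths that are already short are returned unchanged.
function toShortPath(path) {
  const m = /^\/([^/]+)\/index\.php\/(.+)$/.exec(path);
  return m === null ? path : `/${m[1]}/${m[2]}`;
}

console.log(toShortPath('/wiki_id/index.php/My_awesome_page'));
console.log(toShortPath('/wiki_id/My_awesome_page'));
```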

revansx commented 5 years ago

OK, well... "must pass tests" is a criterion I don't know how to meet. Are points 2-4 the test?

jamesmontalvo3 commented 5 years ago

Every pull request runs automated tests. If they fail then the pull request shows something like this:

[image: screenshot of a pull request with a failed check]

If you click "details" next to the failed test you end up at Travis CI, which is what runs our tests. Here's an example of failing tests: https://travis-ci.org/enterprisemediawiki/meza/builds/530824771?utm_source=github_status&utm_medium=notification

Code that breaks tests can't be merged.