rswiki / RSData

An interface for interacting with RuneScape APIs
GNU General Public License v3.0

Implement whitelist and named APIs at mw:rsdata. #2

Open onei opened 9 years ago

onei commented 9 years ago

A simpler solution to this might be to use a config variable which just lists supported URLs with optional parameters, e.g. $1, $2, etc. Numbers are just more flexible imo.

I'd keep away from specifying a scheme in the URLs (http/https) just in case that changes in the future with little notice. I would hardcode the runescape.com domain into the code for comparison of the supported URLs however (as a general failsafe).
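A minimal sketch of that failsafe, assuming the whitelisted URLs are stored without a scheme. The function name `rsdataIsAllowedHost` is hypothetical, not something that exists in the extension:

```php
<?php
// Hypothetical helper: true only when the URL's host is runescape.com
// or one of its subdomains.
function rsdataIsAllowedHost( string $url ): bool {
    // Whitelisted URLs carry no scheme, so prepend one purely so that
    // parse_url() can pick out the host.
    $host = parse_url( 'https://' . $url, PHP_URL_HOST );
    if ( !is_string( $host ) ) {
        return false;
    }

    $host = strtolower( $host );
    $suffix = '.runescape.com';

    // Exact match, or a host ending in ".runescape.com". Comparing the
    // suffix with its leading dot rejects look-alikes such as
    // "evilrunescape.com" or "runescape.com.example.org".
    return $host === 'runescape.com'
        || substr( $host, -strlen( $suffix ) ) === $suffix;
}
```

Comparing the parsed host rather than searching the whole string for "runescape.com" matters here, since the domain could otherwise appear in a path or as a prefix of some other host.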

As for the content type, I'd look at either explicitly disallowing text/html, or allowing only anything under application/ plus text/plain, text/xml and text/csv. I'm not sure how feasible either is, or even whether Jagex actually specify a content type in their headers.

TehKittyCat commented 9 years ago

Do you have an example of what you mean? Do you mean config variable as in a PHP variable?

I agree, HTTP/2.0 means going forward HTTPS will become the standard (although forcing HTTPS on runescape.com returns 404s). On second thought, hardcoding runescape.com as the domain should be enough. Everything uses services.runescape.com anyway, so restricting to that subdomain wouldn't do much, and even if a rogue/hacked admin adds a secure URL, the code doesn't use cookies or POST, so nothing could be logged in to.

Hiscores lite returns text/html unfortunately, so restricting to content type isn't practical. Clan hiscores properly uses text/comma-separated-values and the JSON hiscores APIs return application/json.
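To make the problem concrete, here is a rough sketch of the allow-list suggested earlier in the thread (anything under application/, plus text/plain, text/xml and text/csv). The function name is hypothetical and the check is only an illustration of the idea, not a proposal:

```php
<?php
// Hypothetical check implementing the suggested content-type allow-list.
function rsdataIsAllowedContentType( string $header ): bool {
    // Drop parameters such as "; charset=utf-8" and normalise case.
    $type = strtolower( trim( explode( ';', $header )[0] ) );

    // Anything under application/ is accepted.
    if ( strpos( $type, 'application/' ) === 0 ) {
        return true;
    }

    // Otherwise only a short list of text types.
    return in_array( $type, array( 'text/plain', 'text/xml', 'text/csv' ), true );
}
```

With this check the JSON APIs pass, but hiscores lite (text/html) is rejected, and so is the clan hiscores response, because text/comma-separated-values is not the same string as text/csv.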

onei commented 9 years ago

I'm envisioning something like:

$wgRSDataUrls = array(
    // either a normal array
    'url1',
    'url2',
    'url3',

    // or an associative array
    'foo' => 'url1',
    'bar' => 'url2'
);

It might not seem so easy to update, but we can get it updated via WikiFactory, which is just a quick special contact request away. As for the potential concern over which URLs are enabled, we can just output the list in the help readout for api.php. As long as we make the code itself relatively flexible, I think we'll be fine.

As long as the URLs are restricted, the content type restrictions could be skipped (as they'd be implicitly controlled). I'd still keep the runescape.com check: there are so many variations of the domain in use by unscrupulous sites that the check would alleviate any concerns about spelling mistakes in the config.
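A minimal sketch of how a named entry with $1, $2, ... placeholders could be expanded. The URL paths and the function name `rsdataBuildUrl` are placeholders made up for illustration, not real endpoints or existing code:

```php
<?php
// Hypothetical whitelist: the names and paths are illustrative only.
$wgRSDataUrls = array(
    'foo' => 'services.runescape.com/example-a?player=$1',
    'bar' => 'services.runescape.com/example-b?clan=$1&page=$2',
);

// Look up a whitelisted URL by name and fill in its numbered placeholders.
// Returns null when the name is not in the whitelist.
function rsdataBuildUrl( array $urls, string $name, array $params ): ?string {
    if ( !isset( $urls[$name] ) ) {
        return null;
    }

    // Replace each $N with the URL-encoded N-th parameter (1-based).
    // A placeholder with no matching parameter becomes an empty string.
    return preg_replace_callback(
        '/\$(\d+)/',
        function ( $m ) use ( $params ) {
            $i = (int)$m[1] - 1;
            return isset( $params[$i] ) ? rawurlencode( $params[$i] ) : '';
        },
        $urls[$name]
    );
}
```

With the sample array, `rsdataBuildUrl( $wgRSDataUrls, 'bar', array( 'My Clan', '2' ) )` gives `services.runescape.com/example-b?clan=My%20Clan&page=2`, and an unknown name gives null. Encoding each value before substitution keeps a caller from smuggling extra query parameters into the whitelisted URL.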

onei commented 9 years ago

As for using the API, I'm thinking of parameters such as:

{
    action: 'rsdata',
    rsdataurl: 'foo',
    rsdataparams: 'bar|baz|quux'
}

Where rsdataurl matches one of the named keys in $wgRSDataUrls, and rsdataparams matches any use of $1, $2, etc. In an ideal world I'd like to document the parameters, but I can't think of any case where there's more than one parameter to be used. A slightly different setup that might fix that could be:

$wgRSDataUrls = array(
    'rs3hs' => array(
        'url' => '$url',
        'params' => array(
            '1' => 'playername',
            // or (as it's sort of implied if we use $1, $2, ...)
            'playername'
        ),
    ),
);
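A rough sketch of how the documented parameter names might be used, assuming the simpler positional form of 'params' and a pipe-separated rsdataparams value. The URL path and both function names are hypothetical:

```php
<?php
// Hypothetical entry: the path and parameter name are illustrative only.
$wgRSDataUrls = array(
    'rs3hs' => array(
        'url' => 'services.runescape.com/example?player=$1',
        'params' => array( 'playername' ),
    ),
);

// Pair the pipe-separated rsdataparams value with the documented names.
// Returns null when the name is unknown or the parameter count is wrong.
function rsdataNameParams( array $urls, string $name, string $rawParams ): ?array {
    if ( !isset( $urls[$name] ) ) {
        return null;
    }

    $names = $urls[$name]['params'];
    $values = $rawParams === '' ? array() : explode( '|', $rawParams );
    if ( count( $values ) !== count( $names ) ) {
        return null;
    }

    return array_combine( $names, $values );
}

// Build one plain-text help line per whitelisted name, listing its URL
// template and its documented parameters.
function rsdataHelpLines( array $urls ): array {
    $lines = array();
    foreach ( $urls as $name => $entry ) {
        $lines[] = $name . ': ' . $entry['url']
            . ' (' . implode( ', ', $entry['params'] ) . ')';
    }
    return $lines;
}
```

Having the names in the config means a request with the wrong number of parameters can be rejected up front, and the same data can feed whatever help text ends up listing the enabled URLs.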