[2013-02-28] Require ADMIN for ?flush=1 (stop denial of service attacks)

silverstripe-issues commented 11 years ago

created by: @chillu (ischommer) created at: 2013-02-28 original ticket: http://open.silverstripe.org/ticket/8290

See https://groups.google.com/forum/?fromgroups=#!topic/silverstripe-dev/XDUDZtr9Gbk

Already enforced for flush=all, but not for flush=1.

Ensure that you can execute the flush unauthenticated if Database::is_ready() == false, because otherwise we're getting into a chicken and egg situation: Can't log in due to stale manifest (and SQL errors etc), can't clear stale manifest because not logged in.

tractorcow commented 11 years ago

How does this look?

// In Director.php

    /**
     * Check requested flush level, given permission is available, or the 
     * database is not ready.
     * 
     * @param string $value An optional flush value to require. E.g. 'all'
     * @return boolean A flag indicating that the requsted flush level is given and authorised
     */
    public static function is_flush($value = null) {

        // Rule out unflushed cases
        if(empty($_GET['flush'])) return false;
        if($value && ($_GET['flush'] != $value)) return false;

        // Allow flush in dev or CLI
        if(self::isDev() || self::is_cli()) return true;

        // If a database error would otherwise prevent authentication, permit flush
        if(!DB::isActive() || !DB::getConn()->hasTable('Member')) return true;

        // Safely check permission, erring on the side of safety
        try {
            $permission = @Permission::check('ADMIN');
            if(is_bool($permission)) return $permission;
        } catch(Exception $ex) {
            exceptionHandler($ex);
        }

        return true;
    }

simonwelsh commented 11 years ago

The problem here is flushing the manifests is resource intensive. That code won’t work until after the manifests, especially the config one, have been loaded.

UndefinedOffset commented 11 years ago

What if the flush requires the site to be in dev mode? Then at least so long as the site is in live mode it won't do it? If I remember right in 3.0 flush is passed into the various manifest builders in Core.php which would be before the session/database is ready right?

hafriedlander commented 11 years ago

This can't check the environment - normally you set dev mode through _ss_environment, which gets loaded as part of config loading, so too late.

Some things we could do, most of which are annoying one way or another

Monitor when flush was last called, and forbid flushing too frequently
Make flushing CLI-only
Have a secret token in the codebase, require flush to be passed the token as a param (flush=mysecret instead of flush=1)
Do flush checking after startup, as per @tractorcow but if flush is indicated then make the browser reload with a one-time use token as a GET parameter

That last one is nicest, but if the site doesn't work at all because flush is needed you'd have to fall back to CLI. You might be able to combine it with error checking, so it the site has an error the flush happens anyway.

camspiers commented 11 years ago

I think that I have a possible solution to this, critique it at will. And I have started work on a PR.

The idea is instead of passing in a flush value to the instantiation of SS_ConfigStaticManifest, SS_ClassManifest, SS_ConfigManifest and SS_TemplateManifest you let them run as they would without that value (so, if the cache exists then use it, if not then generate it and set a prop indicating that a generate has happened). Later on in the execution (after the db is ready in main.php) you check for flush and admin, and then call the regenerate method on the manifests if they weren't already generated.

If you want me to continue with this solution I can likely finish today.

simonwelsh commented 11 years ago

What happens if you reference a new class in your _config.php? Flush needs to be able to happen before user code is loaded.

tractorcow commented 11 years ago

I wonder if this could be solved in a simpler way. Could there be a physical token (i.e. a file) which either locks or unlocks the ability to flush? A site could be 'secured' from flushing by having a blank _flush_exclude in the site root.

This is similar in the way that the _manifest_exclude is used to exclude a directory from being autoloaded.

This eliminates the necessity for DB or class loading, but has the downside of requiring physical access to the site root. The default case could be as the code currently works - No file means flushing is available at all times, and it could be something uploaded to productions servers when not being maintained.

camspiers commented 11 years ago

@simonwelsh That is a good point. I didn't think about that. :)

camspiers commented 11 years ago

Monitor when flush was last called, and forbid flushing too frequently

Definitely an option, but hopefully we can find something nicer.

Make flushing CLI-only

I really like this option, but I imagine that it would be a support issue. I work on one site where I don't actually have SSH access and can only run cron jobs.

Have a secret token in the codebase, require flush to be passed the token as a param (flush=mysecret instead of flush=1)

The file could be generated if it didn't exist, meaning security by default. Would it be placed in users codebase i.e. mysite and not framework or cms? I imagine this would be necessary as at least in my case, deps are handled by composer and therefore can't be modded.

Do flush checking after startup, as per @tractorcow but if flush is indicated then make the browser reload with a one-time use token as a GET parameter

Can you expand on this more?

hafriedlander commented 11 years ago

Can you expand on this more?

Two requests.

First request looks like ?flush=1. After core/Core.php is included, Director::is_flush is called. It sees this is a legitimate flush (because is_dev, or Permission::check or whatever), so does two things. First, it writes a file somewhere like $secret_token="4asdf7";, then it does header("Location: $current_location?flush=4asdf7"); die;.

Second request, before core/Core.php, checks flush against token in written file. Matches, so (a) deletes file, and (b) actually does the flush

Issue: core/Core.php might error out because the manifest needs flushing (new class referenced in _config.php a la @simonwelsh). So we need to detect an error when including core/Core.php, and if there is one, and flush=1, do the same thing as that first request anyway (write token, redirect), without actually calling Director::is_flush.

chillu commented 11 years ago

+1 for Hamish's approach. BTW, I've introduced a Security.token YAML config in order to safely switch databases via a web URL for Behat behaviour testing. At the moment it needs to be generated manually (through RandomGenerator and dev/generatetoken. Theoretically we could avoid the redirect with that. It would only become useful for flush=1 once its generated by default, presumably in the installer.

camspiers commented 11 years ago

Would this approach still support cli? I guess if flush=1 and cli then regenerate?

sminnee commented 11 years ago

IMO, CLI should just work, so the initial call can just run if Director::is_cli() is true.

chillu commented 11 years ago

Here's my current status. Its complex, and ugly. https://github.com/chillu/silverstripe-framework/tree/pulls/flush-ddos

Redirecting from Core.php is difficult, because our logic for retrieving the URL from the CGI environment isn't trivial, and not encapsulated. I don't want to start refactoring this too heavily because Im aware of Andrew's work for 3.2 bootstrapper. Even if we get the URL, a redirection won't work because the "Class not found" E_FATAL will have already been sent to the browser, meaning we can't send further headers. A temporary display_errors=0 or ob_start() could fix this, but seems like a hack? I've tried registering a fallback autoloader that triggers the manifest build instead of the PHP shutdown function, but that triggers for each class_exists() check in core for a rightfully missing class, e.g. Translatable.

Using a one-time use token has one serious disadvantage: You can't refresh the browser window on the redirection target. Particularly in template development where you need to flush every couple of seconds to check a template change, it will get highly annoying to manually rewrite the URL. We could start the session earlier in case a flush is requested explicitly, and make it a session-based token?

For now, instead of redirecting I'm building the manifest in the first request (using the PHP shutdown function). Then a message gets output after the fatal error asking to refresh the page: http://monosnap.com/image/wAQsf2wQeQSLN3SbtXuEtXjc7.png Its not pretty, and depending on your error_reporting() might not even show...

Here's some test cases I've run:

Database ready, dev mode, first build on empty cache: Flush triggered by initial loadManifest() call
Database not ready, dev mode, first build on empty cache: Builds db, flush triggered by initial loadManifest() call
Database not ready, live mode through _config.php, first build on empty cache: Builds db, flush triggered by initial loadManifest() call
Missing and undefined class. live mode through _config.php: Regenerates twice, shows error message (same on reload, as expected)
Missing but defined class, live mode through _config.php: Shows error, works on refresh
Missing but defined class, live mode through _config.php, ?flush=1: Regenerates, works on first load
Missing but defined class, dev/build, live mode through _config.php, without ?flush=1: Shows error, works on refresh

P.S.: We wouldn't have this problem in the first place if we used PSR0 autoloading, and removed our weird reliance on implicit registration through PHP interfaces like TemplateGlobalProvider and i18nEntityProvider, right?

chillu commented 11 years ago

Talked to Sam, action points:

introduce a JS refresh on class error (and plaintext error on CLI)
move DB::connect to Core.php, introduce EnvironmentUnconfiguredException to

Rough HTML code (needs cross browser testing to ensure it works after outputting the E_FATAL error message):

<p id=pleaseRefresh>Please refresh</p>
<script>
document.getElementById('pleaseRefresh').innerHTML = 'Refreshing...';
setTimeout(function() {document.location.refresh() }, 1000);
</script>

chillu commented 11 years ago

Update: https://github.com/chillu/silverstripe-framework/commit/9d6acfc1f4f85295bebd5796892bf7ea0ace4c0d @sminnee Could you peer review please?

I've added an infinite loop protection to the JS refresh, with a flushed=1 marker. This is necessary to avoid cases where a manifest flush doesn't fix the issue. Side effect is that any subsequent refreshed on this new URL with the marker won't cause a manifest build, but that's unavoidable.

On CLI, I'm just doing a die() with a note that manifest has been cleared, and the user should rerun the command.

More tests performed:

Database not ready, live mode through _config.php, first build on empty cache through CLI: Builds cache, doesn't build DB (on php cli-script.php dev), but works
- Missing but defined class, live mode, CLI: Shows error, works on refresh
- Missing but defined class, live mode, CLI, flush=1: Works on first try
- No missing classes, live mode, flush=1, not logged in: Works on first try
- No missing classes, live mode, flush=1, not logged in: die() with a message asking to login
- No missing classes, live mode, flush=1, logged in: Works on first try

chillu commented 11 years ago

Fixed for 2.4 as well. Created pull requests on github.com/silverstripe so somebody can pick up the commits in my absence (starting in a couple of hours). Please don't forget to squash the 3.0 commits after peer review, before merge. Also needs to be merged and tested into 3.1 and master.

P.S.: I don't envy @ajshort in merging this all into https://github.com/ajshort/sapphire/tree/composer :/

tractorcow commented 11 years ago

Hi @chillu, I'm a little concerned about some of these changes. moving DB::connect into core.php is going to change some of the assumptions around the code that may be relied on by other applications. A single large file is quite rigid.

I can only speak of my own experience with the code, but for instance my dynamiccache module relies on core.php to initialise the configuration system, but intercepts requests to conditionally delay or prevent execution that requires database connection. Such a change as proposed would make this impossible (at least, not without duplicating the entire code in an external module).

Wouldn't you agree that 3.1 RC is close enough to warrant delaying such a significant change?

You could also argue that I'm the only one affected though. :) It might not be the best example, but there will often be times where individually projects will need to interact with the bootstrapping process, by having their own bootstrapping files (while including core ones it depends on).

I know of about 2-3 projects where this has been necessary (one where DB connection had to be conditionally executed based on cached results, another where session loading had to be delayed to integrate with another application on the same domain).

Could core.php be broken up from a "one file to rule them all" bootstrapper into something that ties together a set of individually includable files? One for each of the sections under?

///////////////////////////////////////////////////////////////////////////////
// MANIFEST

You could still get all the functionality in this pull request, but presented as a toolbox of steps, rather than one mandatory process.

ss23 commented 11 years ago

@tractorcow While it's a valid point, I feel like leaving this in for longer could be worse. Not that I think people going to DoS you is likely, but if it exists, someone might do it, and then your only real option is to hack on the codebase, which would be worse than a (somewhat minor) compatibility break, or use a WAF or something to block the requests.

Then again, it hasn't been an issue for anyone so far... shrug

tractorcow commented 11 years ago

@ss23 That's appreciated and understood, thanks. :) Please check the considerations I've noted and let me know if it's acceptable to allow Core.php to remain extendable.

hafriedlander commented 11 years ago

For those concerned about the possibility of an attacker attempting to use this prior to a fix being released, there are server-level workarounds that will help mitigate attacks, including:

Forbid the use of the flush GET parameter except from a whitelist of IPs. Something like this in the base .htaccess should work:

RewriteCond %{QUERY_STRING} ^flush\b [NC,OR]
RewriteCond %{QUERY_STRING} &flush\b [NC]
RewriteCond %{REMOTE_ADDR} !^xx.xx.xx.xx$
RewriteRule .* - [F,L]

You could also do something similar at the top of main.php, like

if (isset($_GET['flush']) && $_SERVER['REMOTE_ADDR'] != 'xx.xx.xx.xx') die;

You could also use a secret second GET parameter or other methods of identification if you're also worried about IP spoofing.

Rate limit requests (with the flush GET parameter, or just in general) - for example with mod_qos under Apache

moveforward commented 11 years ago

@hafriedlander thanks for the suggestions. As there has been some discussion around this issue on the NZ PHP Users Group, it's also good to see the discussion here and to have some interim workarounds until a fix is released.

simonwelsh commented 11 years ago

The rewrite rules will need to also block ?flush\b and &flush\b (not sure if RewriteCond lets you use word-boundary matches)

hafriedlander commented 11 years ago

Good point Simon, I've updated that comment. Apache does seem to support \b.

sminnee commented 11 years ago

After Ingo's work was completed Hamish has reworked this quite substantially. The issue that he raised - and felt was a showstopper - was that it was now impossible to bypass DB::connect(), as it was being called in Core.php. Although I had initially this as a benefit - that way your complete SilverStripe bootstrap in non-main.php-based-scripts was simply including Core.php - it's too late in the release cycle to make that kind of API change, and we'd instead need to address that in 3.2 with some kind of way of looking after databaseless SilverStripe scripts.

The code is interesting and experimental, which is usually a dangerous combination, but it has the benefit of constraining our changes to main.php (as well as introducing a new include file with a couple of support classes).

Other users of the framework that bypass main.php (cli-script.php, custom scripts that include Core.php as a bootstrap) won't have flush protection provided, but they also won't have their behaviour changed. This late in the release cycle of 3.1 (not to mention the patching 3.0 and 2.4). Since most other scripts are not "normal" scripts - they are things like unit test scripts, bootstraps, command-line scripts, cronjobs - the absence of flush protection is an acceptable limitation, but we should include a mention of this in the release notes.

Key to Hamish's work are two new classes, ErrorControlChain and FlushToken. Both should be considered experimental (and will be marked as such in the relevant docblock), both will likely undergo significant refactoring in 3.2. ErrorControlChain is extremely interesting as it let us trap fatal errors, and could be the basis of a substantial improvement in SilverStripe's error control. FlushToken is too heavily hard-coded to the single case of the ?flush=1 querystring parameter, but could form the basis of improving the access control on some of the remaining dev-only querystring variables.

tractorcow commented 11 years ago

Thanks @sminnee , that really encompasses my feelings a lot more eloquently. Thanks also @hafriedlander / @chillu for your hard work. :)

bummzack commented 11 years ago

This latest commit triggers an infinite redirect loop whenever I add ?flush=1. What could be the reason for this?

hafriedlander commented 11 years ago

@bummzack What should happen is a single redirect that adds a second parameter, flushtoken.

If you look at the request log, do you see that token being assigned? What version of silverstripe are you using?

bummzack commented 11 years ago

@hafriedlander Yes, I get the flushtoken, but the ?flush=1 parameter is also there. So something like this: ?flushtoken=<hash>&flush=1

I'm using the latest version (HEAD) from the master branch.

hafriedlander commented 11 years ago

@bummzack Does the hash keep changing with every loop? Or does it stay the same every redirect?

If it keeps changing, for some reason core/startup/ParameterConfirmationToken.php is failing in checkToken - maybe there's a problem finding the temporary directory and it's unable to write the token.

If it stays the same, I'm not sure why that would be happening - the only time the new code will force a redirect is if there isn't a valid token provided.

bummzack commented 11 years ago

It is changing with every request. I have a setup where the framework and cms folder are symlinked into all installations on my test-machine. So this might be the root of the problem.

I'm defining the TEMP_FOLDER constant to set the tmp folder for silverstripe. Maybe there's a problem with the code not checking for TEMP_FOLDER and using some hard-coded string instead?

hafriedlander commented 11 years ago

Yes, ParameterConfirmationToken ignores TEMP_FOLDER, or rather, TEMP_FOLDER isn't set when ParameterConfirmationToken needs it, long before core/Core.php is included or any configuration is loaded.

I'm not sure of a solution ATM other than you hacking ParameterConfirmationToken to have the temp directory hardcoded in. I'll think about it.

bummzack commented 11 years ago

Apparently BASE_PATH isn't set at the time either and this seems to be the root of the problem. In a symlinked installation, it will then point to the folder where framework resides and not the webroot where framework is symlinked to.

Changing line 23 of ParameterConfirmationToken.php from:

$basepath = rtrim(dirname(dirname(dirname(dirname(__FILE__)))), DIRECTORY_SEPARATOR);

to:

$basepath = rtrim(dirname(dirname($_SERVER['SCRIPT_FILENAME'])), DIRECTORY_SEPARATOR);

solves the issue.

Is there any code outside the framework dir (eg. something that's not part of the SilverStripe Framework but rather part of my project) that runs before the code in ParameterConfirmationToken.php so that I might set BASE_PATH manually to something else? I'm setting the BASE_PATH in my _ss_environment.php file, so maybe including this somehow would be a good thing?

bummzack commented 11 years ago

I looked into this a bit more and I isolated the main issue here. It is that you have two entry-points into the token-generator, both with a different context.

The first time you check for a token is right before anything gets started up, then you have to check for the token again after the redirect to actually perform the flush.

In between these two steps, the whole configuration and DB connection gets initialized.. as do ENV settings that the user made. So if you check a constant somewhere in code like this:

if (defined('BASE_PATH')) {
    $basepath = BASE_PATH;
}

You get two different outcomes when BASE_PATH doesn't match your path assumtion/fallback exactly.. the first time you run this, BASE_PATH can't be set and therefore the fallback will come into play. The second time the token is being accessed, BASE_PATH could be set, because various configuration parameters will be loaded. So now this statement will be true and the code will use the BASE_PATH constant (which might be different than the fallback).

My suggestion here is: Skip the check for BASE_PATH constant entirely and be the authority about the path where the token should be located... but most importantly: ensure that the token path will be the same across both runs, eg. avoid external constants/variables as much as possible to minimize the risk of overrides.

hafriedlander commented 11 years ago

@bummzack Thanks for tracking that down. I added that check in imagining a situation where you've pre-defined BASE_PATH, but you're right it's not actually useful (and in fact harmful) because you can't pre-define BASE_PATH on the first call.

I've raised PR #2252 and #2253 for 2.4 & 3.0. Once the 3.0 PR is merged in, we'll merge 3.0 into 3.1, then 3.1 into master.

sminnee commented 11 years ago

I think this is unworkable. The recommended approach for handling symlinked environments is to create an index.php that defines BASE_PATH and then includes main.php, and this change would break that.

My recommendation would probably be to remove the reliance on base path for these tokens, and instead store them in /tmp. Alternatively, can we put _ss_environment.php inclusion above all of this malarky?

sminnee commented 11 years ago

Hamish and I have talked about this and the agreed solution is to move part of Core.php that sets up all the define()s into core/Defines.php, and then including that in main.php before the fush token is created. That way, we can just rely on TEMP_FOLDER.

hafriedlander commented 11 years ago

OK, another attempt, @sminnee & @bummzack do you see any problem with #2254 and #2255?

bummzack commented 11 years ago

@hafriedlander looks good. I agree that including _ss_environment.php beforehand is a good idea...

@sminnee Didn't know about that recommended approach for symlinked environments. Thanks for pointing it out.

tractorcow commented 11 years ago

I noted an issue with the current 3.1 framework which is probably related to the recent changes. Removing a module from an existing project and attempting to flush results in a crash. SS_ConfigManifest seems unflushable if there is an error which should require the cache to be invalidated.

Sorry I haven't been following the progress of this update as of late, so I can't say what's causing this.

camspiers commented 11 years ago

I have experienced the same thing. I renamed a module and had to do a manual delete of the cache.

On Tuesday, 23 July 2013, Damian Mooyman wrote:

I noted an issue with the current 3.1 framework which is probably related to the recent changes. Removing a module from an existing project and attempting to flush results in a crash. SS_ConfigManifest seems unflushable if there is an error which should require the cache to be invalidated.

Sorry I haven't been following the progress of this update as of late, so I can't say what's causing this.

[image: image]https://f.cloud.github.com/assets/936064/837647/9723ba48-f30b-11e2-97eb-c7f96846d358.png

— Reply to this email directly or view it on GitHubhttps://github.com/silverstripe/silverstripe-framework/issues/1692#issuecomment-21372431 .

hafriedlander commented 11 years ago

@tractorcow @camspiers ErrorControlChain should detect if there's an error & suppress it, allowing the flush even when in live mode & not logged in, except with BASE_URL isn't defined, in which case it allows the error through - looks like there's a code path where removing a module triggers an error before BASE_URL gets set.

The new patch #2254 and #2255 should fix that, since BASE_URL is set prior to entering the chain

sminnee commented 11 years ago

@tractorcow @camspiers - I think this might actually be fixed by @hafriedlander's ErrorControlChain.

However, Robert Curry has noticed a bug with the current implementation that I think is a showstopper. If you use "@" to suppress an error message (not especially tidy, but it happens) then the error is no longer suppressed.

camspiers commented 11 years ago

@sminnee I will test against the latest. Thanks

hafriedlander commented 11 years ago

@camspiers I haven't merged into 3.1 yet, so you'll need to use latest 3.0

camspiers commented 11 years ago

Oh, right. :)

camspiers commented 11 years ago

I'm on 3.0 @ 88d0cbe and I am experiencing the following after renaming a module:

screen shot 2013-07-23 at 11 03 13 am

And then when I try a flush:

screen shot 2013-07-23 at 11 03 21 am

PHP Version info if relevant:

PHP 5.3.26 (cli) (built: Jul  6 2013 17:24:30)
Copyright (c) 1997-2013 The PHP Group
Zend Engine v2.3.0, Copyright (c) 1998-2013 Zend Technologies

hafriedlander commented 11 years ago

@camspiers Can you confirm https://github.com/silverstripe/silverstripe-framework/pull/2257 fixes the issue?

camspiers commented 11 years ago

Yep. That fixes it :)

I still get the error without the flush (is that expected), but then I can add the flush and it works.

hafriedlander commented 11 years ago

Your issue should now by fixed in 3.1 @camspiers and @tractorcow. If you could test it again & let me know if you still have problems, that'd be great.

silverstripe / silverstripe-framework

[2013-02-28] Require ADMIN for ?flush=1 (stop denial of service attacks) #1692