Exercise / HTMLPurifierBundle

HTML Purifier is a standards-compliant HTML filter library written in PHP.
http://htmlpurifier.org/

Permissions issue when clearing cache #22

Closed alsciende closed 8 years ago

alsciende commented 9 years ago

Since I installed Exercise/HTMLPurifierBundle, I have an error every time I clear the cache on my Symfony server:

[Symfony\Component\Filesystem\Exception\IOException]
Failed to remove file "/var/www/nrdb/app/cache/prod_old/htmlpurifier/URI/4.6.0,8d03c8ec0e84e7feb92afd4c0f1735841b5fdacf,1.ser".

And indeed, the directory app/cache/prod/htmlpurifier/HTML is owned by www-data with permission 755, so my user cannot delete the files in it.

I applied the setfacl commands to set up the permissions in app/cache and app/log, but that doesn't seem to do the trick.

$ ls -la app/cache/prod
total 656
drwxrwxr-x+  11 alsciende www-data   4096 Jan 26 03:43 .
drwxrwxr-x+   5 www-data  www-data   4096 Jan 26 03:43 ..
drwxrwxr-x+   2 alsciende www-data   4096 Jan 26 03:45 annotations
-rw-rw-r--+   1 alsciende www-data 194989 Jan 26 03:43 appProdProjectContainer.php
-rw-rw-r--+   1 alsciende www-data  70159 Jan 26 03:43 appProdUrlGenerator.php
-rw-rw-r--+   1 alsciende www-data  79081 Jan 26 03:43 appProdUrlMatcher.php
drwxrwxr-x+   3 alsciende www-data   4096 Jan 26 03:43 assetic
-rw-rw-r--+   1 alsciende www-data   4904 Jan 26 03:43 classes.map
-rw-r--r--+   1 www-data  www-data 189453 Jan 26 03:43 classes.php
drwxrwxr-x+   3 alsciende www-data   4096 Jan 26 03:42 doctrine
drwxrwxr-x+   2 www-data  www-data   4096 Jan 26 03:43 fosJsRouting
drwxrwxr-x+   3 alsciende www-data   4096 Jan 26 03:45 htmlpurifier
drwxrwxr-x+   4 www-data  www-data   4096 Jan 26 03:43 http_cache
drwxrwxr-x+   2 alsciende www-data   4096 Jan 26 03:45 sessions
-rw-r--r--+   1 alsciende www-data  27882 Jan 26 03:43 templates.php
drwxrwxr-x+   2 www-data  www-data   4096 Jan 26 03:43 translations
drwxrwxr-x+ 105 alsciende www-data   4096 Jan 26 03:43 twig
$ ls -la app/cache/prod/htmlpurifier/
total 24
drwxrwxr-x+  3 alsciende www-data 4096 Jan 26 03:45 .
drwxrwxr-x+ 11 alsciende www-data 4096 Jan 26 03:43 ..
drwxr-xr-x+  2 www-data  www-data 4096 Jan 26 03:45 HTML
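
For reference, the 755 on the HTML directory is what blocks deletion: removing a file needs write and search permission on the directory that contains it, and 755 gives the group only r-x. A minimal sketch of that check follows. The helper name groupCanRemoveEntries is hypothetical and not from any library.

```php
<?php
// Returns true when the group bits of a directory mode include both
// write (0020) and search/execute (0010), which is what removing an
// entry from that directory requires for a group member.
function groupCanRemoveEntries(int $mode): bool
{
    return ($mode & 0030) === 0030;
}

foreach ([0755, 0775] as $mode) {
    printf(
        "%04o => %s\n",
        $mode,
        groupCanRemoveEntries($mode) ? 'group can delete' : 'group cannot delete'
    );
}
```

On filesystems with POSIX ACLs, the group bits shown by ls act as the ACL mask, so a directory set to 755 also caps named-user and named-group ACL entries at r-x. That may be why the setfacl defaults on app/cache do not help here, though nothing in the thread confirms it.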

JeremieSamson commented 9 years ago

Same here, any update?

adfinlay commented 9 years ago

I too have encountered this issue; I've already checked my ACL settings.

ediblemanager commented 9 years ago

Facing the same, so had to disable the purifier and implement another interim solution. A fix would be greatly appreciated!

MasterB commented 9 years ago

+1

My workaround: Use a separate directory outside ../cache/dev and ../cache/prod

config.yml

exercise_html_purifier:
    default:
        Cache.SerializerPath: "%kernel.root_dir%/cache/htmlpurifier"

Richard87 commented 8 years ago

Hi! I also have the same problem, running version exercise/htmlpurifier-bundle dev-master 3b5842d: (Symfony 3)

The CLI user is different from the apache user, but the CLI user is also in the apache group...

Folders look like this:

[eportal@vps htmlpurifier]$ ll
total 12
dr----x--t 2 apache apache 4096 Feb 20 16:03 CSS
dr----x--t 2 apache apache 4096 Feb 20 16:03 HTML
dr----x--t 2 apache apache 4096 Feb 20 16:03 URI

And my settings:

exercise_html_purifier:
    default:
       Cache.SerializerPath: '%kernel.cache_dir%/htmlpurifier'
       Cache.SerializerPermissions: 777

NOTE: The workaround does work: '%kernel.root_dir%/../cache/htmlpurifier'
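
A possible explanation for the odd dr----x--t listing above: decimal 777 is octal 1411, which is the sticky bit plus r-- for the owner and --x for group and others. The sketch below assumes the YAML value 777 reaches chmod() unchanged as a decimal integer. That has not been traced through the bundle. The helper modeToLs is hypothetical and ignores setuid/setgid.

```php
<?php
// Render the permission bits of a directory mode the way `ls -l` shows them.
// Hypothetical helper; handles rwx triplets and the sticky bit only.
function modeToLs(int $mode): string
{
    $out = 'd';
    $flags = ['r', 'w', 'x'];
    for ($shift = 6; $shift >= 0; $shift -= 3) {
        $bits = ($mode >> $shift) & 7;
        foreach ([4, 2, 1] as $i => $bit) {
            $out .= ($bits & $bit) ? $flags[$i] : '-';
        }
    }
    // Sticky bit (01000) replaces the last character: 't' if x is set, 'T' otherwise.
    if ($mode & 01000) {
        $out[9] = ($mode & 0001) ? 't' : 'T';
    }
    return $out;
}

echo decoct(777), "\n";     // decimal 777 written in octal
echo modeToLs(777), "\n";   // permission string for decimal 777
echo modeToLs(0777), "\n";  // permission string for octal 0777
```

If that is what happened, writing the value as 0777 is the likely fix, as later configs in this thread do. Whether a leading-zero value is read as octal depends on the YAML parser version.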

spolischook commented 8 years ago

@Richard87 There is plenty of documentation about permissions in Symfony2: http://symfony.com/doc/current/book/installation.html#checking-symfony-application-configuration-and-setup Read the "Setting up Permissions" block.

Richard87 commented 8 years ago

Hi, thanks for your response @spolischook. I have read those, and everything else is working flawlessly; it's just HTML Purifier that has problems...

spolischook commented 8 years ago

@Richard87 were you able to find the piece of code that produces that error?

alister commented 8 years ago

The issue is that the SerializerCacheWarmer in the bundle only creates the directory within the Symfony cache when a site is deployed.

As @Richard87 says, the subdirectories and files within that directory are only created when the main HTMLPurifier code is run, and so they are created as the webserver user, not as the deployment user that runs the initial warmup. This means that the user that deploys the code probably does not have permission to delete the files later.

Having the bundle's SerializerCacheWarmer also run code that creates all the possible subdirectories for the serialised files would, very likely, solve the issue.
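
A minimal sketch of that idea, assuming the subdirectories are the HTML, CSS and URI ones seen in the listings in this thread. The function createSerializerDirs and its default mode are hypothetical and not part of the bundle.

```php
<?php
// Hypothetical helper (not part of the bundle): create one subdirectory per
// definition type under the serializer cache path.
function createSerializerDirs(
    string $basePath,
    array $types = ['HTML', 'CSS', 'URI'],
    int $mode = 0775
): array {
    $created = [];
    foreach ($types as $type) {
        $dir = $basePath . '/' . $type;
        if (!is_dir($dir) && !mkdir($dir, 0777, true) && !is_dir($dir)) {
            throw new RuntimeException("Unable to create $dir");
        }
        // chmod explicitly so the result does not depend on the current umask.
        chmod($dir, $mode);
        $created[] = $dir;
    }
    return $created;
}

// Demo against a throwaway directory rather than a real cache path.
$base = sys_get_temp_dir() . '/htmlpurifier-demo-' . uniqid();
$dirs = createSerializerDirs($base);
clearstatcache();
foreach ($dirs as $dir) {
    printf("%s %04o\n", basename($dir), fileperms($dir) & 0777);
}
```

This only pre-creates the directories. Files that HTML Purifier writes on its first run would still belong to whichever user triggers that run.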

spolischook commented 8 years ago

@alister maybe creating all possible subdirectories will solve the issue. Either way, I need to see some code that can resolve it.

alister commented 8 years ago

The problem of the files being generated by HTMLPurifier in its initial run (by the web-user) would still remain. I've tried to find a simple 'build cache' function within the library - without luck. So, I'm taking a more 'brute force' approach - running purify() within the warmer.

Since a Cache Warmer can take a service as an argument, I'm adding it in there:

<service id="exercise_html_purifier.cache_warmer.serializer" class="%exercise_html_purifier.cache_warmer.serializer.class%">
  <argument>%exercise_html_purifier.cache_warmer.serializer.paths%</argument>
  <argument type="service" id="exercise_html_purifier.default" />
  <tag name="kernel.cache_warmer" />
</service>

And the PHP:

public function __construct(array $paths, HTMLPurifier $htmlPurifier)
{
    $this->htmlPurifier = $htmlPurifier; // new; the existing $paths handling is unchanged
}

To build the caches [HTML, CSS, URI] within the warmer, run the purify() function with enough content to need to build all of the definitions:

$this->htmlPurifier->purify('<div style="border: thick">-2</div>');
$this->htmlPurifier->purify('<div style="background:url(\'http://www.example.com/x.gif\');">');

These create the seriali[zs]ed files at the same time, and with the same user as the rest of the symfony cache.

The warmup() function can take multiple directories, but currently, this code only passes in a single service ("@exercise_html_purifier.default"), which would probably not be enough to handle more than one path being setup.

All of the code/config to fix this issue is above, I'm happy to make that into a PR if you want - but I believe it would solve the 95+% case of the files being created with the web-user, and not the deployment user (causing problems on attempted deletion).
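
Put together, the approach described above might look roughly like the following sketch. It is not the code from the PR. PurifierLike and SerializerCacheWarmerSketch are hypothetical stand-ins so the example runs without Symfony or HTML Purifier installed. The only library method assumed is purify(), which the comment above already uses.

```php
<?php
// Hypothetical stand-in for the one method used here.
interface PurifierLike
{
    public function purify($html);
}

// Sketch of a warmer: create the cache paths, then purify sample markup so
// the definition caches are built by whoever runs the warm-up.
final class SerializerCacheWarmerSketch
{
    private $paths;
    private $htmlPurifier;

    public function __construct(array $paths, PurifierLike $htmlPurifier)
    {
        $this->paths = $paths;
        $this->htmlPurifier = $htmlPurifier;
    }

    public function warmUp($cacheDir)
    {
        foreach ($this->paths as $path) {
            if (!is_dir($path) && !mkdir($path, 0777, true) && !is_dir($path)) {
                throw new RuntimeException("Unable to create $path");
            }
        }
        // Sample inputs taken from the comment above.
        $this->htmlPurifier->purify('<div style="border: thick">-2</div>');
        $this->htmlPurifier->purify('<div style="background:url(\'http://www.example.com/x.gif\');">');
    }
}

// Demo with a recording fake instead of the real library.
$fake = new class implements PurifierLike {
    public $calls = [];

    public function purify($html)
    {
        $this->calls[] = $html;
        return $html;
    }
};
$path = sys_get_temp_dir() . '/htmlpurifier-warm-' . uniqid();
(new SerializerCacheWarmerSketch([$path], $fake))->warmUp($path);
echo count($fake->calls), " purify calls\n";
```

As noted above, a single purifier service covers only one profile. Handling several paths would presumably need one purifier per path.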

alister commented 8 years ago

Are there any thoughts on my PR #37?

spolischook commented 8 years ago

So, to resolve this issue you need to use the same user for the console and web applications.

For Nginx: edit user in nginx.conf (on Ubuntu, /etc/nginx/nginx.conf) http://stackoverflow.com/a/18004182/2119164
For php-fpm: edit www.conf (on Ubuntu, /etc/php5/fpm/pool.d/www.conf) http://unix.stackexchange.com/a/30191/144170
For Apache: edit apache2.conf

This is enough for development. If you have any questions or related issues, feel free to reopen this issue.

Richard87 commented 8 years ago

Hi, yes, unfortunately that doesn't help me, and many others I would guess... I'm hosting many projects on my server, all with their own users, so changing the user of php-fpm doesn't really help.

Also, I think your solution is more of a workaround than a solution/fix to a problem that needs a solution/fix...

Just my 0.02$...

spolischook commented 8 years ago

@Richard87 your users must use deploy tools like Ansible, and the cache directory must be shared between deployed versions of the application. I just tried changing my php-fpm user and installing Symfony 3 with HTMLPurifierBundle, and I had the same problem with other cache directories, like Twig's. So it's not just a problem of HTMLPurifierBundle; it's a common problem with cache in Symfony.

alister commented 8 years ago

You are right - having a library write into the main cache directory during runtime will often end up with a problem similar to the one that started this very issue back in Jan 2015. If you set up, say, Doctrine Cache to write to a file in app/var/cache, or var/cache, you would see it as well, which is why I put those into a system temp directory, like /tmp/app-name/cache.

The reason it's not generally an issue with Symfony applications is that it builds the cache once - when the application is first run, and often during the command-line-run initialisation phase, before it is then put live onto the web.

The htmlpurifier library writes its own cache, but those cached files are not currently properly controlled by this bundle that wraps it - despite previous attempts with file permissions and filesystem ACLs. My PR#37 puts just enough work inside the bundle's own cache warmer to be able to ensure that the files that are written, are also owned by the same user as the rest of the cache, as Symfony and other bundles do.

I would very happily describe it as an 'elegant hack', and the only way that it could be more 'elegant' would be to have a simple function within the htmlpurifier library that would do everything required to build all the caches that are possible to exist, and be used later. At a minimum though, I believe that function would simply contain something very similar to the two lines I added to the existing attempt of using the SerializerCacheWarmer::warmUp function.

Richard87 commented 8 years ago

Well, Symfony respects the umask command in my index file, so each cache file and folder is created with the correct user, group and with the correct permissions...

Unfortunately HTML Purifier ignores these settings...
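
To illustrate the difference being described: a mode passed to mkdir() is filtered by the umask, while an explicit chmod() is not. The sketch below assumes a POSIX filesystem with no default ACLs on the temp directory. It is not verified against HTML Purifier's source. The thread only establishes that its directories end up as 755.

```php
<?php
// Demo directory under the system temp dir (hypothetical name).
$base = sys_get_temp_dir() . '/umask-demo-' . uniqid();
mkdir($base, 0777);

$old = umask(0002);               // e.g. what a front controller might set

mkdir("$base/respects", 0777);    // requested mode is filtered by the umask
mkdir("$base/overrides", 0777);
chmod("$base/overrides", 0755);   // explicit chmod is not filtered by the umask

umask($old);                      // restore the previous umask
clearstatcache();

printf("respects:  %04o\n", fileperms("$base/respects") & 0777);
printf("overrides: %04o\n", fileperms("$base/overrides") & 0777);
```

Under those assumptions the first directory comes out group-writable (775) and the second does not (755), whatever the umask says.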

spolischook commented 8 years ago

@Richard87 Please update HTMLPurifierBundle to the latest version and test again.

Richard87 commented 8 years ago

---Hi, I'm on dev-master (3b5842de5e7ffee2c360eb7746519797a2fbb0f6)---

Updating... Thanks :)

spolischook commented 8 years ago

@Richard87 let me know if this resolves your issue.

dannyvw commented 7 years ago

I still have the same issue. Any idea?

JeremieSamson commented 7 years ago

I solved mine with this

exercise_html_purifier:
    default:
      Cache.SerializerPermissions: 0777

Tell me if it works for you.

dannyvw commented 7 years ago

I already have that in my configuration. Current config:

exercise_html_purifier:
    editor:
        Cache.SerializerPath: '%kernel.cache_dir%/htmlpurifier'
        Cache.SerializerPermissions: 0777

All permissions are correct, except for htmlpurifier

adilbaig commented 7 years ago

I also had the same issue with htmlpurifier and doctrine. We have various crons running Symfony commands as a different user. We clear the cache on deploy. This plugin generates its caches late and, like doctrine, it doesn't give world-writable permissions to the files. So www-data is not able to refresh the cache.

The solution I came up with was to split the root path for caching of web requests. I patched app.php and app_dev.php like this:

...
require_once __DIR__ . '/../app/AppKernel.php';

class AppKernelPatched extends AppKernel
{

    public function getRootDir()
    {
        return parent::getRootDir() . '/../app/';
    }

    public function getCacheDir()
    {
        return $this->rootDir . '/cache/web/' . $this->environment;
    }
}

$kernel = new AppKernelPatched('dev', true);
...

If you don't mind losing the cache every once in a while, this should work for you.

dannyvw commented 7 years ago

We also clear the cache on every deploy. It looks like htmlpurifier cannot create the directory with the right permissions during a cache warmup. We already have a custom cache directory defined:

    public function getCacheDir()
    {
        return dirname($this->getRootDir()) . '/var/cache/' . $this->getEnvironment();
    }

HeahDude commented 6 years ago

https://github.com/heahprod/HTMLPurifierBundle/pull/2 should fix it for custom profiles too, if it eventually gets merged in #46.

Cheers!