brefphp / symfony-bridge

Bref runtime to run Symfony on AWS Lambda
MIT License
47 stars 14 forks source link

Symfony's capability to run on read-only out of the box #21

Open Nemo64 opened 4 years ago

Nemo64 commented 4 years ago

I kind of started this library by positing my code which copies cache files.

But https://github.com/brefphp/symfony-bridge/pull/17#discussion_r407191775 made me check again and realise that symfony (4.4) works without modifications if the cache is properly warmed and configured.

So I investigated.

setup

I use a symfony multipage application which i'm not totally comfortable to share (it is a real project) but its a symfony 4.4 skeleton with doctrine/orm and a user entity in a crud for user as well as some email handling

I changed the following files for logging:

# config/prod/monolog.yml
monolog:
    handlers:
        main:
            type: stream
            path: "%env(resolve:LOG_FILE)%"
            level: debug
        console:
            type: console
            process_psr_3_messages: false
            channels: ["!event", "!doctrine"]
# .env
# this is for local development as usual
LOG_FILE=%kernel.logs_dir%/%kernel.environment%.log
# serverless.yaml
provider:
  # [...]
  environment:
    LOG_FILE: php://stderr

And for completeness I changed the cache directory:

# config/packages/cache.yaml
framework:
    cache:
        directory: '/tmp/pools'

It's important that the directory option only changes the directory of the cache.adapter.filesystem (defined here) and not the cache.adapter.system which is the cache we warm.

I also note that I test using symfony 4.4.7 with php 7.3 so not the newest setup but a realistic one.

result

It seems to work ok.

I have a lot of warnings in the log from not being able to write like these:

[2020-04-12T21:30:21.967332+02:00] cache.WARNING: Failed to save key "isWritable.a%3A3%3A%7Bi%3A0%3Bs%3A50%3A%22Symfony%5CComponent%5CSecurity%5CCore%5CUser%5CUserInterface%22%3Bi%3A1%3Bs%3A5%3A%22roles%22%3Bi%3A2%3Ba%3A0%3A%7B%7D%7D" of type boolean: file_put_contents(/var/task/var/cache/prod/pools/nkosAA2WOE/5e936c4de02108.84920836): failed to open stream: Read-only file system {"key":"isWritable.a%3A3%3A%7Bi%3A0%3Bs%3A50%3A%22Symfony%5CComponent%5CSecurity%5CCore%5CUser%5CUserInterface%22%3Bi%3A1%3Bs%3A5%3A%22roles%22%3Bi%3A2%3Ba%3A0%3A%7B%7D%7D","exception":"[object] (ErrorException(code: 0): file_put_contents(/var/task/var/cache/prod/pools/nkosAA2WOE/5e936c4de02108.84920836): failed to open stream: Read-only file system at /var/task/vendor/symfony/cache/Traits/FilesystemCommonTrait.php:96)"} []

These warnings appear when a form is accessing an Entity. I found this issue talking about it.

Other than that I can't find any immediate issues. The console works fine too.

So what's are we doing here?

When I started using symfony I could not get my project to work on a read-only filesystem (I don't know why anymore). I used the documentation which had the rewriting to the /tmp folder in it but that was too slow for my taste so I started to copying the cache to tmp with some exception wich is the state this library is at right now.

Now that I know that symfony almost works out-of-the-box I would probably choose a different path. One could overwrite/decorate the cache.adapter.system service (maybe even in a normal bundle) to include overlay logic and to my current knowledge it would work perfectly.

Now there are still advantages to the copy approach here: It works in every case, even when the cache isn't warmed or isn't properly warmed. It even works in dev mode with profiler as long as you didn't deploy the profiler folder. Even though you will get problems when multiple lambda instances are running there. The question is if that is important enough.

Opinions? Did I overlook something?

mnapoli commented 4 years ago

Thank for you reporting this, that is consistent to what I have seen.

I would definitely go the route of keeping the cache that we deployed (without copying/symlinking).

I'm not sure I understand what you mean by "it might be a much more elegant idea to patch the system and filesystem cache and not copy anything", but it's late for me ^^

Do you mean that: all cache should stay in var/ except for user cache (the "pools" stuff)?

Nemo64 commented 4 years ago

What I mean is a custom filesystem cache adapter that is aware that the target directory might not be writable and creates an overlay.

I have quickly thown this together (untested and probably not the most beautiful):


use Symfony\Component\Cache\Adapter\AbstractAdapter;
use Symfony\Component\Cache\Adapter\FilesystemAdapter;
use Symfony\Component\Cache\Marshaller\MarshallerInterface;

class OverlayFilesystemAdapter extends AbstractAdapter
{
    /**
     * @var FilesystemAdapter
     */
    private $base;

    /**
     * @var FilesystemAdapter|null
     */
    private $overlay = null;

    public function __construct(string $namespace = '', int $defaultLifetime = 0, string $directory = null, MarshallerInterface $marshaller = null)
    {
        parent::__construct($namespace, $defaultLifetime);
        $this->base = new FilesystemAdapter($namespace, $defaultLifetime, $directory, $marshaller);

        // if the directory is not writable, create an overlay
        if ($directory !== null && !is_writeable($directory)) {
            $this->overlay = new FilesystemAdapter($namespace, $defaultLifetime, null, $marshaller);
        }
    }

    protected function doClear(string $namespace)
    {
        if ($this->overlay !== null) {
            throw new \LogicException("Can't clear overlayed filesystem cache.");
        }

        return $this->base->doClear($namespace);
    }

    protected function doDelete(array $ids)
    {
        if ($this->overlay !== null) {
            throw new \LogicException("Can't delete overlayed filesystem cache.");
        }

        return $this->base->doDelete($ids);
    }

    protected function doFetch(array $ids)
    {
        $values = $this->base->doFetch($ids);

        if ($this->overlay !== null) {
            $values = array_replace($values, $this->overlay->doFetch($ids));
        }

        return $values;
    }

    protected function doHave(string $id)
    {
        if ($this->overlay !== null) {
            if ($this->overlay->doHave($id)) {
                return true;
            }
        }

        return $this->base->doHave($id);
    }

    protected function doSave(array $values, int $lifetime): array
    {
        if ($this->overlay !== null) {
            $this->overlay->doSave($values, $lifetime);
            return; // don't save to the base implementation
        }

        return $this->base->doSave($values, $lifetime);
    }
}

This could be configured in place of the normal cache adapter

# config/packages/cache.yaml 
framework:
    cache:
        system: cache.adapter.overlay_filesystem # the hypothetical overlay adapter

        app: cache.adapter.filesystem # still the default filesystem adapter for max compatibility
        directory: '/tmp/pools' # but write app cache into /tmp since there is normally nothing warmed in it
Nemo64 commented 4 years ago

Just copying the pools folder is what we do right now. In fact we probably have the most elegant implementation possible without overwriting private details or defining another cache adapter.

The problem is that the pools directory is hardcoded in the system cache definition based on the %kernel.cache_dir%, otherwise we wouldn't need all those generated symlinks because we could just overwrite that directory definition.

That said, we could also recommend defining the system cache as a normal filesystem cache like this:

# config/packages/cache.yaml 
framework:
    cache:
        system: cache.adapter.filesystem # instead of the default cache.adapter.system
        app: cache.adapter.filesystem # this is the default
        directory: '%env(CACHE_POOL_DIR)%' # something that is "%kernel.cache_dir%/pools" during warmup but "/tmp/pools" in the lambda environment
        # because the system cache is now using the default filesystem adapter, it is also influenced by the directory option

Then we could copy just that one directory during boot. An overlay implementation would remove the need to copy during boot but introduce a slight runtime overhead.

Now going back to the current implementation: The nice thing is that it is fairly easy to explain how to setup because there is no config file editing needed and works with or without pre-warming and it has nearly no runtime overhead compared to any overlay implementation besides the longer boot time.

So just to summarize my thoughts:

Nemo64 commented 4 years ago

I did some more research. The smallest solution i found to have no cache write fail (in my app) while deploying as much cache as possible is this:

framework:
    cache:
        # those 2 options are the default, but it helps understanding
        system: cache.adapter.system
        app: cache.adapter.filesystem

        # make it possible to move the app cache dir out of the "%kernel.cache_dir%/pools"
        # important: the "cache.adapter.system" is hardcoded to "%kernel.cache_dir%/pools"
        directory: '/tmp/%kernel.environment%_pools' # only influences the filesystem adapter

        pools:
            # overwrite some system caches which are not properly warmed
            # those components will now write to the "cache.app" which is writable
            cache.property_info: ~
            cache.serializer: ~
            cache.validator: ~

I defined the 3 pools which shadow the private implementations of those 3 components because the pools option defines a service with the same name. Those caches will then no longer write into the system cache but in the app cache which is a normal filesystem cache by default which i can dynamically change to /tmp using the directory option.

Again, It relies on the private service definition of those components. It also isn't perfect because i throw away some cache entries of the validator because the cache isn't fully warmed. But it's still interesting to see which components create the problems. Although I wouldn't recommend it, it's pretty damn elegant by being small.

Nemo64 commented 4 years ago

The more I dig, the more i find that you two (@Nyholm, @mnapoli) were already on it long before me 😉 https://github.com/symfony/symfony/issues/29357, https://github.com/symfony/symfony/issues/31214 it seems that the answer always is to enable apcu and to ignore the warning. This isn't a good solution because there might be cache warnings i want to have. But interesting is that in the second issue the recommendation is to overwrite the private cache defintion. So I think what I'm doing in my last comment might be better than i think?

More information about cache the problematic cache warmers

It seems that the annotation warmer looks into every file in the composer classmap: Symfony\Component\HttpKernel\DependencyInjection\AddAnnotatedClassesToCachePass This means to completely warm the cache, you have to run composer install -o first so the classmap includes all files.

The validator and the serializer component, on the other hand, does only warm xml or yaml definitions. Symfony\Bundle\FrameworkBundle\CacheWarmer\ValidatorCacheWarmer::extractSupportedLoaders Symfony\Bundle\FrameworkBundle\CacheWarmer\SerializerCacheWarmer::extractSupportedLoaders

The property info component has no warmer at all (or i didn't find it).

So contributing to symfony would require to build a cache warmer for the property-info component and extending the validator and serializer warmers to check all classes similar to the validator warmer.

mnapoli commented 4 years ago

This means to completely warm the cache, you have to run composer install -o first so the classmap includes all files.

Good to know! That is definitely something that we can document.

The validator and the serializer component, on the other hand, does only warm xml or yaml definitions. The property info component has no warmer at all (or i didn't find it).

Just to be sure I understand, does that mean that these components will not be compatible with a read-only cache, even if it is warmed before deploying?

Nemo64 commented 4 years ago

@mnapoli Correct. The warmers would need to be improved to work similar to the annotation warmer.

But you can move these specific caches into /tmp as I described in https://github.com/brefphp/symfony-bridge/issues/21#issuecomment-612695723 without affecting system caches which can be properly warmed.

mnapoli commented 4 years ago

OK I think I'm starting to get it, thanks :)

At the moment, you identified 3 components that write to the cache at runtime. In the end, there might be more than that.

Would it make sense to:

?

Nemo64 commented 4 years ago

Good idea~ I found 4 caches which inherit from cache.system directly. Those are:

But we can just redefine cache.system as a normal filesystem cache. This doesn't even need to be aware of private implementation details.

framework:
    cache:
        system: cache.adapter.filesystem # this is the important part
        app: cache.adapter.filesystem
        directory: '/tmp/%kernel.environment%_pools'

There is still a lot of pre warmed cache, not just the compiled container. There are also twig templates, translations and doctrine proxies.

I just ran that though the same benchmark i just wrote in https://github.com/brefphp/symfony-bridge/issues/18#issuecomment-612914234 with php 7.4 and preload enabled.

REPORT RequestId: 77001e0e-43e8-4da9-8c7b-5523fd5a7d41  Duration: 655.99 ms Billed Duration: 1300 ms    Memory Size: 1024 MB    Max Memory Used: 131 MB Init Duration: 628.13 ms    
REPORT RequestId: 2598f4a9-f93e-4579-b98d-e2ebc05ec9a5  Duration: 58.39 ms  Billed Duration: 100 ms Memory Size: 1024 MB    Max Memory Used: 131 MB 

So I don't know how much my app uses the pre-warmed cache in that case but i assume the annotation cache is probably the biggest hit. There seems to be a ~40 ms hit on my first request with obviously no hit on the second because it is correctly cached.

But the hit is smaller than the hit of copying the pools folder so this could be a good compromise. Especially since this is a 3 line config solution that uses a documented and public api.

mnapoli commented 4 years ago

But the hit is smaller than the hit of copying the pools folder so this could be a good compromise.

Right!

Especially since this is a 3 line config solution that uses a documented and public api.

Yes, I like when it's simple and stable ^^ At least for a first version that we can release and test with. We can still experiment around these ideas, but we'll have a stable baseline that people can start using and that we can compare to.

Nemo64 commented 4 years ago

Now instead of blacklisting specific caches, i can also whitelist so i know that the annotation cache works fine so...

framework:
    cache:
        # [...]
        pools:
            cache.annotations: 
                adapter: cache.adapter.system
REPORT RequestId: 8706fe45-7872-4c11-b249-dbe2a5ee24a6  Duration: 643.48 ms Billed Duration: 1300 ms    Memory Size: 1024 MB    Max Memory Used: 130 MB Init Duration: 623.18 ms    
REPORT RequestId: 888fad12-26b1-4549-b7b4-9e835693ddba  Duration: 54.51 ms  Billed Duration: 100 ms Memory Size: 1024 MB    Max Memory Used: 131 MB 

Does not seem to make a significant difference in my 1 test at ~10 ms. But it's just one test and probably not worth the added complexity.