f3-factory / fatfree-core

Fat-Free Framework core library
GNU General Public License v3.0

Suggestion: transient cache mechanism #98

Open xfra35 opened 8 years ago

xfra35 commented 8 years ago

Would it make sense to have the possibility to cache values for the duration of the script?

I'm thinking about mapper optimization (lazy-loaded mappers on a collection) but that could be useful for long running scripts as well.

As for the syntax, I was thinking about $ttl=-1 to trigger the behaviour.

So if we had $cache->set('mykey',123,-1) or $db->exec($sql,$args,-1), the stored values would be available during the script life and destroyed at shutdown. I guess they would have to be stored in an array instead of the cache backend.
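To illustrate, here is a rough plain-PHP sketch of the semantics I have in mind (class and method names are made up, this is not the actual Cache API): with $ttl=-1 the value goes into a per-request array, anything else would go to the real backend.

```php
<?php
// Hypothetical sketch of the proposed ttl=-1 behaviour: transient
// values live in a plain array for the duration of the script only.
class TransientAwareCache {
    protected $transient = [];   // per-script storage, gone at shutdown
    protected $backend = [];     // stand-in for the real cache backend

    function set($key, $val, $ttl = 0) {
        if ($ttl == -1)
            $this->transient[$key] = $val;   // script-lifetime only
        else
            $this->backend[$key] = $val;     // would hit APC/Memcache/etc.
        return $val;
    }

    function get($key) {
        if (array_key_exists($key, $this->transient))
            return $this->transient[$key];
        return isset($this->backend[$key]) ? $this->backend[$key] : FALSE;
    }
}

$cache = new TransientAwareCache();
$cache->set('mykey', 123, -1);
echo $cache->get('mykey'); // 123
```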

It may be a dumb idea after all, so I'm waiting for your feedback =)

KOTRET commented 8 years ago

If caching is triggered with -1, where would the data be saved? On disk, to save memory? In memory, to save time? Even within the same request the right target differs: you would store large data on disk and frequently needed data in memory (as a simple key/value store).

In the shutdown handler some resources may not be available anymore (I did not test this...), so it might be dangerous to clear the cache there.

xfra35 commented 8 years ago

Hmmm I don't think there's an alternative to storing data in arrays. Otherwise data would be shared among scripts, which is not what we're after in this case.

KOTRET commented 8 years ago

So this should be a replacement for a simple class that acts as a key/value store, accepting key + value or key + function (with delayed execution on first request, the function being replaced by its result) as setter, and key as getter? At least that would be the plain-PHP way.

Side note: by passing -1 as TTL, I'd expect caching forever.
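The plain-PHP way I mean could look roughly like this (a made-up sketch, nothing from the framework):

```php
<?php
// A simple key/value store accepting either key + value or
// key + closure; the closure is executed lazily on first read
// and replaced by its result (delayed execution).
class Registry {
    protected $data = [];

    function set($key, $val) {
        $this->data[$key] = $val;
    }

    function get($key) {
        if (!array_key_exists($key, $this->data))
            return NULL;
        if ($this->data[$key] instanceof \Closure)
            // delayed execution: swap the closure for its result
            $this->data[$key] = call_user_func($this->data[$key]);
        return $this->data[$key];
    }
}

$reg = new Registry();
$reg->set('answer', function() { return 6 * 7; });
echo $reg->get('answer'); // 42
```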

Rayne commented 8 years ago

I can only think of one situation where the $db->exec($sql,$args,-1) example has benefits: white-box unit testing, where I know the query which will be generated and used.

Why cache the query and then execute the same query again in the same process? Wouldn't reusing the result be a better solution? Can you explain your lazy-loading idea?

@KOTRET I would expect the same, however the Cache is already using 0 for caching forever:

If $ttl is 0 then the entry is saved for an infinite time. Otherwise, the specified time, in seconds, is used as TTL.

ikkez commented 8 years ago

Sounds like you want to add a transient cache implementation that just saves everything in an array instead of a cache backend. That should be as easy as writing a custom Cache class that implements all the needed methods (get, set, del, keys) and puts everything into a global array. Check the predis thread on the board.
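Just to sketch what I mean (not actual framework code, the class and its static-array storage are hypothetical), such a transient "backend" would expose the usual method names but keep everything request-scoped:

```php
<?php
// Hypothetical transient backend: same method names a cache class
// would need (get, set, del, keys), but backed by a static array
// that only lives for the current request.
class TransientCache {
    protected static $store = [];

    function set($key, $val, $ttl = 0) {  // $ttl ignored: request-scoped anyway
        self::$store[$key] = $val;
        return $val;
    }
    function get($key) {
        return isset(self::$store[$key]) ? self::$store[$key] : FALSE;
    }
    function del($key) {
        unset(self::$store[$key]);
    }
    function keys() {
        return array_keys(self::$store);
    }
}

$t1 = new TransientCache();
$t1->set('foo', 'bar');
$t2 = new TransientCache();   // shares the same static store
echo $t2->get('foo'); // bar
```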

Regarding mapper optimization, please tell us more. As far as I can see, mappers are already "lazy-loaded": the result of the db query is stored in $this->query, and the mapper object itself is loaded with the factory method once you move the pointer to that record. Edit: ah no, I was wrong... they are all assembled before they leave the select method.

xfra35 commented 8 years ago

The custom Cache class is a good idea, but it's not very practical at the moment, because it's impossible to use multiple cache backends at once. Or am I wrong about this point?

About the lazy-loading part, here's what I had in mind:

class Product extends DB\SQL\Mapper {
  function getCategory() {
    // executes one SELECT against "categories" on every call
    return (new Category)->findone(array('id=?',$this->category));
  }
}

When looping through 100 products spread over 4 categories, we need a transient cache mechanism if we don't want to execute 100x SELECT against the "categories" table (when we need only 4).

Of course this can be achieved at the app level, but I thought that maybe the framework could help ease that kind of task =)

KOTRET commented 8 years ago

The request is perfectly valid for data that never changes during a request, although I'd have solved this in the Category class with a static array :grin: So how would you control caching in this case? Globally, per query, as an annotation? By overriding findone in Category?

xfra35 commented 8 years ago

i'd have solved this in the Category-class with a static array

That's how I would do it at the moment. I'm just wondering if the framework can't help us on this.

so how you would control caching in this case?

Per query. A single app may need all caching use cases at the same time: some queries could be perfectly valid candidates for long-term caching, others could benefit from transient caching, and the remaining queries could require a refresh on every call.

That's why I came up with the -1 TTL. My idea was to trigger a transient mode in Cache. After all, this feature could be useful for any type of data, not only db queries (e.g. file contents).

ikkez commented 8 years ago

Well, the issue I see is that even if you "cache" your categories in an array, when you have 40 categories instead of 4, all used by your 100 products, you'll still need at least 40 extra queries to the database (the N+1 query problem). You could easily fix that at app level: just pull all categories and save them in an array. This can be cached for a longer time, since categories do not change that often. The Cortex plugin uses an Identity Map to save such related models: while you are looping the products and access the first category, it fetches all categories used by all the products you are currently looping and caches them in the Identity Map, which is basically just an array keyed by the primary key of that model. On the next iteration it looks into the map first to get the needed subset, instead of querying the database, so it's only 2 queries for 100 products and 40 categories, not 41 or more.
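The Identity Map idea, stripped down to plain PHP (a rough sketch, not Cortex code; the $fetchAll callback stands in for one batched SELECT ... WHERE id IN (...)):

```php
<?php
// Minimal identity map: collect the ids the collection needs, fetch
// them in one batched query, then serve every later access from the
// map -- 2 queries total for the whole loop instead of N+1.
class IdentityMap {
    protected $map = [];   // primary key => row

    function load(array $ids, callable $fetchAll) {
        $missing = array_unique(array_diff($ids, array_keys($this->map)));
        if ($missing)
            // one "SELECT ... WHERE id IN (...)" instead of one per row
            foreach ($fetchAll(array_values($missing)) as $id => $row)
                $this->map[$id] = $row;
    }

    function get($id) {
        return isset($this->map[$id]) ? $this->map[$id] : NULL;
    }
}

$calls = 0;
$fetchAll = function(array $ids) use (&$calls) {
    $calls++;   // counts the batched queries
    return array_combine($ids,
        array_map(function($i) { return "cat-$i"; }, $ids));
};

$im = new IdentityMap();
// 100 products referencing only 2 distinct categories
$productCategories = array_merge(array_fill(0, 60, 1), array_fill(0, 40, 2));
$im->load(array_unique($productCategories), $fetchAll);
foreach ($productCategories as $cid)
    $im->get($cid);   // served from the map, no further queries
echo $calls; // 1
```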

Nevertheless, I wish we could have cache tags, so we could give cache entries a label and wipe all cache entries with a specific tag at once. This is useful when you do queries with joins or ad-hoc conditions that use data you are about to update. Maybe tagged cache entries could also have a default TTL or a flag marking them as transient-only.
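What I imagine tag-based entries could look like (a hypothetical API, nothing that exists in F3 today):

```php
<?php
// Hypothetical tagged cache: each entry may carry tags, and clearing
// a tag wipes every entry that carries it.
class TaggedCache {
    protected $store = [];
    protected $tags = [];   // tag => [key, key, ...]

    function set($key, $val, array $tags = []) {
        $this->store[$key] = $val;
        foreach ($tags as $tag)
            $this->tags[$tag][] = $key;
    }
    function get($key) {
        return isset($this->store[$key]) ? $this->store[$key] : FALSE;
    }
    function clearTag($tag) {
        $keys = isset($this->tags[$tag]) ? $this->tags[$tag] : [];
        foreach ($keys as $key)
            unset($this->store[$key]);
        unset($this->tags[$tag]);
    }
}

$c = new TaggedCache();
$c->set('product.1', 'foo', ['products']);
$c->set('product.2', 'bar', ['products']);
$c->set('config', 'baz');
$c->clearTag('products');              // wipes both product entries
var_dump($c->get('product.1'));        // bool(false)
var_dump($c->get('config') === 'baz'); // bool(true)
```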

KOTRET commented 8 years ago

@ikkez: the use case is somewhat different: of course you could first gather all categories and then select the requested one from a static cache, but that needs a controller and an implementation that knows about the intention. Florent caches only the queried one; whether this is effective or not depends on the data size and the number of distinct requests. It makes a difference in memory when using only 2 of 100 items.

So after rethinking this, :+1: for this request

ghost commented 7 years ago

Hello Florent @xfra35 ,

The mentioned use cases are overly application-specific.

Also, some of the problems you have mentioned can be resolved more effectively on the database level – views (or materialised views) combined with functions (stored procedures) and, where necessary, triggers.

Let's remind ourselves about the purpose of F3 when it was created by Bong @bcosca . It was meant to be fat-free as its very name suggests.

If I were to be frank, it has perhaps grown too much since the initial releases. I don't use most of the existing features, and I wonder whether anyone really does...