xLink / CybershadeCMS

[Abandoned][Broke] Repo for CybershadeCMS

Sub-Caching - Split the cache up #25

Closed MantisSTS closed 11 years ago

MantisSTS commented 11 years ago

So I was speaking to Linky about this one earlier.

We need to split up the caches to be in separate cache files and which one gets read from depends on which module is being used.

The core routes (etc) should be kept in the default file, for example, cache/cacheroutes.php where the custom modules one should be kept in cache{MODULE_NAME}_routes.php (or something similar)

With 50,000 routes in the system (which is not OTT), the CMS takes 120MB of memory per user, just to include the cache files.

My suggestion is, as I said, to split up which caches get loaded by splitting up the files.

NoelDavies commented 11 years ago

But the system will NEVER have 50k routes. Cysha's routing system is based on URL patterns, not exact urls, meaning we can instantly bring that down to maybe 100 routes.

xLink commented 11 years ago

Dan we have no idea what a potential setup could have in there, although I agree with the way it's setup there shouldn't be anywhere near 50k routes in there due to their dynamic nature, it might very well be an idea to split em up and see if we get any performance gains from it and or look to alternative measures

NoelDavies commented 11 years ago

Can somebody create a test file then with benchmarks?

MantisSTS commented 11 years ago

The thing is, it could have 50k routes if they have set it up incorrectly. Which I know we shouldn't have to account for, having said that, we will get the blame if that is the case and they will blame the CMS.

I brought this up because this is the main reason why Drupal and various other CMS' are slow, as they load in stupid amounts of URL routes per page which obviously brings the memory usage right up.

If we can sort it into separate cache files for the routes (i.e. per module or something) then we can reduce that load drastically.

I would love to write test files with benchmarks but I have no experience with the caching system yet.

NoelDavies commented 11 years ago

It couldn't have 50k routes at all unless they're benchmarking.

Think about it, the forum would have like 10 routes top. That includes edit, reply, delete, viewThread, viewCat, viewIndex and others.

I can't see it having 50k at all, unless each url is static (no params in the url).

MantisSTS commented 11 years ago

Thing is people have pages that they would want a friendly URL for, which could easily top a good 10k pages, plus any other modules which they could have in there, etc.

My point stands at it should be done, don't ignore it out of laziness.

And yes, a lot of them probably are static, but that still causes a shed load of memory usage.

NoelDavies commented 11 years ago

That's the thing we're trying to get away from, having shite-loads of static urls in the system. I can fully understand where you're coming from, I just think we should be influencing and educating people to our new routing system, instead of letting developers store 50k routes in the sys.

MantisSTS commented 11 years ago

Okay, so how would they set up a static URL if they wanted one, without using the URL sys and without modifying the htaccess?

I agree that we shouldn't compromise on the dynamic nature of the URL routing system, but we should have a system in place which they can use for static urls, and lots of them too if they want.

CSCMS commented 11 years ago

There's fuck all stopping em doing so; they just need to understand it's not the right way to do it in our system, and it can and will slow it to shit, that's all.

MantisSTS commented 11 years ago

Which is exactly why I'm putting this improvement forward. It shouldn't slow the system down at all (within reason). If you can make the system better/faster then we should. Even if we encourage them to use the correct way of doing it.

It should still be done.

xLink commented 11 years ago

tbf if only for the performance gain, i say we do it: split the routes up into modules, base routes always get loaded after the module's routes (if they exist) & job should be a good one. at least then it's only loading 50 routes at top end as opposed to the entire db's worth of em

Proposed filename structure: cacheroutes{NAME}.php

^ means the cache removal mechanism will still work too, just need to fuck with it in the cache class :)
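A rough sketch of the load order proposed here, written in Python purely for illustration (the CMS itself is PHP). The function name and the `available` set are hypothetical; only the `cacheroutes{NAME}.php` naming and the "module routes first, base routes after" order come from the comment above. It assumes the module name is already known when the cache is loaded.

```python
def route_cache_files(module, available):
    """Return the route cache files to load, in order.

    module    -- module name, or None when only core routes are needed
    available -- set of cache filenames that currently exist (hypothetical
                 stand-in for checking the cache directory)
    """
    files = []
    if module:
        module_file = f"cacheroutes{module}.php"
        # Module routes are loaded first, but only if that cache file exists.
        if module_file in available:
            files.append(module_file)
    # Base (core) routes are always loaded, after any module routes.
    files.append("cacheroutes.php")
    return files


existing = {"cacheroutes.php", "cacheroutesforum.php"}
forum_files = route_cache_files("forum", existing)
blog_files = route_cache_files("blog", existing)
```

With this shape, a request handled by one module never includes another module's route file.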

MantisSTS commented 11 years ago

Sounds good to me.

Get going. Chop chop. xD <3

NoelDavies commented 11 years ago

So old man, how you gonna know what routes to load? :)

MantisSTS commented 11 years ago

Query the db for a match on the module Route, then load in the cache for that module? 1 db hit is still better than 196mb of memory usage ;P

NoelDavies commented 11 years ago

But that means we first have to check all of the cache, then hit the db, generate the cache, and go.

No matter what, you have to load all the caches in, unless you hit the db for a module first, which means this is all pointless... Unless I've completely missed the point, in which case, slap me and point me in the right direction.

MantisSTS commented 11 years ago

What you talking about?

Why would you have to check the cache to hit the db then gen the cache?

You would hit the db first to find the direct route, then you would load all routes in for that module.

That way you can hit the db once, and get the associated cache file for subsequent calls and urls etc

xLink commented 11 years ago

yeah thinking about it dan is right, we are using the routes to figure out what we want to load, we don't know beforehand, so splitting em up per module is bloody useless xD

NoelDavies commented 11 years ago

Richie, that's against the point of the cache... That means we're essentially generating a new cache for that module on each page load.

.. Thinking about it, what if we only kept dynamic urls (the routes with params) in the cache, and put all the static ones in the db? As long as we index the pattern column, we should be okay regarding static urls.

Thoughts?

(NBB: http://octodex.github.com/images/gangnamtocat.png)

xLink commented 11 years ago

lmfao @ the cat..

as for your idea, it's only going to be any good if we can determine if the url itself is dynamic or static before we hit the db, otherwise they may as well stay in the cache, but i don't see any good way of figuring out if the url is static or not, without checking if it's in the db first

/about-us vs /forum/view/thread1.html

MantisSTS commented 11 years ago

Yeah, I've already suggested the static vs dynamic urls. My idea was to cache both, but in separate cache files, so you don't have to load all of the cache in at once.

Like you've both pointed out, there is no easy way of doing that.

NoelDavies commented 11 years ago

Yeah I just realised the flaw in my idea, which was the same flaw I pointed out in yours, God damn.

We all need to sit and think about this one, My only real idea is that we cache the dynamic ones, and since there shouldn't be THAT many of them, we loop through those, if there's no match, we hit the db for a match. Failing a positive match in the db, we trigger off a 404.
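A minimal sketch of that lookup order, in Python for illustration only (the CMS is PHP). The route patterns, handler names and the dictionary standing in for the routes table are all hypothetical; the two example URLs are the ones mentioned earlier in the thread.

```python
import re

# Hypothetical cached dynamic routes: compiled pattern -> handler name.
DYNAMIC_ROUTES = [
    (re.compile(r"^/forum/view/thread(?P<id>\d+)\.html$"), "forum.viewThread"),
    (re.compile(r"^/forum/cat/(?P<id>\d+)$"), "forum.viewCat"),
]

# Stand-in for the routes table holding static URLs (exact match only).
STATIC_ROUTES_DB = {
    "/about-us": "pages.about",
    "/contact": "pages.contact",
}


def resolve(url, db=STATIC_ROUTES_DB):
    """Return (status, handler, params) for a request URL."""
    # 1. Loop through the (small) set of cached dynamic patterns.
    for pattern, handler in DYNAMIC_ROUTES:
        match = pattern.match(url)
        if match:
            return 200, handler, match.groupdict()
    # 2. No dynamic match: fall back to an exact lookup in the db.
    handler = db.get(url)
    if handler is not None:
        return 200, handler, {}
    # 3. Nothing matched anywhere: trigger a 404.
    return 404, None, {}
```

The cost is one db query for every static URL and every 404, on top of the pattern loop.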

MantisSTS commented 11 years ago

Surely that'll be slower than just including the whole cache set?

xLink commented 11 years ago

lmao, may as well leave it as it is then, cause all it's doing is testing the static ones first, which as you said, there shouldn't be many in there anyway, i'm imagining only ones needed for core pages, contact, about, etc

NoelDavies commented 11 years ago

Issue closed then.

NoelDavies commented 11 years ago

I'm going to put forward that we configure the pattern column in the #__routes table to become an index. Should be a lot faster when querying.
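To illustrate what indexing the pattern column buys, here is a small sketch using SQLite from Python. This is an assumption-heavy stand-in: the thread does not say which database the CMS uses, the table is called `routes` here rather than `#__routes`, and the `module` column, index name and sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE routes ("
    "id INTEGER PRIMARY KEY, pattern TEXT NOT NULL, module TEXT NOT NULL)"
)
# Hypothetical data: 10,000 static page URLs.
conn.executemany(
    "INSERT INTO routes (pattern, module) VALUES (?, ?)",
    [(f"/page-{i}", "pages") for i in range(10000)],
)
# The proposal: index the pattern column so exact lookups avoid a full scan.
conn.execute("CREATE INDEX idx_routes_pattern ON routes (pattern)")


def find_route(url):
    """Exact-match lookup of a static URL; returns the module name or None."""
    row = conn.execute(
        "SELECT module FROM routes WHERE pattern = ?", (url,)
    ).fetchone()
    return row[0] if row else None


# Ask SQLite how it would run the lookup; the last column describes the plan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT module FROM routes WHERE pattern = ?",
    ("/page-42",),
).fetchall()
```

The index only helps exact matches like this one; matching a request against stored patterns that contain parameters would still need a different approach.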

MantisSTS commented 11 years ago

Yeah that sounds like a plan. I still think that maybe including the cache files as php arrays is a slow way of doing the cache too. Imo, we should investigate the fastest way of parsing the cache data, (JSON, serialized, php arrays, etc).

NoelDavies commented 11 years ago

Completely agree. Writing the caches can't be too pretty, and a better way, for example using JSON, should be tested. Let's just agree on one thing and stay clear of XML yeah? ;)

Although, that being said, it may take a little extra to write them, but to read them is very fast, as they're just included directly into the construct.

MantisSTS commented 11 years ago

Y U NO LIKE XML?!

Jk. Totally agree. :P

MantisSTS commented 11 years ago

Why did you close it?

This isn't finished with yet, "Writing the cache's can't be too pretty, and a better way, for example using JSON, should be tested. Let's just agree on one thing and stay clear of XML yeah? ;)"

xLink commented 11 years ago

it is finished with, the cache stores are meant to host small portions of data from the db and other stores, so we haven't got tons of code hitting the db for crap we don't need. The only instance of it not working very well is your dodgy 50k routes, which won't happen anyway.

It doesn't make sense to encode the caches in JSON, cause PHP will still have to decode it when it loads it in anyway, which is just going to add more overhead to the issue. atm it var_exports the arrays, so we can just load them in and away we go, as little overhead as possible & it works across the board. Thus being closed.