bnomei / kirby3-boost

Boost the speed of Kirby by having content files of pages cached and a fast lookup based on uuids
https://sakila-with-boost.bnomei.com/
MIT License

Cannot get Boost to generate boostIDs #10

Closed trych closed 2 years ago

trych commented 2 years ago

Hi there,

I am trying to set up Boost for the first time and I am not sure if I am doing something wrong, but I can't get it to work.

Initial situation: I have a site where I need to handle a lot (~5000) of front-end form entries. Every entry ends up as its own page. As a first step, I want to make those entries navigable in the panel. As the site already slows down when I throw 1000 test entries at it, I thought that the Boost plugin could help in that case.

First question: Is that even the correct use case for the plugin? To speed up handling thousands of pages in the panel?

Now, I tried to set this up. I put this in my blueprint:

site/blueprints/pages/application.yml

fields:
  boostid:
    type: boostid

Then I created a page model for the application page:

site/models/application.php

<?php

class Application extends \Kirby\Cms\Page {
    use \Bnomei\PageHasBoost;
}

Then I call this in some template (any template is good for that, right? I just would need to open the template once?):

site/templates/someTemplate.php

<?php
  kirby()->impersonate('kirby');
  site()->boost();
 ?>

However, after calling the template, when I check the panel, the BoostID field is still empty. (I should see something there when it's working, right?)

I tried then setting the cache driver in the config file:

site/config/config.php

<?php
return [
    'bnomei.boost.cache' => [
        'type'     => 'apcu',
    ]
];

But I get this error message: (screenshot: apcu)

When I switch to memcached, I get this error message instead: (screenshot: memcached)

And when I set it explicitly to 'type' => 'file', just nothing happens.

I am running this on PHP 8.0 on a local Laravel Valet setup, as described here. I am using Kirby 3.6.2.

Any idea what could be going wrong?

Thanks a lot!

bnomei commented 2 years ago

seems like you did everything right. let's try to figure out why it's not working as intended.

let's leave the cache driver set to file at first. this will make it easier to see if caches are created. BUT eventually you will have to change it to another since they all perform better than file. most webservers should have apcu.

My first guess is that the page model is not working. Please rename the class from Application to ApplicationPage. that's something kirby needs but the docs do not explicitly say.

I will create an issue for the team about this.

then verify that the boostid field is visible but empty in the panel - not just in the txt files. this way we can make sure kirby has no issues connecting the field itself.

next create a new application page in the panel. it should get a boostid automatically.

then let's check again how many pages are being indexed. i expect any number but 0.

<?php
  kirby()->impersonate('kirby');
  $count = site()->boost();
  var_dump($count);
 ?>

there might be an issue with the sitemethod and pagesmethod both being named boost. i will check that in the meantime.

bnomei commented 2 years ago

i updated my plugin's readme to make this more clear as well

Since in most cases you will be using Kirby's autoloading for the page models, your classname needs to end in Page. Like site/models/article.php and ArticlePage or site/models/blogpost.php and BlogpostPage.
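Applied to the model from the original post, the rename would look roughly like this. It is a sketch based only on the advice above: the file stays at site/models/application.php and only the class name changes.

```php
<?php
// site/models/application.php
// Kirby's model autoloading expects the class name to be the
// template name followed by "Page" (application -> ApplicationPage).

class ApplicationPage extends \Kirby\Cms\Page {
    // Trait provided by the Boost plugin, as used in the original post.
    use \Bnomei\PageHasBoost;
}
```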

bnomei commented 2 years ago

Regarding your question whether the boost plugin is suited for that usecase: yes. most likely you will benefit from it.

if you have a lot of application pages in the same folder kirby will parse all of them since they are siblings. parse as in take a look at the folder structure and the name of the content file (which tells it the template). it will not read any files yet.

if you view a list of these in the panel (filtered, sorted, whatever) then kirby will load the txt content files for the title field. this is the moment the boost plugin will make it faster. since most servers keep hot files in RAM (and that's very fast) you will only notice an improvement in speed once you need more files than the server keeps in RAM. but 5k should be more than the average server keeps in RAM. your local dev env probably keeps none at all.

and that's why it's critical you don't use the file cache driver. with the file cache driver you just duplicate the content txt file of each page and add 2 more files per page for boost's tracking feature. but with any other driver it will be fast.
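One way to keep a single config file working both locally (where APCu may be missing) and on a server that has it is to pick the driver at runtime. This is a minimal sketch, not something taken from the plugin's docs. It assumes that falling back to the file driver is acceptable when APCu is unavailable, and it reuses the `bnomei.boost.cache` option from the original post.

```php
<?php
// site/config/config.php

// extension_loaded() and apcu_enabled() are standard PHP / APCu functions.
// The short-circuit matters: apcu_enabled() only exists when the
// extension is loaded.
$hasApcu = extension_loaded('apcu') && apcu_enabled();

return [
    'bnomei.boost.cache' => [
        // Assumption: fall back to the (slower) file driver if APCu is missing.
        'type' => $hasApcu ? 'apcu' : 'file',
    ],
];
```

With this in place the site does not error out on a machine without APCu, but the performance caveat about the file driver described above still applies there.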

bnomei commented 2 years ago

i stand corrected. the docs explain it well enough. just easy to miss 😁

trych commented 2 years ago

Thanks a lot for the detailed answers, @bnomei !

The initial issue was indeed the wrong class name in the page model. Sorry, I missed that during my setup process. 🤦‍♂️ If I give this the correct name and set the cache driver to file, it works and assigns the boost ids to all the pages.

I am afraid I have some further questions:

1.) If I switch this back to apcu, I again get the error that I posted above. I assume this might be due to me testing this in my local setup? I will give this a try on the shared server later and see if that works.

2.) I mainly need the speedup for handling a lot of records displayed in a pagetable, so while it's not a full text search per se, Kirby will need to look into the records to display the required info. (See the screenshot for an example; all entries are fake data.)

(screenshot: 20220418-161645_Screenshot_folgenlos test)

So, I assume Boost will help speed up things in this case. What I don't quite understand yet: Does it then work automatically, or do I need to make use of the boost() method somewhere and therefore overwrite some parts of the pagetable plugin?

3.) As records are not created through the panel, but via a form from the frontend, the boost id is not created automatically. I assume that during page creation I just need to call $page->boost() on the new page and then it's included in the boost setup, correct?

4.) I have read in several places that in a setup with that many records, it might speed things up when I distribute the records into subfolders instead of having them all on one main level. Is this still the case when using boost or does it not make any difference anymore in this case?

Thanks a lot!

bnomei commented 2 years ago

1) valet does not come with apcu. maybe valet+ does, not sure. you can install it with pecl but for me it did not work. to test stuff like apcu i usually spin up a simple docker image.

2a) pagetable sadly is not very fast with big collections and that will not change. boost will make loading the dataset faster and make the routes respond faster but pagetable still has to manage all the data. it really depends on what you need pagetable for. i would advise you to use a default pages section with customised html output for the info option and fixed sorting, just because performance is best then.

2b) that's the nice thing about boost. once you have it set up and pages have a boostid the rest will happen magically in the background. new pages will get an id and be added to the cache, further content reads will be from cache not file. you don't need to call boost yourself anywhere.

3) if you create a page via kirby's usual createChild with template application it should get an id due to the trait you added to the model. no need to call boost anywhere manually. if you have your own constructor you need a slightly different setup, then do let me know, but since you did not have a model before i guess you don't.

4) yes, what you read is true for the file-based approach. it still affects overall performance when using boost, but if your usecase is to always parse all applications then there is no way around it. kirby will still "parse" the folder and filenames of the full directory before boost kicks in and does its magic with the content files. personally i would try to find a way to group them into 500-1000 pages, like per month or so. you could create collections for "current-applications" that grab the most recent 3 months. or group them like a mailbox with unread, read, archive folders to make at least the unread group as small and fast as possible.
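For point 3, creating an entry from a front-end form could look roughly like the sketch below. The parent page `applications`, the slug scheme and the `$data` array are hypothetical placeholders. `createChild()` and `impersonate()` are regular Kirby methods, and per the explanation above the boostid should be added through the model's trait without any extra call.

```php
<?php
// e.g. inside the controller or route that handles the form submission

// Hypothetical, already validated form input.
$data = ['name' => 'Jane Doe', 'email' => 'jane@example.com'];

// Creating pages requires permissions, hence the impersonation
// (same call as in the snippets above).
kirby()->impersonate('kirby');

// Assumption: all entries live below a parent page with the id "applications".
$application = page('applications')->createChild([
    'slug'     => 'application-' . time(), // hypothetical slug scheme
    'template' => 'application',           // makes Kirby use ApplicationPage
    'content'  => [
        'title' => $data['name'],
        'email' => $data['email'],
    ],
]);
```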

trych commented 2 years ago

Hi @bnomei and sorry for the long radio silence, I was going through an unpleasant covid infection with my family.

I implemented this now as you described and got it to work just fine with the apcu cache on a test deployment on an uberspace shared host. Unfortunately, on the actual shared server where I needed to deploy the site (strato), neither apcu nor memcached seemed to work. I guess that's just not an option then on some web hosting services? Or is this something I can usually turn on in the hoster's settings? I was digging around a bit, but didn't find anything about apcu or memcached. So I set it to file eventually. It works now, but I guess it doesn't really help me gain performance for now.

Anyways, currently it looks like there will be not nearly as many records as were predicted by the client, so I think everything will be fine for this particular project. ;) For future projects I will be a bit more aware of what shared hoster packages have to offer.

Thanks also for the explanation of the subfolder splitting, I set it now up in a way where I have a subfolder for each day and this is much easier to handle when I browse the folder structure.
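A "current-applications" collection along the lines suggested earlier could sit on top of such per-day folders. This is only a rough sketch under stated assumptions: the parent page id `applications`, day folders whose slugs sort chronologically (for example `2022-04-18`), and the cut-off of 90 days are all hypothetical.

```php
<?php
// site/collections/current-applications.php

return function ($site) {
    return $site->find('applications') // assumed parent page
        ->children()                   // the per-day folders
        ->sortBy('slug', 'desc')       // newest day first (assumes date-like slugs)
        ->limit(90)                    // roughly the most recent 3 months
        ->children();                  // the application pages inside those days
};
```

In a template or controller this would then be available as `$kirby->collection('current-applications')`.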

I guess the issue is solved, thanks a lot for your help!