idno / known

A social publishing platform.
https://withknown.com/opensource

Homepage slowness #934

Closed markwaters closed 8 years ago

markwaters commented 9 years ago

I am running a Known blog on a Banana Pi

https://en.wikipedia.org/wiki/Banana_Pi

With just a few hundred posts, mostly images, checkins and status updates, the system is nice and fast, taking about 10 seconds to load and display the homepage.

I recently set up another Known instance on the same server. Into this instance I have imported about 28000 posts from my old WordPress blog without any problems.

On this larger site, after I finish editing and publishing a post, I click on the blog title to return to the homepage. This takes 59 seconds to load.

Thinking the slowness might be a disk speed issue, I installed a SATA hard disk and moved the MySQL and Apache2 directories from the SD card to the SATA hard disk.

Another test and it still takes 59 seconds.

I just wanted you to know as you may not have many other users who have reached 28000 posts yet.


mapkyca commented 9 years ago

Without actually benchmarking it on a Banana Pi I'd say it's almost certainly infrastructure rather than code.

Probably memory, or lack thereof, with the Apache and/or MySQL processes.

While the data model, like Elgg's, is designed more for flexibility than raw speed, it should easily be able to handle 28000 posts, especially as Known sites generally have far more reads than writes. It tends to be concurrent active writes, with the corresponding index updates, that kill performance on databases, although even that is less of a problem with modern database engines.

paulcmal commented 9 years ago

A Banana Pi is probably more than enough to host so much data. The problem is we have a lot to optimize first.

On a per-client basis, I think #873 and #914 are relevant. Apart from this, #972 looks like a field in which "some" optimization could take place.

In the meantime, there are already a few steps you can take to reduce your loading times :

If you want to get deeper into client-side optimization, you might want to recompile nginx with the ngx_pagespeed module. It is more efficient to some extent although more intrusive (aggressively modifies some content on the fly).

I'm not sure this will allow you to run your 28 000 posts so smoothly on a daily basis (not until some serious code-side optimization has been done anyway), but at least the query cache should make the loading of the homepage faster.

Also, have you tried gathering some data about what actually takes so long? You say it took 10-60 seconds to display the page, but did your browser wait that long before actually receiving any response from the server? This would give you a good indicator of how much you can gain on queries and data processing before digging into potentially more complex networking and client-side optimizations.
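One way to gather that data is to time the request outside the browser. The following is a minimal sketch using only the Python standard library; the function name `time_request` is made up for illustration and is not part of Known. It separates the wait for the first byte of the response (server-side work: PHP and database queries) from the time to download the full page.

```python
import time
import urllib.request


def time_request(url, timeout=120):
    """Return (seconds until first body byte, seconds until full body) for a GET."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)  # blocks until the server sends the first body byte
        first_byte = time.perf_counter() - start
        response.read()   # drain the rest of the body
        total = time.perf_counter() - start
    return first_byte, total
```

Calling `time_request` with the homepage URL returns both timings. If nearly all of the time is spent before the first byte arrives, the bottleneck is on the server rather than in rendering or asset loading on the client.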

I'm sorry I don't have time to write a tutorial about using Known with these technologies, although I've tried to give you a few insightful links. I run Known instances on nginx + php-fpm + mariadb (and a test one on HHVM) and they run just fine!

markwaters commented 9 years ago

Hi @paulcmal and thanks for the suggestions.

After @mapkyca's suggestion I switched the webserver from the Banana Pi to a 2 GHz dual-core x64 server with 3 GB of RAM and a SATA drive. Judging by the bogomips in /proc/cpuinfo it's about 4x faster. Reloading the homepage now takes 16 seconds, which makes sense.

I've used nginx previously but currently don't really want to switch from Apache. Would it make a noticeable difference? I thought PHP would be the bottleneck.

In the virtualhost configuration I have 'AllowOverride None' and load the .htaccess file once as a single include , like nginx.

I have installed and configured memcached and the pagespeed module for Apache2.

This morning I also switched from MySQL to MariaDB, something I have been wanting to do for a while. I didn't realise it was so easy, thanks again!

mapkyca commented 9 years ago

This still seems a little slow. Have you got slow query logging on, including logging of queries that don't use an index?

kylewm commented 9 years ago

@paulcmal's suggestions are awesome -- they would make a great performance-tuning post about Known, but I don't think any of them are going to get the order of magnitude improvement you need. You're right that nginx/apache shouldn't make that much difference, and your system should have enough CPU and RAM that I think you can basically rule out memory issues. Tonight I will try loading up my test install with 30k fake entries and see if I can reproduce the problem...

mapkyca commented 9 years ago

Bonus points: might be worth putting a stress test in as a unit test, but disabled in the travis build.
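As a rough illustration of that pattern, here is a sketch in Python rather than in Known's own PHP test suite. Everything in it is an assumption made for the example: the `entities` table, its columns, the 30,000-row count (borrowed from the "30k fake entries" idea above) and the arbitrary time budget. SQLite stands in for the real database. The skip condition relies on Travis CI setting the `TRAVIS` environment variable to `true` in its builds.

```python
import os
import sqlite3
import time
import unittest

ENTRY_COUNT = 30_000        # roughly the size of the site discussed in this issue
TIME_BUDGET_SECONDS = 2.0   # arbitrary threshold, chosen only for this sketch


@unittest.skipIf(os.environ.get("TRAVIS") == "true", "stress test disabled on Travis")
class HomepageStressTest(unittest.TestCase):
    def setUp(self):
        # Hypothetical schema: it does not mirror Known's real tables.
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE entities ("
            "id INTEGER PRIMARY KEY, subtype TEXT, created INTEGER, body TEXT)"
        )
        subtypes = ("status", "photo", "checkin")
        self.db.executemany(
            "INSERT INTO entities (subtype, created, body) VALUES (?, ?, ?)",
            ((subtypes[i % 3], i, f"fake entry {i}") for i in range(ENTRY_COUNT)),
        )

    def tearDown(self):
        self.db.close()

    def test_latest_entries_query_is_fast(self):
        # Time a homepage-style "ten most recent entries" query.
        start = time.perf_counter()
        rows = self.db.execute(
            "SELECT id, subtype, body FROM entities ORDER BY created DESC LIMIT 10"
        ).fetchall()
        elapsed = time.perf_counter() - start
        self.assertEqual(len(rows), 10)
        self.assertLess(elapsed, TIME_BUDGET_SECONDS)
```

Saved as a module, this can be run locally with `python -m unittest`, while a Travis build would report the whole class as skipped.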

mapkyca commented 9 years ago

Aha.... there do appear to be a few queries on the homepage that aren't using indexes very well... will have a play....
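To show what "not using indexes very well" looks like in practice, here is a small self-contained sketch. It uses SQLite from the Python standard library as a stand-in for MySQL, and the `entities` table, its columns and the index name are hypothetical, not Known's actual schema or queries. It prints the query plan for a homepage-style query before and after adding a composite index.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical table, for illustration only.
db.execute(
    "CREATE TABLE entities ("
    "id INTEGER PRIMARY KEY, subtype TEXT, created INTEGER, body TEXT)"
)
subtypes = ("status", "photo", "checkin")
db.executemany(
    "INSERT INTO entities (subtype, created, body) VALUES (?, ?, ?)",
    ((subtypes[i % 3], i, f"fake entry {i}") for i in range(30_000)),
)

QUERY = "SELECT * FROM entities WHERE subtype = ? ORDER BY created DESC LIMIT 10"


def query_plan(connection):
    """Return the human-readable plan steps SQLite reports for QUERY."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + QUERY, ("status",)).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


plan_without_index = query_plan(db)
db.execute("CREATE INDEX idx_subtype_created ON entities (subtype, created)")
plan_with_index = query_plan(db)

print("without index:", plan_without_index)
print("with index:   ", plan_with_index)
```

Without the index the plan is a full table scan followed by a sort; with it, the database can search the index directly. On MySQL or MariaDB the equivalent check would be running `EXPLAIN` on the slow queries.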

mapkyca commented 9 years ago

Question: if you limit your homepage to display just the most common entity type (e.g. status), is there any difference?

mapkyca commented 9 years ago

Or for that matter, if you select "all content" from the dropdown?

markwaters commented 9 years ago

After a few timings, I am seeing a 'status updates' only view display the page in about 8 seconds, while an 'all content' view takes the usual 16 seconds.

As my 'desktop' device is a sluggish Cubietruck board running Linux, Xfce and Firefox, I have also tried the same tests from my phone, an old Samsung S2 with Firefox. There I see 'status updates' in 5 seconds and 'all content' in 11 seconds.

So both devices take about twice as long for 'all content' as for 'status updates'.

HTH

benwerd commented 8 years ago

Leaving this ticket open, but it's worth noting that there have been some database enhancements recently which may help.

mapkyca commented 8 years ago

Absent specifics that we can look at, shall we close this?