pkp / pkp-lib

The library used by PKP's applications OJS, OMP and OPS, open source software for scholarly publishing.
https://pkp.sfu.ca
GNU General Public License v3.0
306 stars 445 forks source link

OJS is very slow with a large table of contents (hundreds of items) #6511

Closed diegoabadan closed 1 year ago

diegoabadan commented 3 years ago

We published conference proceedings in OJS. When we updated one of them to OJS 3.2.1-2, access to the table of contents became extremely slow. We note that there is no specific SQL that takes too long, but the number of SQL queries is high.

Nothing appears until all queries are completed, and the sum of them plus PHP processing is a bottleneck.

Highlights:

asmecher commented 3 years ago

@diegoabadan, is your database server on the same system as the PHP is executing?

diegoabadan commented 3 years ago

No, the database is on a nearby server that also serves other instances of OJS, connected via a private network.

The persistent = Off option doesn't seem to help.

asmecher commented 3 years ago

@diegoabadan, we've had a report or two that the overhead imposed by running a database that's on a separate server is considerable. It's probably the cumulative effect of network overhead for each of the queries, and the total effect is considerable. Is it an option to try running the database on the same system?

diegoabadan commented 3 years ago

I will do the test and return with the result.

thanks!

diegoabadan commented 3 years ago

Running the database on the same server as the application made the time much shorter. It now typically takes a little over 4s to load the first resource, which can range from 3s to just over 6s.

A typical journal on this same server, using an external database, takes less than 2s.

External test images to help:

Proceedings

  1. Test 1:

image

  1. Test 2: image

Typical Journal in some server and other DB: image

Servers details: 8GB 4 colors SSD (both external database and application server)

diegoabadan commented 3 years ago

For this usage scenario for Proceedings, I was able to get around the problem by enabling the option web_cache = on after https://github.com/pkp/pkp-lib/issues/6512 correction

I suspect that with the growth of rolling pass publications, journals with a more intense publication flow may be affected.

diegoabadan commented 3 years ago

I suspect that with the growth of rolling pass publications, journals with a more intense publication flow may be affected.

Hi! We have an example of a journal with a database on the same server and it takes more than 5s to load the TOC:

http://www.seer.unirio.br/segurofundamental/issue/view/417 http://www.seer.unirio.br/segurofundamental/issue/view/380

It's not a time that makes use unfeasible, but I consider it a negative use experience.

Note: This time is for repeated accesses, where there is already some cache optimization.

OJS 3.2.1-4

diegoabadan commented 3 years ago

We received another complaint of TOC loading slowness: https://portalperiodicos.unoesc.edu.br/siepe/

Database in same server.

As this is an OJS with multiple journals, we cannot enable caching just for this publication.

asmecher commented 1 year ago

@diegoabadan, I think this will be at least partially resolved with https://github.com/pkp/pkp-lib/issues/7304, which should be released in 3.3.0-15. There is a linked commit there if you're able to try applying it to a 3.3.0-x release with your test data to verify.

The upgrade to 3.4.0 (when released) may also surface some underlying issues that could cause extremely low performance; for example, it adds formal foreign key constraints (https://github.com/pkp/pkp-lib/issues/6093) which will break if your database happens to accidentally mix MyISAM and InnoDB storage engines. Running a test upgrade to 3.4.0rc3 might be a good idea.

diegoabadan commented 1 year ago

Hello Alec.

I tested it with the version currently in use and with one generated from Git (future 3.3.0-15) and I didn't notice a relevant difference in times: in both between 8s and 10s to load the current TOC.

I cleared the caches and tested a few times with each version.

I'm still going to experiment with OJS 3.4-rc3 and return here. Thanks for the warning, we always use InnoDB.

asmecher commented 1 year ago

@diegoabadan, can you share a database dump (anonymized as necessary) with me privately?

diegoabadan commented 1 year ago

I'll check how to anonymize and send it to you. Thanks!

asmecher commented 1 year ago

I have a few SQL statements lying around that do at least some of the work:

UPDATE users SET email=CONCAT(username,'@mailinator.com'), password=SHA1(CONCAT(username,username));
UPDATE authors SET email=CONCAT(author_id,'@mailinator.com');
UPDATE plugin_settings SET setting_value='' WHERE setting_name LIKE '%ssword%';
UPDATE plugin_settings SET setting_value='' WHERE setting_name LIKE '%apiKey%';
UPDATE plugin_settings SET setting_value = 0 WHERE setting_name = 'enabled' AND plugin_name = 'acronplugin';

This is just the basics to avoid engaging with APIs or emailing people; you might want to go further.

diegoabadan commented 1 year ago

Dump sent (e-mail).

diegoabadan commented 1 year ago

Related post on the PKP forum: https://forum.pkp.sfu.ca/t/suggestion-to-improve-performance/78230

diegoabadan commented 11 months ago

I tested it on a TOC with around 548 published items. Approximate loading times:

On the current server with OJS 3.3.x: 9s On new server with 4 dedicated CPUs, NVMe disk and OJS 3.3.0-16: 7s Upgrading to OJS 3.4: 5s

The new server is running Ubuntu 22.04 and PHP 8.1.2.