cpan-testers / cpantesters-project

A meta-project for tracking CPAN Testers project goals

CPANTesters Backend processing is slow #14

Closed: preaction closed 6 years ago

preaction commented 7 years ago

One user is reporting a delay of more than 6 hours between sending a report to the Metabase and that report becoming available on the CPANTesters website. I need to trace the code involved and figure out where all the bottlenecks are: Is it the polling frequency? Is it the server being overloaded? Is it the file generator processes not running often enough?

barbie commented 7 years ago

This is likely to be the Builder, not the feed from the Metabase. Sometimes the order and frequency of building need tweaking. See $quickhit in Process(), within https://metacpan.org/source/BARBIE/CPAN-Testers-WWW-Reports-3.57/lib/Labyrinth/Plugin/CPAN/Builder.pm

The general settings are in the config/settings.ini file, and there are two threads running (authors and distros). I did want to try figuring out a MessageQueue system that could run with multiple listeners, but sadly never got round to it.

preaction commented 7 years ago

For the message queue thing, I've got an interesting solution for that: Mercury. It's a message broker that uses websockets, is pure-Perl, easy to install, and lightweight. If the Metabase wrote updates to Mercury, and then the CPANTesters processor workers subscribed to Mercury, we could make that work. I was already kind of thinking about using Mercury to set up a Metabase feed, or even the actual logging (#13).

The only thing Mercury will need before it can be used is an authentication scheme to ensure that only authorized users are allowed to use it. Messages should only be sent by us, but in the future we may want messages to be received by anyone (and could then relax that part of the broker).
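
To make the shape of that idea concrete, here is a minimal in-process sketch of a broker where sending requires a shared secret but receiving is open to anyone. It is written in Python purely for illustration and is not Mercury's API: the `Broker` class, the `publish_token` parameter, the `reports` topic, and the message fields are all hypothetical names.

```python
import hmac
from collections import defaultdict
from typing import Callable, Dict, List


class Broker:
    """Toy in-process stand-in for a pub/sub broker (hypothetical, not Mercury)."""

    def __init__(self, publish_token: str) -> None:
        self._publish_token = publish_token.encode()
        self._subscribers: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callable[[dict], None]) -> None:
        # Receiving is open to anyone in this sketch.
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: dict, token: str) -> int:
        # Sending requires the shared secret; compare in constant time.
        if not hmac.compare_digest(token.encode(), self._publish_token):
            raise PermissionError("publisher is not authorized")
        for callback in self._subscribers[topic]:
            callback(message)
        # Report how many subscribers received the message.
        return len(self._subscribers[topic])


broker = Broker(publish_token="s3cret")

# Two independent listeners, e.g. one per builder thread (authors, distros).
authors_seen: List[dict] = []
distros_seen: List[dict] = []
broker.subscribe("reports", authors_seen.append)
broker.subscribe("reports", distros_seen.append)

delivered = broker.publish("reports", {"guid": "example-guid"}, token="s3cret")
```

Every subscriber receives every message, which is what would let multiple listeners react to the same report without polling.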

preaction commented 6 years ago

The new Minion-based backend has reduced processing times to mere minutes. It's not 100% perfect: the Metabase shim API directly inserts a Minion job. It would be better if it were less tightly coupled: each test report write should pass a message on the message queue, and a backend process should read that queue for processing. But it works fast, so this ticket can be closed.
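
A rough sketch of the looser coupling described above, in Python with only the standard library. It does not use Minion or the real shim API. `write_report`, `backend_worker`, and the message layout are hypothetical stand-ins, and an in-memory `queue.Queue` takes the place of a real message queue.

```python
import queue
import threading
from typing import List

report_queue: queue.Queue = queue.Queue()
processed: List[int] = []
_STOP = object()  # sentinel that tells the worker to shut down


def write_report(report: dict) -> None:
    """Stand-in for the API write path: store the report, then announce it."""
    # The storage step is omitted here. The writer only announces the report;
    # it knows nothing about how or when the report gets processed.
    report_queue.put({"report_id": report["id"]})


def backend_worker() -> None:
    """Stand-in for the backend process: read announcements and process them."""
    while True:
        message = report_queue.get()
        if message is _STOP:
            break
        processed.append(message["report_id"])


worker = threading.Thread(target=backend_worker)
worker.start()

for report_id in (1, 2, 3):
    write_report({"id": report_id})

report_queue.put(_STOP)
worker.join()
```

The point of the design is that the write path and the processing path share only the queue, so either side could be replaced or scaled without touching the other.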