matomo-org / plugin-QueuedTracking

Scale your large traffic Matomo service by queuing tracking requests (in Redis or MySQL) for better performance.
https://matomo.org
GNU General Public License v3.0

QueuedTracking setup on multiple servers (new guide) #134

Open mattab opened 4 years ago

mattab commented 4 years ago

Here are some notes I wrote earlier; I thought it might be useful to put them in the FAQ.

How do I set up QueuedTracking on multiple tracking servers?

Say you have 4 frontend tracking servers and 8 queues enabled in the QueuedTracking settings.

Then, on each of your 4 frontend servers, you need to run one worker per assigned queue:

./console queuedtracking:process --queue-id=X

Where X is the queue ID. Each server handles 2 queues, so the 4 servers handle all 8 queues.

Queue IDs start at 0.
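
For example (the install path is hypothetical), server 1 could run its two workers from cron:

* * * * * cd /var/www/matomo && ./console queuedtracking:process --queue-id=0 >/dev/null 2>&1
* * * * * cd /var/www/matomo && ./console queuedtracking:process --queue-id=1 >/dev/null 2>&1

Server 2 would use queue IDs 2 and 3, and so on.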

Notes:

okossuth commented 3 years ago

Hello, can multiple workers process the same queue ID? We have a situation where we originally had 4 workers processing 4 queues, but due to slowness in our setup those workers were not fast enough to process the queues, and now we have a ton of requests pending to be processed. Can we use multiple workers to process these specific 4 queues somehow? Thanks

danielsss commented 3 years ago


+1

okossuth commented 3 years ago

Forgot to mention we are using matomo 3.14.1

tsteur commented 3 years ago

Hi @okossuth @danielsss. Multiple workers work on the same queue automatically if you don't set the queue-id option. However, they don't work on it at the very same time in parallel; they work on it one after another. It's not possible for multiple workers to work on the very same queue in parallel (only one after another), as otherwise the tracked data could end up wrong and random visits with e.g. 0 actions could be created.
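
In practice (a minimal sketch): start several workers without --queue-id, and the plugin ensures no two of them work on the same queue at the same moment:

./console queuedtracking:process &
./console queuedtracking:process &
./console queuedtracking:process &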

uglyrobot commented 3 years ago

Just a suggestion: if we used LPOP, or better BLPOP, that would eliminate potential race conditions, allow use of only one shared queue, and support unlimited workers processing the same queue with no need for complicated locking. It would also scale to any level. Our workers stopped for a day, and now we have a 60 GB queue that we are trying to catch up with, but it's taking forever, as only one worker can process each queue.

The main downside is that if the processing of the popped data fails, there are no retries. However, I don't think that's a big deal, and even if it is, we can work around it by adding the data back to the beginning of the list, or into a failed queue.
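
For illustration, the suggested pattern with redis-cli (the list name trackingQueueV1 is hypothetical):

# BLPOP atomically pops the oldest item, blocking up to 5 seconds when the
# list is empty, so any number of consumers can share one list without locks.
# redis-cli prints two lines: the key name and the popped value.
REPLY=$(redis-cli BLPOP trackingQueueV1 5)
VALUE=$(printf '%s\n' "$REPLY" | tail -n 1)
# if processing fails, either re-queue the item at the head of the list:
redis-cli LPUSH trackingQueueV1 "$VALUE"
# ...or park it in a separate failed queue for later inspection:
redis-cli RPUSH trackingQueueV1:failed "$VALUE"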

tsteur commented 3 years ago

Thanks @uglyrobot. The problem is less around Redis and more about Matomo and how it tracks data. There's a related issue in core, e.g. https://github.com/matomo-org/matomo/issues/6415. Basically, if two workers were to work on the same queue and one worker processed the second tracking request of a visit slightly faster than the other worker processed the first one, Matomo could store wrong data in its database and sometimes even create multiple visits.

StevieKay90 commented 3 years ago

I seem to be having an issue with the following command, which is stopping me from executing this correctly:

./console queuedtracking:process --queue-id=X

When running ./console queuedtracking:process --queue-id=0 specifically for queue-id=0, it doesn't work; I get this error:

ERROR [2020-07-06 09:10:58] 4700 Uncaught exception: C:\inetpub\wwwroot\vendor\symfony\console\Symfony\Component\Console\Input\ArgvInput.php(242): The "--queue-id" option requires a value.

It works fine for ./console queuedtracking:process --queue-id=1.

Is this a known issue, or am I doing something incorrectly?

tsteur commented 3 years ago

@StevieKay90 could you send us the output of your system check (see https://matomo.org/faq/troubleshooting/how-do-i-find-and-copy-the-system-check-in-matomo-on-premise/)? The output should be anonymised automatically.

StevieKay90 commented 3 years ago

Thanks for the quick response!

It's here:

matomo_system_check.txt

StevieKay90 commented 3 years ago

Hi Thomas, I've just found out that if you run ./console queuedtracking:process --queue-id=00 it works. Good help from the community!

One thing which is vexing me, though, is why queue 0 seems to be the most full; it's not evenly distributing the load. The other queues have just a handful of requests in them, but queue 0 has over 200. Is there a way to stop this?

[screenshot: per-queue request counts, with queue 0 far larger than the others]

tsteur commented 3 years ago

Thanks for this. I still can't reproduce it just yet. @sgiehl any chance you have a Windows machine running Matomo and could try to reproduce this? I'm wondering if it's maybe Windows-related.

sgiehl commented 3 years ago

@tsteur I don't have a Matomo running directly on Windows, but I could check whether the Windows VM where I once set this up is still running. I guess it's already outdated, though, and I would need to set it up again. Let me know if it's important enough to spend time on it.

StevieKay90 commented 3 years ago

Hi all @tsteur @sgiehl, thanks for taking a look into this.

As you can see, it's quickly becoming a big problem for me; I'm going to have to stop queued tracking.

[screenshot: queue backlog continuing to grow]

This has happened since the upgrade; previously I hadn't run into this issue. Any interim advice would be great.

tsteur commented 3 years ago

@StevieKay90 could you remove the queue-id parameter? Then the requests in the first queue should get processed

StevieKay90 commented 3 years ago

@tsteur I have done; I'm not using the command line at all now, I'm using the "Process during tracking request" option. It just seems to heave the vast majority of requests into one queue, and as it's one worker at a time, it can't handle all the requests in the queue.

tsteur commented 3 years ago

@StevieKay90 it will likely catch up and process these requests. If, otherwise, it overall always pushes more requests into the first queue, that might be because a lot of the requests come from the same IP address, or a lot of them use the same visitorId or userId (if the userId feature is used). It's possible that the visits in queue 0 simply weren't processed in the past because of the error you were getting.
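
To illustrate the idea (a hypothetical sketch in bash, not the plugin's actual code): if the queue is derived from a hash of the visitorId, every request of a given visitor lands in the same queue, so one very active visitor or IP inflates a single queue:

NUM_QUEUES=8
VISITOR_ID=1a2b3c4d5e6f7081   # same visitorId => same hash => same queue
QUEUE_ID=$(( 0x${VISITOR_ID:0:8} % NUM_QUEUES ))
echo "visitorId $VISITOR_ID -> queue $QUEUE_ID"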

tsteur commented 3 years ago

btw, you could maybe also try --queue-id="0"; not sure if that makes a difference on Windows.

StevieKay90 commented 3 years ago

@tsteur the command --queue-id=00 seems to work on Windows to process queue 0. However, the problem I'm now suffering from goes way deeper (I thought this was the issue, like you, but now I don't think it is). Previously, not stating an ID did actually process queue 0; it's just that a) queue 0 seemed to be much bigger, and b) the write speed got really slow as the Redis DB grew, taking something like 2 minutes for 500 records. I've got pretty high-spec servers, so that was surprising. It could never clear it all, and it reached massive levels until Redis choked. So now I'm wondering whether it was an error in the upgrade or a software config thing.

StevieKay90 commented 3 years ago

@tsteur OK, I've done some research and have some very interesting findings!

Forcing queue ID: 0 : This worker finished queue processing with 3.2req/s (150 requests in 46.91 seconds)
Forcing queue ID: 1 : This worker finished queue processing with 39.01req/s (125 requests in 3.20 seconds)
Forcing queue ID: 2 : This worker finished queue processing with 42.12req/s (150 requests in 3.56 seconds)
Forcing queue ID: 3 : This worker finished queue processing with 38.92req/s (125 requests in 3.21 seconds)
Forcing queue ID: 4 : This worker finished queue processing with 44.05req/s (100 requests in 2.27 seconds)
Forcing queue ID: 5 : This worker finished queue processing with 39.85req/s (125 requests in 3.14 seconds)

So it's not that more requests are being routed to queue ID 0; it's just that the processing time of this specific queue is incredibly slow in comparison to the others!

UPDATE

I now opted for 16 workers, as I figured that the relative speed of the other 15 would counterbalance the slow-moving queue 0.

[screenshot: 16 workers running]

However, now queue 0 is performing a lot better (roughly 12-20 req/s), but queue number 6 is now the naughty boy! There was nothing especially wrong in the verbose process output when I processed this queue manually, only the fact that it was slow and I could read most of the lines as they went by, when normally it's just a black and white fuzzy blur.

tsteur commented 3 years ago

@StevieKay90 any chance you're using our log analytics, for example, to track / import data? That would explain why more requests go into the first queue and why it's slower, since every request might consist of multiple tracking requests. Or, in case you do custom tracking with bulk tracking requests, that would explain it too.

That another queue might now have more entries would likely be expected if you're not using the regular JS tracker. It would be great to know how you track the data, @StevieKay90.

StevieKay90 commented 3 years ago

Thanks for the response, Thomas. All data is from the regular JS tracker.

It looks like I'm going to have to return to Matomo 3 to check whether it was the upgrade that changed the queued tracking process.

Currently, with QT switched on, I eventually get a pool of data in a queue which can't be cleared fast enough, and without QT I get a lot of strain on the DB server.


tsteur commented 3 years ago

Let us know how you go with the downgrade to Matomo 3. Generally, there wasn't really any change in queued tracking, so I don't think it would make a difference. It would be interesting to see, though.

StevieKay90 commented 3 years ago

@tsteur out of interest, is QueuedTracking compatible with PHP 8?

tsteur commented 3 years ago

AFAIK it should be @StevieKay90

bitactive commented 1 year ago

Hi,

we are using QueuedTracking on 3 frontend servers, each with 24 cores, and a backend DB + Redis server with 128 cores and 1 TB RAM. We are tracking a single website with billions of monthly pageviews. The DB has little workload; Redis uses ~24% of a CPU core.

With 16 queues, 10 requests per batch, and 6 queues processed on the 1st frontend and 5 queues on each of the second and third frontends, each queue processor is hitting ~80% CPU, but the frontend servers still have spare CPU power. Is it possible to increase the number of queues beyond 16 to get even more performance? We have already written start scripts for the queue processors so they immediately restart after reaching NumberOfMaxBatchesToProcess and do not wait for cron for the remaining seconds until the full minute (see the sketch after this comment).

Do you have any other advice for increasing QueuedTracking capacity here?
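
For reference, a minimal sketch of the kind of restart wrapper described above (the path is hypothetical; run one instance per queue ID, e.g. under systemd or supervisor):

#!/bin/bash
QUEUE_ID="$1"
while true; do
  /var/www/matomo/console queuedtracking:process --queue-id="$QUEUE_ID"
  sleep 1   # brief pause so a crashing worker cannot busy-loop at full CPU
done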

snake14 commented 1 year ago

Hi @bitactive. I'm sorry you're experiencing issues. Sadly, 16 is currently the maximum number of queues supported. You could try adjusting the number of requests processed in each batch; I believe the default is 25. Any other recommendations, @AltamashShaikh?

AltamashShaikh commented 1 year ago

@bitactive We would recommend increasing the number of requests here.
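
The batch size can be raised in the QueuedTracking settings UI; depending on the plugin version, the worker may also accept a per-run override on the command line (this flag is an assumption; verify it with ./console queuedtracking:process --help before relying on it):

./console queuedtracking:process --force-num-requests-process-at-once=50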

bitactive commented 9 months ago

@snake14 @AltamashShaikh I increased the number of requests per batch from 10 to 25. Now each of the 16 workers is at ~80% CPU, and total throughput (processed requests per second) increased by ~15%. We are still not able to process the queue in real time during peak hours with 16 workers, each at 80% CPU on 3.8 GHz cores.

What are further possible steps to increase efficiency, e.g. by an additional 100%? We track one big website and have nearly unlimited resources for this (machines / CPU cores / memory).

AltamashShaikh commented 9 months ago

@bitactive What if you change the number of requests to 50?

bitactive commented 8 months ago

@snake14 @AltamashShaikh Changing requests per batch to 50 gives another 10-15% throughput increase. I will try 100 as traffic increases.

In the meantime, I have another question about this configuration.

If I wanted to add a second big project to this Matomo instance, would it be possible to configure it so that, for example, Matomo project #1 uses Redis queue 0 and Matomo project #2 uses Redis queue 1, and then run 16 workers for queue 0 and 16 workers for queue 1?

As far as I know, different Matomo projects can be processed independently, so it should be possible to direct requests from one project to one Redis queue and from the second project to another Redis queue, and then process them independently with another 16 workers?

snake14 commented 8 months ago

Hi @bitactive. I'm glad that helped. As far as I can tell, each Matomo instance would need a separate Redis database. Can you confirm, @AltamashShaikh?

AltamashShaikh commented 8 months ago

@bitactive You can specify the database if you want to use the same Redis for 2 instances.

[screenshot: the Redis database setting in the QueuedTracking system settings]
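
For illustration: Redis ships with 16 logical databases (0-15 by default), so two Matomo instances can share one Redis server by pointing at different database numbers. A quick way to inspect them from the shell (the key pattern is hypothetical):

redis-cli -n 0 KEYS 'trackingQueueV1*'   # queues of instance 1 (database 0)
redis-cli -n 1 KEYS 'trackingQueueV1*'   # queues of instance 2 (database 1)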