Particular / NServiceBus.SqlServer

SQL Server Transport for NServiceBus
https://docs.particular.net/nservicebus/sqlserver/

High CPU using NSB with Postgres #1412

Open crirusu opened 2 weeks ago

crirusu commented 2 weeks ago

Describe the bug

Description

We deployed 9 .NET Core services using NSB with Postgres. The database ran on Amazon Aurora PostgreSQL Serverless v2 with 48 ACUs. According to Amazon, 1 ACU = 2 GB of RAM: https://docs.aws.amazon.com/AmazonRDS/latest/AuroraUserGuide/aurora-serverless-v2.html

Even with little data to process, CPU usage did not drop, and we were unable to keep up with processing. I am attaching a picture showing a very high number of commits and rollbacks. [image]

With the database for the queues on an MSSQL Server with 8 cores and 32 GB of RAM, we can run all services at less than 30% CPU.

Expected behavior

Actual behavior

Versions

Steps to reproduce

Try to see how many requests are actually done on the database.

Relevant log output

No response

Additional Information

Workarounds

Possible solutions

Additional information

SzymonPobiega commented 2 weeks ago

Hi

For some reason I can't open the picture in full resolution, but as far as I can tell you get 10-20 times more calls for the DELETE than rows affected. Is that correct?

Could you also describe the specifics of your deployment, i.e. how many instances there are of each logical endpoint? To keep that data confidential, please send us an email at the support address.

crirusu commented 2 weeks ago

Hi,

We have 9 services, but some have 2 NSB endpoints:

Service 1 - 1 NSB endpoint with NServiceBusMaxConcurrency = 20
Service 2 - 1 NSB endpoint with NServiceBusMaxConcurrency = 32 and 1 NSB endpoint with NServiceBusMaxConcurrency = 18
Service 3 - 1 NSB endpoint with NServiceBusMaxConcurrency = 16
Service 4 - 1 NSB endpoint with NServiceBusMaxConcurrency = 16 and 1 NSB endpoint with NServiceBusMaxConcurrency = 25
Service 5 - 1 NSB endpoint with NServiceBusMaxConcurrency = 18
Service 6 - 1 NSB endpoint with NServiceBusMaxConcurrency = 1 and 1 NSB endpoint with NServiceBusMaxConcurrency = 1
Service 7 - 1 NSB endpoint with NServiceBusMaxConcurrency = 16
Service 8 - 1 NSB endpoint with NServiceBusMaxConcurrency = 16
Service 9 - 1 NSB endpoint with NServiceBusMaxConcurrency = 20

I can send the picture again by email. The problem is that there are too many commits and rollbacks. We have now reverted to the MSSQL Server instance, and its number of transactions and overall database activity are nowhere close to what we saw in Postgres.
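
For reference, the per-endpoint limits listed above can be added up to get a rough upper bound on how many messages could be processed concurrently against the queue database. This is a minimal sketch, not a measurement: the thread does not state how many instances of each endpoint are running, so `instances_per_endpoint` is a hypothetical parameter defaulting to 1, and the total says nothing about the actual number of database connections or polling queries.

```python
# Max concurrency per NSB endpoint, copied from the list above.
# Services with two endpoints have two entries.
max_concurrency = {
    "Service 1": [20],
    "Service 2": [32, 18],
    "Service 3": [16],
    "Service 4": [16, 25],
    "Service 5": [18],
    "Service 6": [1, 1],
    "Service 7": [16],
    "Service 8": [16],
    "Service 9": [20],
}


def total_concurrency(endpoints, instances_per_endpoint=1):
    """Sum all endpoint limits, scaled by an assumed instance count.

    instances_per_endpoint is an assumption; the real deployment may
    run a different number of instances for each endpoint.
    """
    per_instance = sum(sum(limits) for limits in endpoints.values())
    return per_instance * instances_per_endpoint


endpoint_count = sum(len(limits) for limits in max_concurrency.values())
total = total_concurrency(max_concurrency)

print(f"{endpoint_count} endpoints, up to {total} concurrent messages")
```

With one instance per endpoint this gives 12 endpoints and a combined limit of 199.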

SzymonPobiega commented 1 week ago

Yes, the number of commits is off, but so is the ratio of calls to rows affected. In the ideal scenario, when queues always have some messages, the number of DELETE calls should be equal to the number of rows returned.

I am especially concerned about the Sub_ExportData endpoint. Is it one of the scaled-out endpoints? (2, 4 or 6).

Does the rows/s value of 50-80 match the expected number of messages per second these endpoints process?
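
The calls-to-rows ratio discussed above can be made concrete with a small calculation. This is only an illustrative sketch: the function name and the sample numbers are hypothetical, not taken from the attached picture. It assumes the two inputs are a call count and a rows-affected count for the receive DELETE statement over the same time window, for example the `calls` and `rows` columns that PostgreSQL's `pg_stat_statements` view exposes.

```python
def calls_per_row(calls, rows):
    """Return how many DELETE calls were issued per row affected.

    A value near 1.0 matches the ideal case of queues that always hold
    messages; larger values mean most calls found nothing to receive.
    """
    if rows == 0:
        # No rows affected: every call came back empty (or nothing ran).
        return float("inf") if calls else 0.0
    return calls / rows


# Hypothetical sample values, for illustration only.
busy_queue = calls_per_row(calls=1000, rows=1000)
mostly_idle_queue = calls_per_row(calls=1500, rows=100)

print(busy_queue, mostly_idle_queue)
```

A ratio in the 10-20 range, as read from the picture, would mean that most DELETE calls found no message to receive.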

crirusu commented 1 week ago

Hi, I don't think we process that much. This is the picture from the MSSQL Server on which we are running the same endpoints as we did in PostgreSQL. [image]