monti-apm / monti-apm-agent

Monitor Meteor apps with Monti APM
https://montiapm.com
MIT License
45 stars 22 forks source link

Surprisingly high time spent in `Ntp._now` #49

Open radekmie opened 2 years ago

radekmie commented 2 years ago

Hi there! In apps with a relatively high number of published documents, the time spent in Ntp._now function is surprisingly high:

Ntp._now
at /app/bundle/programs/server/packages/montiapm_agent.js:1521
Self: 39266.2ms (7.3%)                Total: 39266.2ms (23.3%)

I've checked the source and I believe the Date.now() alone is not the cheapest but typeof is just making it worse. I understand where are those mitigations coming from, but maybe they could be done once? Or even through a configuration?

And to be clear: it happens only when a huge amount of oplog changes hits the server - otherwise, it's lower (but still clearly visible):

Ntp._now
at /app/bundle/programs/server/packages/montiapm_agent.js:1521
Self: 6577.6ms (1%)                     Total: 6577.6ms (8.4%)

Let me know if you'd need more info from my side!

zodern commented 2 years ago

Thanks for reporting this @radekmie. I looked into optimizing Ntp._now, and making it simply return Date.now() doesn't make much of a difference.

Are you able to check which functions were calling Ntp._now during the time it was spending a lot of time? Maybe we can find a way to reduce how often it is called.

radekmie commented 2 years ago

It's a direct descendant of PubsubModel.trackDocumentChanges and proto._handleOplogEntrySteadyOrFetching (the rest of it is _handleOplogEntrySteadyOrFetching, so Meteor internals). And here's a screenshot: Screenshot from 2022-03-17 11-07-26

Let me know if I can provide you with more details.

radekmie commented 2 years ago

Small update: enabling cultofcoders:redis-oplog reduces the time spent in Ntp._now by ~75% (from being the single slowest function down to being tenth).

afrokick commented 2 years ago

We can simplify Ntp._now to Date.now() because there are so many places where Date.now() and new Date().getTime() are using already without any problem.

alisnic commented 1 month ago

We are facing the same problem. _now is the dominant leaf in our cpu profiles

Screenshot 2024-09-26 at 14 16 01

We have redis-oplog actually enabled cc @radekmie