robfallows / tunguska-reactive-aggregate

Reworks jcbernack:reactive-aggregate for ES6/ES7 and Promises
MIT License
51 stars 27 forks source link

Using APM trace explorer, observeChanges on collection takes awfully long time #77

Open lc3t35 opened 10 months ago

lc3t35 commented 10 months ago

Using APM on Galaxy, I see that I have some aggregate requests that take a lot of time before responding (more than 6s) which is not good

Here is the trace explorer, we can see that it is the observeChanges on moments that is the root cause of this long time :

Started at Monday, Dec 18, 2023 15:05:12 
wait time 0ms on computation 1ms
find on moments 0ms 
find on contacts 0ms 
computation 1ms
observeChanges on moments 9372ms Using Oplog 

    coll : moments
    selector : {"isDeleted":false}
    func : observeChanges
    cursor : true
    oplog : true
    wasMultiplexerReady : false
    queueLength : 0
    elapsedPollingTime : 0
    noOfCachedDocs : 1565

observeChanges on contacts 1ms Using Oplog 
observeChanges on users 110ms Using Oplog 
computation 1ms
async 182ms
Completed at Monday, Dec 18, 2023 15:05:21 (Response Time: 9670ms)

and the request :

set some selector, some userId ...

ReactiveAggregate(
    this,
    Moments,
    [
      {
        $lookup: {
          from: 'contacts',
          pipeline: [
          ....
          ],
          as: 'contactFollowings',
        },
      },
      {
        $addFields: {
          contacts: {
            ....
        },
      },
      {
        $match: {
          $expr: {
            $and: selector,
          },
        },
      },
      {
        $sort: { created: -1 },
      },
      {
        $limit: limit,
      },
      { $project: { contacts: 0, contactFollowings: 0 } },
      {
        $lookup: {
          from: 'users',
          let: { userId: '$userId' },
          pipeline: [
           ....
          ],
          as: 'user',
        },
      },
      {
        $unwind: {
          path: '$user',
        },
      },
      {
        $facet: {
          stage1: [ ... ],
          stage2: [ ...],
        },
      },
      { $addFields: { final: { $concatArrays: ['$stage1', '$stage2'] } } },
      { $project: { stage1: 0, stage2: 0 } },
      {
        $unwind: {
          path: '$final',
        },
      },
      {
        $replaceRoot: {
          newRoot: '$final',
        },
      },
    ],
    {
      noAutomaticObserver: true,
      debounceDelay: 100,
      observers: [
        Moments.find({ isDeleted: false }),
        Contacts.find({ userId }),
        Meteor.users.find({}),
      ],
    }
  );
});

Checklist of what I already verified :

Welcome to any idea on how to fix this ! Thank you !

lc3t35 commented 8 months ago

"Right now, each publish function blocks all future publishes and methods waiting on data from Mongo (or whatever else the function blocks on). This probably slows page load in common cases."

Comment from ddp-server/livedata_server.js (1148) - meteor 2.14 (https://github.com/meteor/meteor/blob/release/METEOR%402.14/packages/ddp-server/livedata_server.js)

lynchem commented 7 months ago

Hey @lc3t35 , I'm new to this package but in testing observed one pub taking 6-7 secs whereas another takes just some ms. I tried then specifying the exact observer I wanted created and that made both take just ms. I didn't checkout the code but sort of thought it might infer the selector from the match stage of the pipeline ? But maybe I was expecting too much magic 😄

If you follow the example from the docs it might help?:

Meteor.publish("biographiesByWelshAuthors", function () {
  ReactiveAggregate(this, Authors, [{
    $lookup: {
      from: "books",
      localField: "_id",
      foreignField: "author_id",
      as: "author_books"
    }
  }], {
    noAutomaticObserver: true,
    debounceCount: 100,
    debounceDelay: 100,
    observers: [
      Authors.find({ nationality: 'welsh'}),
      Books.find({ category: 'biography' })
    ]
  });
});
robfallows commented 7 months ago

...thought it might infer the selector from the match stage of the pipeline ? But maybe I was expecting too much magic...

The code does no inspection of the pipeline, so yes, too much magic 😆

lc3t35 commented 7 months ago

@filipenevola you see I'm not alone ! @lynchem can you be more specific ? as far as I can read, the example from the docs does not give more information -> what did you do in your case to solve the issue, what differs from my observers ?

@robfallows I'm sorry but we are not magicians, we solve real issues in the code, so how can we fix or workaround the "each publish function blocks all future publishes and methods waiting on data from Mongo" ? or is there a better way to state the observers ?

lynchem commented 7 months ago

How big is your moments collection @lc3t35 ? That's a pretty broad selector you're using so I think it's normal that creating an observer for a lot of docs will take a long time. In our case we noticed it on a larger collection and specificed the observer ourselves with quite a restrictive selector.

And I think you're misinterpreting Rob's comment. I was (wrongly) expecting some magic, nothing else.

jamauro commented 7 months ago

@lc3t35 Try using $match first before $lookup to improve performance. Also I'm guessing it would help if you could narrow your observer on Meteor.users or eliminate it if possible.

lc3t35 commented 7 months ago

How big is your moments collection @lc3t35 ?

@lynchem thank you for your comment -> just around 2000 docs, so very small ;)

Try using $match first before $lookup to improve performance

@jamauro thank you for your comment but $match depends on $contacts added field, I could run a findOne before the aggregate and match with the resulting array, but the issue is not here (quite no time spent here : 20ms), the major problem is the observers -> seconds of delay ! If I remove Meteor.users, what is the benefit of Meteor if you get rid of the reactivity ?

The root cause is in ddp-server/livedata_server.js -> how can we unblock the requests here ?