observe/observeChanges without initial set of documents

MeteorCommunity / discussions

Track technical discussions in the Meteor community

observe/observeChanges without initial set of documents #30

Open mitar opened 9 years ago

mitar commented 9 years ago

Currently observe/observeChanges always goes through all existing documents first, before it starts observing changes. This is problematic on huge collections. Imagine that you do .find({}).observe on a collection with millions of documents. Sometimes you do not need all those documents; you just want to process changes from that point on.

This could also help improve counting performance: instead of going through all documents initially, one could first do a simple .count() and then keep the count up to date by observing changes and adjusting it.
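A minimal sketch of that counting idea, assuming a changes-only feed existed. The `subscribeToChanges` parameter is hypothetical (it stands in for the requested behavior and is not a Meteor API), and `trackCount` and `makeFakeFeed` are illustrative names used only to make the sketch runnable:

```javascript
// Sketch only: `initialCount` would come from a one-off cursor.count(), and
// `subscribeToChanges` is a hypothetical stand-in for an observeChanges
// variant that skips the initial documents.
function trackCount(initialCount, subscribeToChanges, onCount) {
  let count = initialCount;
  onCount(count);
  // Only changes from this point on adjust the count.
  const stop = subscribeToChanges({
    added() { count += 1; onCount(count); },
    removed() { count -= 1; onCount(count); },
  });
  return { current: () => count, stop };
}

// Tiny in-memory feed, used purely to exercise the sketch.
function makeFakeFeed() {
  const listeners = new Set();
  return {
    subscribe(callbacks) {
      listeners.add(callbacks);
      return () => listeners.delete(callbacks);
    },
    emit(type) {
      for (const listener of listeners) listener[type]();
    },
  };
}
```

Any write that lands between the count and the start of the subscription would be missed, so a count kept this way may drift slightly.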

It would also be useful for PeerDB, where we observe the database for changes to maintain reactive relations between documents.

There has been quite a bit of discussion about this in various corners of Meteorlandia, but I am not able to find it at the moment. One thread was about efficient counting of documents.

dandv commented 9 years ago

The counts-by-room example skips publishing updates for the initial document set, but it still receives callbacks for all the initial documents; it just skips sending updates.
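For reference, the pattern in that example looks roughly like the sketch below. It is written against any object with an `observeChanges` method so that it can run outside Meteor; `publishCount`, `onChange`, and the fake cursor are illustrative names, not Meteor APIs.

```javascript
// Sketch of the "initializing" flag pattern. `cursor` is anything with an
// observeChanges({added, removed}) method that, like Meteor's server-side
// cursors, calls `added` for every existing document before returning.
function publishCount(cursor, onChange) {
  let count = 0;
  let initializing = true;
  const handle = cursor.observeChanges({
    added() {
      count += 1;
      // The callback still runs for each initial document; only the
      // outgoing update is suppressed.
      if (!initializing) onChange(count);
    },
    removed() {
      count -= 1;
      onChange(count);
    },
  });
  initializing = false;
  onChange(count); // send the initial total once
  return handle;
}

// Fake cursor (illustrative only): replays its initial documents
// synchronously, then lets the caller push further inserts.
function makeFakeCursor(initialDocs) {
  let callbacks = null;
  let calls = 0;
  return {
    observeChanges(cbs) {
      callbacks = cbs;
      initialDocs.forEach((doc) => { calls += 1; cbs.added(doc._id, doc); });
      return { stop() { callbacks = null; } };
    },
    insert(doc) {
      if (callbacks) { calls += 1; callbacks.added(doc._id, doc); }
    },
    addedCalls: () => calls,
  };
}
```

With three initial documents, `added` fires three times but only one update (the total) is sent, which is why the cost of the initial pass remains.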

tmeasday commented 9 years ago

I'm not sure this really works, @mitar. The oplog driver has to fetch those documents anyway to deal with timing issues on the oplog (I believe). See my comments in #31.

mitar commented 9 years ago

> The counts-by-room example skips publishing updates for the initial document set, but it still receives callbacks for all the initial documents; it just skips sending updates.

Yes, but that already takes a very long time for large queries.

> The oplog driver has to fetch those documents anyway to deal with timing issues on the oplog (I believe).

For counting, that may be true. (Though it is questionable whether being off by a few documents matters when the count is in the 100,000s.) But for other uses you might not care about the previous state, only about changes (so that you do not have to parse the oplog yourself).

BTW, if the count is in the 100,000s, that probably also means that there are many changes to this collection, so sending a count update for each of them is probably not the best approach.
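One way to avoid an update per change is to accumulate deltas and send the latest count at most once per interval. The following is only a sketch of that idea; `makeBatchedCounter` and `send` are hypothetical names, not part of Meteor.

```javascript
// Sketch: accumulate count deltas and emit at most one update per flush.
function makeBatchedCounter(initialCount, send) {
  let count = initialCount;
  let dirty = false;
  return {
    // Call from added/removed callbacks with +1 / -1.
    add(delta) {
      count += delta;
      dirty = true;
    },
    // Call from a timer, e.g. setInterval(() => counter.flush(), 1000).
    // Returns true if an update was sent.
    flush() {
      if (!dirty) return false;
      dirty = false;
      send(count);
      return true;
    },
  };
}
```

The flush interval trades freshness for fewer messages; the right value depends on how quickly clients need to see the count move.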

moonrockfamily commented 6 years ago

@mitar, did you come up with a workaround for the startup performance issue? Our collection has grown and our application is taking nearly 3 hours to start up while it finishes the initial observe!

mitar commented 6 years ago

Sadly not.

mitar commented 6 years ago

(Feel free to upvote this issue.)

moonrockfamily commented 6 years ago

Do you know of any forks/hacks of Meteor's Mongo driver that we can swap in?

raix commented 5 years ago

@mitar Crazy that this issue turns 4 years old in 5 days... As I understand it, you basically don't want to load all the documents, just hook into the oplog / query watcher, right?

moonrockfamily commented 5 years ago

PeerDB references

Here are the current observing use cases causing initialization latency:

https://github.com/peerlibrary/meteor-peerdb/blob/master/lib.coffee#L763

https://github.com/peerlibrary/meteor-peerdb/blob/master/lib.coffee#L781

https://github.com/peerlibrary/meteor-peerdb/blob/master/lib.coffee#L891

https://github.com/peerlibrary/meteor-peerdb/blob/master/lib.coffee#L987

workaround

Hack the collection find to include a query/filter that only matches new/updated documents, e.g. updatedOn: { $gte: timeInRecentPast }
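A minimal sketch of building such a selector, assuming the application sets an `updatedOn` Date on every insert and update. `recentOnlySelector` and `windowMs` are illustrative names, and `Documents` in the comment is a placeholder collection.

```javascript
// Sketch: merge a base selector with a lower bound on `updatedOn`, so the
// initial observe only has to walk recently touched documents.
// Assumes `updatedOn` (a Date) is maintained by the application.
function recentOnlySelector(baseSelector, windowMs, now = new Date()) {
  const timeInRecentPast = new Date(now.getTime() - windowMs);
  return { ...baseSelector, updatedOn: { $gte: timeInRecentPast } };
}

// Intended use inside a Meteor app (not executed here):
//   Documents.find(recentOnlySelector({}, 60 * 1000)).observeChanges({ ... });
```

Two caveats with this approach: documents older than the cutoff never enter the result set, so their removal is not reported, and because the cutoff is fixed when the observe starts, the observed set keeps growing over time.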

mitar commented 5 years ago

@raix Yes. I only want/need the reactivity part/API of the oplog, not the initial state of the documents.