pulsecron / pulse

The modern MongoDB-powered job scheduler library for Node.js
https://pulsecron.com
MIT License

Enhance Job Query Performance by Adding Indexed Composite Field for Efficient Lookups #45

Open fermentfan opened 1 month ago

fermentfan commented 1 month ago

I think it's crucial to have a way to query jobs in a more performant way. Here is an example scenario to illustrate the problem:

Example Situation:

A customer books an appointment (Booking entity) for a date 4 weeks in the future. I want to notify the customer 1 week before the appointment starts. To achieve this, I create a job to send the notification at the appropriate time.

If the organizer needs to cancel the appointment due to personal reasons, I would issue a refund and take necessary actions. Additionally, I need to cancel the previously created notification job.

Currently, to find that job I would have to query the metadata field in the jobs collection in MongoDB. That field is not indexed, so MongoDB performs a full collection scan, which becomes slow and costly on a large collection. Adding an index on the whole metadata field, on the other hand, could be too expensive in terms of performance and storage.

Proposed Solution:

In my experience with NoSQL databases, a common approach is to create a single field consisting of attributes that are commonly queried. For instance, we can concatenate the customer ID and booking ID into a single field:

facebook|1234567890;a1a2b8e2-11b1-48d0-adb7-d4647a3e424d

This composite field should be indexed to allow fast querying. Combined with the job name, this would enable very specific and efficient lookups with a single index.
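To make the idea concrete, here is a minimal sketch of how such a key and the matching query filter could be built. Everything in it is an assumption for illustration: the field name `lookupKey`, the helper names, the job name `send-booking-reminder` and the `;` separator are hypothetical and not part of the current Pulse API.

```typescript
// Hypothetical sketch: none of these names exist in Pulse today.
const SEPARATOR = ";";

// Joins the commonly queried attributes into one string that can be indexed.
// Rejects empty parts and parts containing the separator, so that two
// different attribute combinations can never produce the same key.
function buildLookupKey(...parts: string[]): string {
  for (const part of parts) {
    if (part.length === 0 || part.includes(SEPARATOR)) {
      throw new Error(`Invalid key part: "${part}"`);
    }
  }
  return parts.join(SEPARATOR);
}

// Builds a MongoDB filter document that matches on job name plus the
// composite key. `lookupKey` is an assumed field on the job document.
function buildJobFilter(jobName: string, customerId: string, bookingId: string) {
  return { name: jobName, lookupKey: buildLookupKey(customerId, bookingId) };
}

// Index key specification that would cover the filter above.
const lookupIndexSpec = { name: 1, lookupKey: 1 };

const filter = buildJobFilter(
  "send-booking-reminder",
  "facebook|1234567890",
  "a1a2b8e2-11b1-48d0-adb7-d4647a3e424d"
);
console.log(filter, lookupIndexSpec);
```

With a filter like this, cancelling the notification job for a cancelled booking would be a single indexed lookup instead of a scan over metadata.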

Benefits:

Implementing this solution would greatly enhance the efficiency and scalability of Pulse.

Note: one could of course use compound indexes over multiple fields instead of this composite field. However, I think the Pulse API might become too complicated if it had to expose the whole field and indexing configuration through the configuration properties of this dependency.
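For comparison, a rough sketch of what the compound-index alternative would need: an index key specification built from a user-supplied list of metadata fields. The helper name and the field names `customerId` and `bookingId` are assumptions for illustration, as is the collection name in the comment.

```typescript
// Hypothetical sketch of the compound-index alternative; not part of Pulse.
type IndexSpec = Record<string, 1 | -1>;

// Builds an index key specification on the job name followed by the given
// metadata sub-fields, addressed with MongoDB dot notation.
function compoundIndexSpec(metadataFields: string[]): IndexSpec {
  const spec: IndexSpec = { name: 1 };
  for (const field of metadataFields) {
    spec[`metadata.${field}`] = 1;
  }
  return spec;
}

const spec = compoundIndexSpec(["customerId", "bookingId"]);
console.log(spec);

// With the official MongoDB Node.js driver this could be applied roughly as:
//   await db.collection("pulseJobs").createIndex(spec);
// ("pulseJobs" is an assumed collection name.)
```

The list of fields is exactly the extra configuration surface mentioned above: every application would have to tell Pulse which metadata fields to index.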

code-xhyun commented 1 month ago

@fermentfan In the current structure, the best option is to change the disableAutoIndex option to false, which would add the metadata information to the index. While your suggestions seem quite good, it appears difficult to implement such changes immediately based on our current standards.