I'm sending some significant enhancements to optimize the script's performance. Here are the main changes implemented:
Streaming Query:
Instead of paginating with the skip() function, the script now streams the results and consumes rows as they are read. This avoids the overhead of issuing a separate query for every page and gives a continuous flow of data from a single query.
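As a rough illustration of the difference (not the code from this PR, which is not shown here), the sketch below compares the two read patterns in Python. `FakeCollection`, `find_page`, and `stream` are hypothetical names for an in-memory stand-in, not a real database driver API; the only thing it models is how many separate queries each approach issues.

```python
from typing import Iterator, List


class FakeCollection:
    """Hypothetical in-memory stand-in for a database collection."""

    def __init__(self, rows: List[dict]) -> None:
        self.rows = rows
        self.queries = 0  # number of separate queries issued so far

    def find_page(self, skip: int, limit: int) -> List[dict]:
        # Each call models one independent skip/limit query.
        self.queries += 1
        return self.rows[skip:skip + limit]

    def stream(self) -> Iterator[dict]:
        # One query; rows are yielded lazily as they are read.
        self.queries += 1
        yield from self.rows


def read_with_skip(coll: FakeCollection, page_size: int) -> Iterator[dict]:
    """Paginated read: keeps querying until a page comes back empty."""
    skip = 0
    while True:
        page = coll.find_page(skip, page_size)
        if not page:
            return
        yield from page
        skip += page_size


def read_with_stream(coll: FakeCollection) -> Iterator[dict]:
    """Streaming read: a single query feeds the whole iteration."""
    yield from coll.stream()
```

With 10 rows and a page size of 3, the paginated version in this sketch issues five queries (four pages plus the final empty one), while the streaming version issues one.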
Batch Insertion:
To improve data insertion performance, we're now inserting data in batches.
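A minimal Python sketch of the batching step, assuming the rows arrive as any iterable (for example a streamed query result). The helper name `in_batches` is hypothetical and is not taken from the PR.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def in_batches(items: Iterable[T], size: int) -> Iterator[List[T]]:
    """Group a (possibly lazy) iterable into lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while True:
        # islice pulls only the next `size` items, so the source stays lazy.
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch
```

Each yielded list can then be passed to a single bulk insert call instead of one insert per row.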
Furthermore, the script still uses child processes. This time, however, I've implemented a dedicated child process that handles only the data querying. Once it has accumulated a list of items, it sends the list to the other processes for parallel insertion, building on the parallelization that was already in place.
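The fan-out of batches to insert workers could look roughly like the Python sketch below. It is a simplification under stated assumptions: the reader is modelled as a plain iterable in the parent rather than the dedicated child process described above, and `fake_insert` and `fan_out` are hypothetical names, with `fake_insert` standing in for a real bulk insert.

```python
import concurrent.futures as cf
from typing import Callable, Iterable, List


def fake_insert(batch: List[dict]) -> int:
    """Hypothetical stand-in for a bulk insert; returns rows written."""
    return len(batch)


def fan_out(
    batches: Iterable[List[dict]],
    insert_batch: Callable[[List[dict]], int],
    executor: cf.Executor,
) -> int:
    """Submit every batch to the executor and total the rows written."""
    # Note: all batches are submitted up front, so there is no backpressure
    # on the reader in this sketch.
    futures = [executor.submit(insert_batch, batch) for batch in batches]
    return sum(future.result() for future in cf.as_completed(futures))
```

Passing a `cf.ProcessPoolExecutor` as `executor` runs the inserts in separate worker processes. In that case `insert_batch` has to be a module-level function so it can be pickled, and the call should sit under an `if __name__ == "__main__":` guard.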
Unfortunately, this is part of the source code shown in the video, so I can't change it: people following along refer to it when they run into problems. Thanks a lot for the contribution!!