TLDR:
Multi-core processing was being done at a very fine-grained level: each object and each frame was assigned its own process. This turned out to be wildly inefficient, because each individual check is only a small amount of computation, so the overhead of the multi-core support dominated computation time for large queries.
Processing 10k main-belt asteroids for the WISE Cryo mission takes ~1.5 minutes on my desktop.
Processing 33k NEOs for the same mission takes 11 minutes (NEOs are slower to propagate).
Discovery:
This was discovered while setting up the final data processing of WISE, where I was computing the state of all known MPC objects for the entirety of the Cryo mission. A single object could be processed in far less than a second, but processing 2 objects took ~20 seconds, which was clearly a problem. I systematically went through the multi-core calls and found the specific call responsible for the slowdown.
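One way to reproduce this kind of overhead in isolation is to time the same cheap work dispatched one item per task versus many items per task. The sketch below is only illustrative: `tiny_check`, `check_many`, `run_per_item`, `run_batched`, and `timed` are hypothetical names, the squared integer stands in for a real visibility check, and a thread pool from the standard library is used for simplicity where the real code uses multi-core workers.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def tiny_check(x):
    """Hypothetical stand-in for a single cheap visibility check."""
    return x * x


def check_many(xs):
    """The same work, but a whole batch of items handled inside one task."""
    return [tiny_check(x) for x in xs]


def run_per_item(pool, items):
    # One task per item: the scheduling cost is paid once for every item.
    futures = [pool.submit(tiny_check, x) for x in items]
    return [f.result() for f in futures]


def run_batched(pool, items, batch_size=100):
    # One task per batch: the scheduling cost is paid once per batch_size items.
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    futures = [pool.submit(check_many, b) for b in batches]
    return [y for f in futures for y in f.result()]


def timed(fn, *args):
    """Return (result, elapsed seconds) for a single call."""
    start = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - start


items = list(range(20_000))
with ThreadPoolExecutor(max_workers=4) as pool:
    per_item, t_item = timed(run_per_item, pool, items)
    batched, t_batch = timed(run_batched, pool, items)

# Both strategies produce identical results; only the dispatch cost differs.
print(f"per-item: {t_item:.3f}s  batched: {t_batch:.3f}s")
```

The absolute timings depend on the machine and on the pool type, so the comparison between the two numbers is the only thing worth reading from the output.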
Solution:
The algorithm for computing visible states was changed. The high-level flow is now:
Inputs: All States, All FOVs
1. Break the FOVs into groups, where each group contains 3 days of FOVs.
2. Loop over each group, doing the following:
   a. Propagate all objects to the mean time of the group of FOVs.
   b. Pass the states, along with 100 FOVs at a time, to each multi-core worker.
   c. Let each worker chew through its 100 FOVs in one go instead of 1 at a time. This makes the threading overhead a much smaller share of the overall cost.
   d. Join the workers' results together into the batched results for the group.
3. Join the results of the batches into the final returned visible states.
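The flow above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the actual implementation: `State`, `FOV`, `propagate`, `check_batch`, and `visible_states` are hypothetical names, the propagation is a toy 1-D linear model, and the worker is assumed to do a short propagation from the group mean time to each FOV's own time (the description does not say how that step is handled). A thread pool is used to keep the example simple; a process pool could be passed in instead for real multi-core work.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from itertools import groupby

GROUP_DAYS = 3.0   # each group spans 3 days of FOVs
BATCH_SIZE = 100   # FOVs handed to a worker per task


@dataclass(frozen=True)
class State:           # hypothetical stand-in for an object state
    name: str
    epoch: float       # days
    position: float    # toy 1-D position
    velocity: float    # toy 1-D velocity per day


@dataclass(frozen=True)
class FOV:             # hypothetical stand-in for a field of view
    time: float        # days
    center: float
    half_width: float


def propagate(state, time):
    """Toy linear propagation, standing in for a real orbit propagator."""
    dt = time - state.epoch
    return State(state.name, time, state.position + state.velocity * dt, state.velocity)


def check_batch(fovs, states):
    """Worker task: test every state against a whole batch of FOVs."""
    hits = []
    for fov in fovs:
        for state in states:
            # Assumption: a short hop from the group mean time to the FOV time.
            local = propagate(state, fov.time)
            if abs(local.position - fov.center) <= fov.half_width:
                hits.append((fov.time, local.name))
    return hits


def visible_states(states, fovs, executor):
    fovs = sorted(fovs, key=lambda f: f.time)
    if not fovs:
        return []
    t0 = fovs[0].time
    visible = []
    # 1. Break the FOVs into groups of GROUP_DAYS each.
    for _, group in groupby(fovs, key=lambda f: int((f.time - t0) // GROUP_DAYS)):
        group = list(group)
        # 2a. Propagate all objects once to the mean time of the group.
        mean_time = sum(f.time for f in group) / len(group)
        group_states = [propagate(s, mean_time) for s in states]
        # 2b/2c. Hand each task BATCH_SIZE FOVs plus the propagated states.
        batches = [group[i:i + BATCH_SIZE] for i in range(0, len(group), BATCH_SIZE)]
        futures = [executor.submit(check_batch, b, group_states) for b in batches]
        # 2d. Join the batch results for this group.
        for future in futures:
            visible.extend(future.result())
    # 3. All groups joined into the final result.
    return visible


# Usage with made-up data: object "A" tracks the FOV centers, "B" never enters them.
demo_states = [State("A", 0.0, 0.0, 1.0), State("B", 0.0, 50.0, 0.0)]
demo_fovs = [FOV(time=i * 0.01, center=i * 0.01, half_width=0.5) for i in range(1000)]
with ThreadPoolExecutor(max_workers=4) as pool:
    demo_hits = visible_states(demo_states, demo_fovs, pool)
print(len(demo_hits), "visible (time, name) pairs")
```

Grouping first means the expensive long-range propagation happens once per group rather than once per FOV, and batching means the per-task dispatch cost is paid once per 100 FOVs rather than once per check.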