nasa / trick

Trick Simulation Environment. Trick provides a common set of simulation capabilities and utilities to build simulations automatically.
Other
34 stars 19 forks source link

Trick::ScheduledJobQueue::push has O(n) complexity #1693

Closed iamthad closed 2 months ago

iamthad commented 5 months ago

Trick::ScheduledJobQueue::push scans linearly through a sorted list and copies items one by one.

This takes O(n) time and can be a significant bottleneck when there are many jobs.

The scan can be replaced with std::upper_bound and a suitable comparison function, making the complexity O(log n).

It also always allocates a new array of JobData*, so the entire array is always copied. This could be avoided by using realloc and memmove, resulting in fewer copies of larger chunks of memory.

EDIT: Trick::ScheduledJobQueue::remove has the same issue, but it's less impactful because removing jobs is less common than creating them.

EDIT 2: It actually has to be std::upper_bound instead of std::lower_bound in order to preserve insertion order.

sharmeye commented 5 months ago

Thank you for your contribution. We will take this under advisement. If there is a particular use case in which this condition causes performance issues, please let us know. We may incorporate this update in a future patch, time permitting.

iamthad commented 5 months ago

I observed this issue causing substantial performance issues a couple of years back, in a sim with a huge number of scheduled jobs (tens of thousands, if I remember correctly) due to dynamically-created sim objects. Here is a portion of a profiling result:

# Overhead  Command          Shared Object                                      Symbol
# ........  ...............  .................................................  .........................................
#
    60.17%  S_main_Linux__x  S_main_Linux__x86_64.exe                           [.] Trick::ScheduledJobQueue::push
    11.52%  S_main_Linux__x  S_main_Linux__x86_64.exe                           [.] Trick::ScheduledJobQueue::find_next_j
     3.23%  S_main_Linux__x  S_main_Linux__x86_64.exe                           [.] (proprietary function name redacted)
     2.81%  S_main_Linux__x  S_main_Linux__x86_64.exe                           [.] Trick::JobData::call
     2.22%  S_main_Linux__x  S_main_Linux__x86_64.exe                           [.] Trick::DataRecordGroup::call_function
     2.21%  S_main_Linux__x  libc-2.27.so                                       [.] __memset_avx2_unaligned_erms
     1.03%  S_main_Linux__x  S_main_Linux__x86_64.exe                           [.] Trick::Executive::loop_single_thread

When I noticed this, I introduced the first version of the patch I submitted as #1694 (using lower_bound), and it made a huge difference. We have been using that patch internally since then; the issue with preserving insertion order that I resolved by switching to upper_bound was not a problem for us.

I recently obtained permission to submit the patch here in hopes that others would benefit from it.

sharmeye commented 5 months ago

Ok, I asked for an example and you gave us one. Hard to argue with that! We'll review the PR and merge it once we're satisfied it works as intended and credit you with the change.

iamthad commented 5 months ago

OK, sounds good. I updated the source branch to tweak formatting and to update the function's comment to reflect the new behavior.