AVSLab / basilisk

Astrodynamics simulation framework
https://hanspeterschaub.info/basilisk
ISC License

Message Recorder has processing time spikes that grow with simulation duration #788

Open sassy-asjp opened 2 months ago

sassy-asjp commented 2 months ago

Describe the bug The time Message Recorders take to execute UpdateState regularly spikes every 2^n steps, with each spike lasting a time proportional to 2^n: e.g., step 16385 may take 4 ms, step 32769 may take 8 ms, and so on. In non-real-time use, the exponentially decreasing frequency of these spikes compensates for their exponentially increasing duration, but the behavior can be a problem for real-time simulation use (e.g., hardware-in-the-loop testing).

This is because Recorder stores messages in an STL vector which, when it runs out of space, allocates a new array of double the size and copies all the existing elements over.

To reproduce Steps to reproduce the behavior:

  1. Set up a message and Recorder
  2. Run the simulation

Expected behavior The time Recorder::UpdateState takes to run should not grow with simulation duration.

Desktop (please complete the following information):

Additional context This could probably be solved by switching to an STL deque for storage. That would marginally increase average UpdateState execution time and general data access time, but likely not enough to be noticeable in practice. It would significantly slow down anyone relying on vector storage being contiguous; however, accessing recorded data from Python does NOT rely on contiguity, and in fact just copies all the data anyway.

It could also be solved by introducing a way to force the Recorder vectors to reserve capacity up front; however, this would be harder for simulations with indefinite end times, since they would have to reserve for the theoretically longest run.

I would favor the deque solution.

juan-g-bonilla commented 2 months ago

Thank you for the issue @sassy-asjp . This was originally reported in #368 , which got closed by the original author as they found a workaround. Now that you bring it to attention again, we might consider whether we want to fix this within core Basilisk.

sassy-asjp commented 1 month ago

@juan-g-bonilla From the discussion on #368, it seems they implemented my second suggested solution: a way to preallocate space in the recording vectors. I think this has to be fixed within the core of Basilisk; the only question is whether the change lands in a soft real-time fork or in the main open source project.

ephraim271 commented 1 month ago

@sassy-asjp hah we ended up with the exact same issue 😅 I have a follow-up question: even with a deque, wouldn't Basilisk eventually run out of RAM for indefinite simulations? Does regularly flushing to SSD make more sense?

sassy-asjp commented 1 month ago

@ephraim271 Yes, Basilisk would eventually run out of RAM if the simulation runs long enough with enough data being recorded.

If a simulation must run longer than available RAM allows, then a completely different data recording system that regularly flushes to disk is needed. I think that's beyond the scope of fixing up the existing data recorders for soft real-time usage, though.

Such a system probably makes the most sense to implement separately from the core of Basilisk. Flushing data to disk is a pretty common source of latency which can destroy soft real time performance, and the right design for recording large volumes of data to disk continuously is often pretty specific to the application.

juan-g-bonilla commented 1 month ago

Just brainstorming here, but a good solution would be to make the data recording (and logging) "backends" swappable/configurable. The recorders (and loggers) do two things: extract data from the simulation and push it into some container for later access. Right now we push to std::vector (and list), but we could abstract that step away so users can supply their own implementations (pushing to a deque for consistent timing, pushing to a file for memory concerns...).

sassy-asjp commented 1 month ago

@juan-g-bonilla That sounds like a cool idea, but I wonder how adding a custom backend would work, since it would have to be compiled in with the Recorder template and would need its own Python interface through SWIG.