nasa / nos3

NASA Operational Simulator for Small Satellites

Performance Issues and NOS3 Time Driver Slowdown #285

Closed kevincbruce closed 1 week ago

kevincbruce commented 2 months ago

While taking performance metrics, we found extreme slowdown in the time driver: between 0.66x and 0.75x speed with one satellite, ~0.33x with 4 satellites, and ~0.22x with 8 satellites. It is also not using as many resources as it could (it stays below one additional core and around 1-2 GB of additional RAM per extra craft). This likely means that something is slowing down the time driver, since it waits for all apps to post before it ticks. With 1 satellite, it runs at 0.92-1.02x speed if you attempt to run at 2x. I will gather metrics for 4 satellites to see whether increasing the speed would bring it to parity, but we should resolve this so that at 1x speed it takes whatever resources it needs to run as close to 1x as possible.

kevincbruce commented 2 months ago

Further notes:

kevincbruce commented 2 months ago

Also, when speeding up to 32x, it went to 32.05x instead, and when slowing back down, it was at 16.03x and 8.01x instead of the 16x and 8x seen the first time through.

sdunlap-afit commented 3 weeks ago

I also noticed this issue and did a bit of digging. Looking at time_driver.cpp, I believe at least part of the issue is related to the main loop in TimeDriver::run().

while (1) {
    std::this_thread::sleep_for(std::chrono::microseconds(_real_microseconds_per_tick));

    // [Do stuff]
}

This method doesn't take into account the amount of time [Do stuff] takes to execute. With a default sleep of 10ms, this could account for a significant percentage difference that only gets worse as you speed up the simulation (smaller _real_microseconds_per_tick values).

On the latest main branch with default config, I get an actual speed-up of 0.8.

attempted speed-up =  1.00
actual speed-up =  0.79
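
As a rough model of the loop above (an assumption for illustration, not a measurement of NOS3 internals): if every iteration sleeps for the full tick and then spends some extra time on per-tick work, the ratio of actual to attempted speed is tick / (tick + work). The helper names below are hypothetical.

```cpp
#include <cassert>

// Simplified model: each loop iteration sleeps for tick_us and then spends
// work_us on per-tick work, so one tick costs tick_us + work_us of real time.
// Returns the ratio of actual speed to attempted speed.
double speed_ratio(double tick_us, double work_us)
{
    assert(tick_us > 0.0 && work_us >= 0.0);
    return tick_us / (tick_us + work_us);
}

// Inverse of the model: the per-iteration overhead implied by a measured
// ratio, assuming the whole shortfall comes from unaccounted loop time.
double implied_overhead_us(double tick_us, double ratio)
{
    assert(tick_us > 0.0 && ratio > 0.0 && ratio <= 1.0);
    return tick_us * (1.0 / ratio - 1.0);
}
```

Under this model, implied_overhead_us(10000.0, 0.79) comes out to roughly 2658 us per tick for the run above. Because the overhead sits in the denominator next to the tick length, the same overhead costs proportionally more as the tick gets shorter.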

It would be better to measure how long [Do stuff] takes and sleep for _real_microseconds_per_tick - (now - last_time). This still wouldn't account for the overhead of the sleep call (or thread switching), but you could also busy-wait in a loop to avoid the sleep call altogether. Even with this method, though, my system caps out at 4x speed, so there's something else at play. I made some other modifications to test this method, but here's a snippet. With this method, the actual speed-up matches the attempted value exactly at speeds below 4x.

while (1) {
    // Busy-wait until one full tick of real time has elapsed
    do {
        gettimeofday(&_now, NULL);
    } while (time_diff() < _real_microseconds_per_tick);

    // Added variable for updating the display
    _last_time_diff = time_diff();  // Modified time_diff() to return microseconds
    _then = _now;
}
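
For comparison, the "sleep for the remainder" idea can also be written against absolute deadlines, which avoids spinning a core. This is a minimal sketch of that variant and not the change that was merged into NOS3. The names run_fixed_rate and do_stuff are hypothetical, and it uses std::chrono::steady_clock (monotonic) in place of gettimeofday.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

using Clock = std::chrono::steady_clock;

// Runs `ticks` iterations at one tick per `period` and returns the total
// real time taken. Hypothetical helper for illustration only.
Clock::duration run_fixed_rate(int ticks, std::chrono::microseconds period,
                               const std::function<void()>& do_stuff)
{
    assert(ticks >= 0 && period.count() > 0);
    const Clock::time_point start = Clock::now();
    Clock::time_point next_tick = start;

    for (int i = 0; i < ticks; ++i) {
        do_stuff();  // per-tick work; its duration is absorbed by the deadline

        // Advance an absolute deadline instead of sleeping a fixed amount,
        // so the time spent in do_stuff() does not accumulate as drift.
        next_tick += period;

        // Returns immediately if the deadline has already passed.
        std::this_thread::sleep_until(next_tick);
    }
    return Clock::now() - start;
}
```

If do_stuff() regularly takes longer than one period, the loop falls behind and never sleeps, so this only addresses the drift described above and not whatever caps the speed at 4x. Sleep wake-up latency still applies, which is the trade-off against the busy-wait version.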

jlucas9 commented 1 week ago

Merged into dev! Will continue to look for performance improvements and happy to accept recommendations!