nasa / trick

Trick Simulation Environment. Trick provides a common set of simulation capabilities and utilities to build simulations automatically.
Other
34 stars 19 forks source link

send_hs run stats I slightly bigger picture improvement. #1592

Closed amishscientist closed 10 months ago

amishscientist commented 11 months ago

I did write and close this issue. I still think it is a valid issue, but I think it would be best to show a bigger picture way to get the data people are after.

The issue with the rusage data is that it can be inadvertently misleading. To me it would be useful for trick to actually keep track of the amount of time the user jobs took and output it with the rusage data.

Here is an example of a sim. ROUTINE: Executive_loop_single_thread.cpp:98 DIAGNOSTIC: Reached termination time

   SIMULATION START TIME:        0.000
    SIMULATION STOP TIME:      500.000
 SIMULATION ELAPSED TIME:      500.000
    ACTUAL CPU TIME USED:       72.934
   SIMULATION / CPU TIME:        6.856
 INITIALIZATION CPU TIME:        2.303

But when you actually look into the job data you see this JOB_trick_real_time.rt_sync.rt_monitor.13.09(end_of_frame) 275.8200370 seconds. Which means it actually 2.23 times faster than real time not 6.856 . The reason for this is because when I look further into the job data You will see this job. JOB_THLA.federate.wait_for_time_advance_grant.48.14(environment_0.008) took 120.9328370 seconds it is an I/O job that isn't polling. That means it is artificially distorting the number by sleeping the thread while waiting on the data.

Not to mention that turning off rt_nap makes the number in rusage less useful.

What I am asking for is

Given all three it would make it much easier to track down when people have calls that are sleeping the thread on the main thread. Which happens a lot more often than you think.

Had a user seen trick reporting 2.23 vs the system reporting 6.856 It would help troubleshoot these issues much faster. Also, I don't want to just use the rt_sync.rt_monitor job time because I would like the this to give us this information when not running real time.

hchen99 commented 10 months ago

The stats look this after the update:

ROUTINE: Executive_freeze_loop.cpp:85 DIAGNOSTIC: Sim control Shutdown

       SIMULATION START TIME:        0.000
        SIMULATION STOP TIME:        2.500
     SIMULATION ELAPSED TIME:        2.500
          USER CPU TIME USED:        0.300
        SYSTEM CPU TIME USED:        0.102
       SIMULATION / CPU TIME:        6.223
INITIALIZATION USER CPU TIME:        0.267

INITIALIZATION SYSTEM CPU TIME: 0.084 SIMULATION RAM USAGE: 71.039MB (External program RAM usage not included!)