sergicuen closed this issue 2 years ago
The application runtime under the error-injection tool depends on the kernel being instrumented. The best approach is what you suggested: profile a dummy error-injection campaign (using https://github.com/NVlabs/nvbitfi/blob/master/injector/Makefile#L16). The scripts currently assume that the application runtime with instrumentation (of one kernel in the application) will be 10x the uninstrumented runtime; see https://github.com/NVlabs/nvbitfi/blob/master/scripts/params.py#L31. If you use the profiled run from an injection campaign as the baseline, you may want to lower this threshold to 2x (i.e., a hang is detected if the application runs 2x longer than anticipated).
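To make the threshold logic above concrete, here is a minimal sketch of how such a hang check could be expressed. The function and parameter names here are illustrative, not the actual identifiers in nvbitfi's scripts/params.py; only the multipliers (10x default, 2x for a campaign-profiled baseline) come from the discussion above.

```python
def is_hang(elapsed_seconds, expected_runtime, threshold=10.0):
    """Flag a run as a hang if it exceeds threshold * expected_runtime.

    threshold=10.0 reflects the assumption that an instrumented run takes
    up to 10x the uninstrumented runtime; use ~2.0 when expected_runtime
    was itself profiled from an (instrumented) injection campaign.
    NOTE: hypothetical helper, not part of nvbitfi.
    """
    return elapsed_seconds > threshold * expected_runtime

# Example: baseline of 83.7 s profiled from an instrumented run, 2x threshold
print(is_hang(elapsed_seconds=200.0, expected_runtime=83.7, threshold=2.0))
```

With a campaign-profiled baseline the 2x threshold keeps genuine hangs detectable without flagging ordinary instrumented runs as Timeouts.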
Hi all, I am trying to estimate the Expected_runtime used to define the Timeout fault. The Expected_runtime measured during a normal execution of the application is usually much shorter than the runtime measured by the tool when injecting faults (I guess the nvbitfi instrumentation causes the delay).
E.g.: Inj_count=1, App=mEle_Sz256_Blk32, Mode=inst_value, Group=7, EM=0, Time=83.747101, Outcome: Masked: other reasons
So using the normal-execution time in the list of apps in params.py produces a lot of Timeouts. The other option is to use the maximum Time obtained in a DUMMY campaign, but in that case no Timeouts at all appear in the results. What is the right way to estimate the Expected_runtime? Thank you in advance
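Following the maintainer's suggestion, one way to get the baseline is to parse the per-injection Time= values from the dummy-campaign output and pair the worst one with the tighter 2x threshold. This is a sketch under the assumption that result lines look like the example above; the parsing helper is hypothetical and not part of nvbitfi.

```python
import re

def parse_times(log_lines):
    """Extract the Time=<seconds> field from each campaign result line.

    Assumes result lines like:
      'Inj_count=1, App=..., Time=83.747101, Outcome: ...'
    (hypothetical helper, not part of nvbitfi).
    """
    times = []
    for line in log_lines:
        m = re.search(r"Time=([0-9.]+)", line)
        if m:
            times.append(float(m.group(1)))
    return times

log = ["Inj_count=1, App=mEle_Sz256_Blk32, Mode=inst_value, Group=7, EM=0, "
       "Time=83.747101, Outcome: Masked: other reasons"]
expected = max(parse_times(log))  # worst observed instrumented runtime
timeout = 2.0 * expected          # 2x threshold, since baseline already includes instrumentation
print(round(timeout, 2))
```

Because the dummy-campaign maximum already includes the instrumentation overhead, the 10x default would make Timeouts nearly impossible to trigger, which matches the "no Timeouts in the results" observation.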