Consider dropping common blocks

This is for cases where the traced code generates more trace data than we can process.

Improving loop detection/compression would help, provided the overhead it adds outside of loops doesn't cause more problems than it solves.
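As a rough illustration of the compression idea, here is a minimal sketch that collapses consecutive repeats of the same block into a single (tag, count) entry. It only handles the simplest case, a single block looping back to itself; loops spanning several blocks would need real loop detection. The function name and the example addresses are made up for illustration and are not taken from drgat or rgat.

```python
from itertools import groupby


def compress_runs(tags):
    """Collapse consecutive repeats of a block tag into (tag, count) pairs."""
    return [(tag, sum(1 for _ in run)) for tag, run in groupby(tags)]


# Hypothetical trace: one block executes four times in a row.
trace = [0x401000, 0x401020, 0x401020, 0x401020, 0x401020, 0x401040]
compressed = compress_runs(trace)

for tag, count in compressed:
    print(hex(tag), "x", count)
```

Because the repeat count travels with the tag, this form keeps the execution counts intact while shrinking the amount of data sent.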

We could also track how many times a block has been sent to the visualiser and stop sending it after so many identical executions.
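A minimal sketch of that counter, assuming blocks are identified by some hashable ID and that "so many" is a fixed limit. The class name, the limit value and the example trace are all hypothetical, not existing drgat code.

```python
from collections import defaultdict


class BlockThrottle:
    """Stop forwarding a block once it has been sent `limit` times."""

    def __init__(self, limit):
        self.limit = limit
        self.sent = defaultdict(int)  # block ID -> times forwarded so far

    def should_send(self, block_id):
        if self.sent[block_id] >= self.limit:
            return False  # block is too hot, drop it
        self.sent[block_id] += 1
        return True


# Hypothetical trace where block "B" spins in a tight loop.
throttle = BlockThrottle(limit=3)
trace = ["A", "B", "B", "B", "B", "B", "C"]
forwarded = [b for b in trace if throttle.should_send(b)]
print(forwarded)
```

This keys on the block alone. Keying on the (block, target) pair instead would be closer to "identical executions", at the cost of a larger table.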

Example of a Cryptowall 1 loader sample:

[Image: cryptowall1badloop]

It doesn't do very much before it fills the trace buffers with a tight loop.

If we removed instrumentation from the worst offenders and re-enabled it when execution moved to different blocks, we would get a lot closer to native performance.

Problems with this approach: We are sacrificing edge counts, so accurate numbers on the heatmap are lost. drgat can send a notification that those blocks were too hot to handle though.

Bigger problem: If you remove instrumentation from multiple blocks with call [eax] terminators and one of them breaks the loop, the integrity of the control flow graph is compromised.

A softer approach would be to maintain instrumentation of the blocks but not send their tags to rgat until their target changes. This won't make drgat output much faster but it will stop us having a 500000+ item backlog in rgat.
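A minimal sketch of the softer approach, assuming each trace event is a (block, target) pair. The class and its fields are hypothetical. The suppressed counter is an extra assumption on my part: since the block stays instrumented, the dropped executions could still be counted and reported later.

```python
class TargetChangeFilter:
    """Forward a block's tag only when its target differs from last time."""

    def __init__(self):
        self.last_target = {}  # block ID -> most recent target seen
        self.suppressed = {}   # block ID -> number of events held back

    def should_send(self, block_id, target):
        seen = block_id in self.last_target
        if seen and self.last_target[block_id] == target:
            # Same block, same target: count it but don't forward it.
            self.suppressed[block_id] = self.suppressed.get(block_id, 0) + 1
            return False
        self.last_target[block_id] = target
        return True


# Hypothetical events: block "B" loops to itself three times, then exits to "C".
events = [("A", "B"), ("B", "B"), ("B", "B"), ("B", "B"), ("B", "C"), ("C", "D")]
flt = TargetChangeFilter()
forwarded = [e for e in events if flt.should_send(*e)]
print(forwarded, flt.suppressed)
```

The exit from the loop is still forwarded, so the control flow graph stays intact; only the repeated identical iterations are held back.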

ncatlin / rgat

Consider dropping common blocks #6