Some microarchitectures have instructions with 0-cycle latency, i.e., they can forward their result to an instruction issued in the same cycle (e.g., X+str on Cortex-M7). If that is the case, we need to make sure that the consumer is placed after the producer in the output.
Prior to this change, SLOTHY could produce wrong code in this case.
See https://github.com/slothy-optimizer/slothy/commit/a73d7505d67342e8b2deab03414e863673ad2751
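
The ordering requirement described above can be illustrated with a minimal, standalone sketch. It is not SLOTHY's implementation or API: the `Dep` class, the `violations` helper, and the `cycle`/`position` dictionaries are hypothetical names chosen for this example. It only shows why a latency check on issue cycles is not enough when the latency is 0, and why the position in the output must be checked as well.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dep:
    """Hypothetical data dependency: `consumer` reads what `producer` writes."""
    producer: str
    consumer: str
    latency: int  # cycles until the result can be consumed


def violations(deps, cycle, position):
    """Return the dependencies that a schedule breaks.

    cycle    -- maps instruction name to its issue cycle
    position -- maps instruction name to its index in the emitted code
    """
    bad = []
    for d in deps:
        # Usual latency constraint on issue cycles.
        too_early = cycle[d.consumer] < cycle[d.producer] + d.latency
        # With latency 0 the constraint above allows both instructions in
        # the same cycle, so the order in the output must be checked too.
        wrong_order = position[d.consumer] <= position[d.producer]
        if too_early or wrong_order:
            bad.append(d)
    return bad


# Producer and consumer share a cycle, which a 0-cycle latency permits.
deps = [Dep("add", "str", latency=0)]
same_cycle = {"add": 0, "str": 0}

ok = violations(deps, same_cycle, {"add": 0, "str": 1})      # empty list
broken = violations(deps, same_cycle, {"add": 1, "str": 0})  # reports add -> str
```

In the second call the cycle constraint alone is satisfied, yet the consumer is emitted before the producer, which is the situation this change is meant to rule out.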