…some error prone behavior in streaming and certain tiling patterns, as the leading zero is only correct if mpx is run starting from index 0 rather than some other point in the time series.
The tolerance switch to e-8 is also present. I found in rare cases, e-10 was too tight here. Typically absolute tolerance would be low, but relative tolerance would be just over the threshold on 1-2 elements out of roughly 16000. FMA might make a difference, but we aren't using it here.
…some error prone behavior in streaming and certain tiling patterns, as the leading zero is only correct if mpx is run starting from index 0 rather than some other point in the time series.
The tolerance switch to e-8 is also present. I found in rare cases, e-10 was too tight here. Typically absolute tolerance would be low, but relative tolerance would be just over the threshold on 1-2 elements out of roughly 16000. FMA might make a difference, but we aren't using it here.