Hey, I found a nasty vectorization bug. Look at this code:
define LEN 250000
let comp ttt1() =
  times LEN {
    x <- takes 8;
  }
In
It should take 250000 * 8 = 2000000 items. And indeed it does, without vectorization:
EXTRAOPTS='' ../../scripts/preprocesscompile-vs.sh test.zir test.out
Total input items (including EOF): 2000000 (64000000 B), output items: 0 (0 B)
But with vectorization enabled it doesn’t:
EXTRAOPTS='--vectorize --autolut --native-mitigators' ../../scripts/preprocesscompile-vs.sh test.zir test.out
Total input items (including EOF): 250000 (8000000 B), output items: 0 (0 B)
Basically, what seems to happen is that the vectorizer splits times into an outer loop and an inner loop, but the takes is executed only once per inner loop instead of once per inner-loop iteration!
I’ve checked the test into /tests/bugs on branch WiFi/cca (make sure you run ttt1() and not ttt(), which doesn’t show the bug because it doesn’t get vectorized).
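To make the arithmetic behind that hypothesis concrete, here is a small Python model of the two item counts. It is only a sketch of the suspected behaviour, not the compiler's actual transformation: the function names and the `inner` split factor are made up for illustration. Under this model an inner factor of 8 would reproduce the observed count, but the report does not confirm which factor the vectorizer actually picks.

```python
LEN = 250_000   # iterations of `times`, from the report
TAKE = 8        # items consumed by each `takes 8`

def expected_items(length=LEN, take=TAKE):
    """Items consumed when the body runs once per iteration."""
    return length * take

def suspected_items(inner, length=LEN, take=TAKE):
    """Hypothetical model of the bug: the loop is split into
    length // inner outer iterations, but the body (one `takes`)
    runs only once per outer iteration instead of `inner` times."""
    outer = length // inner
    return outer * take

print(expected_items())    # 2000000, matches the non-vectorized run
print(suspected_items(8))  # 250000, matches the vectorized run
```

If the model is right, the missing factor between the two runs (2000000 / 250000 = 8) is exactly the inner loop count.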