gfoidl closed this issue 6 years ago
Linux ubuntu 4.4.0-62-generic #83-Ubuntu SMP Wed Jan 18 14:10:15 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 4
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 158
Model name: Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
Stepping: 9
CPU MHz: 2808.002
BogoMIPS: 5616.00
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 6144K
NUMA node0 CPU(s): 0-3
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc pni pclmulqdq ssse3 cx16 sse4_1 sse4_2 x2apic movbe popcnt aes xsave avx rdrand hypervisor lahf_lm abm 3dnowprefetch avx2 rdseed clflushopt
0 0.815724735289837
3 0.826191062791331
4 0.829686811399171
1 0.819209958020078
2 0.822698756058058
3 0.826191062791331
0 0.815724735289837
2 0.822698756058058
1 0.819209958020078
3 0.826191062791331
2 0.822698756058058
0 0.815724735289837
2 0.822698756058058
2 0.822698756058058
3 0.826191062791331
2 0.822698756058058
5 0.833185934856398
2 0.822698756058058
3 0.826191062791331
999 2.15247343841125E-05
Linux 143fb24f23d3 4.4.0-111-generic #134~14.04.1-Ubuntu SMP Mon Jan 15 15:39:56 UTC 2018 x86_64 GNU/Linux
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0-31
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Model name: Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz
Stepping: 4
CPU MHz: 2800.076
BogoMIPS: 5600.15
Hypervisor vendor: Xen
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 25600K
NUMA node0 CPU(s): 0-7,16-23
NUMA node1 CPU(s): 8-15,24-31
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx rdtscp lm constant_tsc rep_good nopl xtopology eagerfpu pni pclmulqdq ssse3 cx16 pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm kaiser fsgsbase smep erms xsaveopt
0 0.815724735289837
3 0.826191062791331
4 0.829686811399171
1 0.819209958020078
2 0.822698756058058
3 0.826191062791331
0 0.815724735289837
2 0.822698756058058
1 0.819209958020078
3 0.826191062791331
2 0.822698756058058
0 0.815724735289837
2 0.822698756058058
2 0.822698756058058
3 0.826191062791331
2 0.822698756058058
5 0.833185934856398
2 0.822698756058058
3 1
999 1
The problem comes down to:
Standard Output Messages:
at System.Environment.get_StackTrace()
at gfoidl.Stochastics.SpecialFunctions.Erfc(Double[] values) in /root/repo/source/gfoidl.Stochastics/SpecialFunctions.cs
The values for tsusSqrt2Inv on CircleCI are:
0.164789272799277
0.155273049153506
0.152100974604915
0.161617198250687
0.158445123702096
0.155273049153506
0.164789272799277
0.158445123702096
0.161617198250687
0.155273049153506
0.158445123702096
0.164789272799277
0.158445123702096
0.158445123702096
0.155273049153506
0.158445123702096
0.148928900056325
0.158445123702096
0
0
So the last two values are 0, for which Erfc (correctly) returns 1. This means the problem is in https://github.com/gfoidl/Stochastics/blob/9c22392e49e50647339482c4add12804cfb30058/source/gfoidl.Stochastics/Statistics/OutlierDetection/ChauvenetOutlierDetection.cs#L101
When https://github.com/gfoidl/Stochastics/blob/9c22392e49e50647339482c4add12804cfb30058/source/gfoidl.Stochastics/Statistics/OutlierDetection/ChauvenetOutlierDetection.cs#L114 is set to false, the test passes.
OK, found the problem.
On the other test platforms Vector<double>.Count is 4 and the test array has 20 elements, so the SIMD loops (8x + 1x) process the whole array. The normal for-loop is never entered, so the problem does not show up.
On CircleCI Vector<double>.Count is 2, so the normal for-loop gets executed, and it indexes into the pointers that were already incremented in the SIMD loops. The access is therefore far outside the array bounds. unsafe caught me 😜 (i.e. with safe code there would have been an IndexOutOfRangeException).
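To make the failure mode concrete, here is a minimal safe-code sketch of the same mistake. It is not the library's code: the class and method names are hypothetical, a Span slice stands in for a pointer that the SIMD loops have already advanced, and the value 18 used below is only inferred from the output above (the last two of 20 values were wrong).

```csharp
using System;

public static class AdvancedPointerDemo
{
    // Hypothetical helper: returns true if indexing an already advanced view
    // with the running loop index i leaves the bounds of the array.
    public static bool ThrowsWhenIndexingAdvancedView(int n, int processedBySimd)
    {
        var data = new double[n];
        try
        {
            // The slice plays the role of a pointer advanced by the SIMD loops.
            Span<double> advanced = data.AsSpan(processedBySimd);

            // The remainder loop starts at i == processedBySimd, so the effective
            // index into the original array is 2 * processedBySimd.
            _ = advanced[processedBySimd];
            return false;
        }
        catch (IndexOutOfRangeException)
        {
            // Safe code reports the bad access; unsafe pointer code silently
            // reads and writes memory outside the array instead.
            return true;
        }
    }
}
```

Calling ThrowsWhenIndexingAdvancedView(20, 18) models the CircleCI case under these assumptions: the effective index is 36 in a 20-element array.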
So there are two possible fixes:
for (; i < n; ++i)
    *tsus++ = Abs(*zTrans++) * Sqrt2Inv;
or
for (; i < n; ++i)
    pTarget[i] = Abs(pSource[i]) * Sqrt2Inv;
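For context, the second variant can be sketched in safe managed code as a SIMD loop followed by an index-based remainder loop. This is an illustration under assumptions, not the code from ChauvenetOutlierDetection.cs: the class name, the method name AbsScale and the single non-unrolled SIMD loop are hypothetical, and Sqrt2Inv is assumed to be 1/sqrt(2).

```csharp
using System;
using System.Numerics;

public static class AbsScaleSketch
{
    // Assumed meaning of Sqrt2Inv from the issue: 1 / sqrt(2).
    private static readonly double Sqrt2Inv = 1.0 / Math.Sqrt(2.0);

    // Hypothetical helper: target[i] = |source[i]| * Sqrt2Inv.
    public static void AbsScale(double[] source, double[] target)
    {
        if (source.Length != target.Length)
            throw new ArgumentException("Arrays must have the same length.");

        int n = source.Length;
        int width = Vector<double>.Count;   // hardware dependent, e.g. 2 or 4
        int i = 0;

        // SIMD loop: processes full vectors only.
        for (; i <= n - width; i += width)
        {
            var v = new Vector<double>(source, i);
            (Vector.Abs(v) * Sqrt2Inv).CopyTo(target, i);
        }

        // Remainder loop: indexes from the start of the arrays with the running i,
        // so it does not depend on any pointer advanced by the loop above.
        for (; i < n; ++i)
            target[i] = Math.Abs(source[i]) * Sqrt2Inv;
    }
}
```

With array indexing, a wrong index would surface as an IndexOutOfRangeException instead of silently wrong results.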
Additionally, there should be a test case that forces the normal loop to run, i.e. an array size that is not a multiple of the SIMD size.
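One way such a test could look, as a framework-free sketch: generate lengths that always leave trailing elements after the full vectors, and compare the kernel under test with a scalar reference. All names here are hypothetical, the kernel is passed in as a delegate rather than taken from the library, and the reference assumes Sqrt2Inv is 1/sqrt(2).

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;

public static class RemainderCoverage
{
    // Assumed meaning of Sqrt2Inv from the issue: 1 / sqrt(2).
    private static readonly double Sqrt2Inv = 1.0 / Math.Sqrt(2.0);

    // Lengths that leave 1 .. Count-1 trailing elements after any number of
    // full vectors, so a scalar remainder loop has to run.
    public static IEnumerable<int> NonMultipleLengths(int maxVectors)
    {
        int width = Vector<double>.Count;
        for (int vectors = 0; vectors <= maxVectors; ++vectors)
            for (int rest = 1; rest < width; ++rest)
                yield return vectors * width + rest;
    }

    // Runs the kernel for every generated length and compares each element
    // with the scalar reference |x| * Sqrt2Inv.
    public static bool MatchesScalarReference(Action<double[], double[]> kernel, int maxVectors = 10)
    {
        foreach (int n in NonMultipleLengths(maxVectors))
        {
            var source = new double[n];
            for (int i = 0; i < n; ++i)
                source[i] = (i % 2 == 0 ? -1.0 : 1.0) * (i + 0.5);

            var actual = new double[n];
            kernel(source, actual);

            for (int i = 0; i < n; ++i)
            {
                double expected = Math.Abs(source[i]) * Sqrt2Inv;
                if (Math.Abs(actual[i] - expected) > 1e-12)
                    return false;
            }
        }
        return true;
    }
}
```

Because the lengths are derived from Vector<double>.Count at run time, the remainder path is exercised on every platform, regardless of whether the count is 2 or 4.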
Closing as https://github.com/gfoidl/Stochastics/pull/38 is merged.
See https://github.com/gfoidl/Stochastics/pull/36#issuecomment-360477559
This test fails on CircleCI; on all other test platforms it passes:
https://github.com/gfoidl/Stochastics/blob/9c22392e49e50647339482c4add12804cfb30058/tests/gfoidl.Stochastics.Tests/Statistics/OutlierDetection/ChauvenetOutlierDetectionTests/GetOutliers.cs#L10-L22