Open breznak opened 4 years ago
CC @pepedocs
Alpine build is now provided by our Dockerfile
ExactOutput is also failing on ARM (on Alpine, which uses musl libc)
[ RUN      ] SpatialPoolerTest.ExactOutput
/usr/local/src/htm.core/src/test/unit/algorithms/SpatialPoolerTest.cpp:2100: Failure
Expected equality of these values:
  columns
    Which is: SDR( 200 ) 4, 6, 17, 85, 113, 125, 133, 153, 172, 173
  gold_sdr
    Which is: SDR( 200 ) 4, 64, 74, 78, 85, 113, 125, 126, 127, 153
[  FAILED  ] SpatialPoolerTest.ExactOutput (3270 ms)
So apparently our C++-fu is not yet 100% deterministically secure. @dkeeney how did you pin-point the stuff in your former c++ SP investigations?
> how did you pin-point the stuff in your former c++ SP investigations?
With a great deal of difficulty...
I first located the point (the cycle number) at which they started to diverge. Then I inserted debug trace statements to identify the function in which they diverged in that cycle (it actually turned out to be in the previous cycle). That meant writing lots of debug trace statements into a log file on each platform and running a diff on the two logs. Then I had some good luck in that I hit on the cause.
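The trace-and-diff approach described above could be sketched roughly as follows. This is only an illustration, not the code used in the original investigation: `traceLine` and `firstDivergingCycle` are hypothetical helpers, and the assumption is that each platform writes one trace line per compute cycle, so that the first mismatching line gives the cycle where the runs diverge.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: format one trace line for a cycle, listing the
// active column indices. Each platform would append these lines to a log.
std::string traceLine(std::size_t cycle, const std::vector<unsigned>& active) {
  std::ostringstream out;
  out << "cycle=" << cycle << " active=";
  for (std::size_t i = 0; i < active.size(); ++i) {
    if (i > 0) out << ',';
    out << active[i];
  }
  return out.str();
}

// Hypothetical helper: compare two traces (one line per cycle) and return
// the index of the first line that differs, or -1 if the traces are equal.
// If one trace is a strict prefix of the other, the shorter length is returned.
std::ptrdiff_t firstDivergingCycle(const std::vector<std::string>& a,
                                   const std::vector<std::string>& b) {
  const std::size_t n = std::min(a.size(), b.size());
  for (std::size_t i = 0; i < n; ++i) {
    if (a[i] != b[i]) return static_cast<std::ptrdiff_t>(i);
  }
  if (a.size() != b.size()) return static_cast<std::ptrdiff_t>(n);
  return -1;
}
```

As noted above, the cause may sit in the cycle before the first visible difference, so it is worth tracing intermediate state (not only the final active columns) around the reported index.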
hmm.. this will be a pain; it could be a bug in libstdc, the compiler, ... I'm not sure how far we want to pursue multiplatform deterministic builds (well, identical builds; we could just have "deterministic results per platform"). I suspect the problem is in glibc/musl C, as the same error happens on both amd64 and arm64 on musl.
With #736 all tests (incl. determinism) are passing on CI for MUSL (added its custom results), but the results are not the same for GLIBC and MUSL libc.
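One way to keep "deterministic results per platform" is to select the expected values by C library at compile time. The sketch below is an assumption about how that could look, not a description of what #736 does: `expectedActiveColumns` is a hypothetical name, the glibc values are taken from `gold_sdr` in the failure log above, and the fallback values are the `columns` output observed on Alpine. Note that musl intentionally defines no identifying macro, so the sketch treats "not glibc" as musl, which is only valid if those are the two targets.

```cpp
#include <cstdlib>  // any libc header makes __GLIBC__ visible on glibc systems
#include <vector>

// Hypothetical helper: expected active columns for SpatialPoolerTest.ExactOutput,
// chosen per C library. Values are copied from the failure log in this thread.
std::vector<unsigned> expectedActiveColumns() {
#if defined(__GLIBC__)
  // Assumed to be the existing glibc gold values (gold_sdr in the log).
  const unsigned gold[] = {4, 64, 74, 78, 85, 113, 125, 126, 127, 153};
#else
  // Assumption: anything that is not glibc is musl (observed on Alpine).
  const unsigned gold[] = {4, 6, 17, 85, 113, 125, 133, 153, 172, 173};
#endif
  return std::vector<unsigned>(gold, gold + sizeof(gold) / sizeof(gold[0]));
}
```

The trade-off is that the test then only guards against regressions within one platform; it no longer asserts that glibc and musl builds agree.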
I have briefly tested Alpine build on Docker (alpine:amd64-latest-stable) and
SpatialPoolerTest.ExactOutput
Related #659