htm-community / htm.core

Actively developed Hierarchical Temporal Memory (HTM) community fork (continuation) of NuPIC. Implementation for C++ and Python
http://numenta.org
GNU Affero General Public License v3.0

Alpine build (musl) fails on determinism checks #707

Open breznak opened 4 years ago

breznak commented 4 years ago

I have briefly tested Alpine build on Docker (alpine:amd64-latest-stable) and

Related #659

breznak commented 4 years ago

CC @pepedocs

breznak commented 4 years ago

The Alpine build is now provided by our Dockerfile.

breznak commented 4 years ago

ExactOutput is also failing on ARM (on Alpine, using musl libc):

[ RUN      ] SpatialPoolerTest.ExactOutput
/usr/local/src/htm.core/src/test/unit/algorithms/SpatialPoolerTest.cpp:2100: Failure
Expected equality of these values:
columns
Which is: SDR( 200 ) 4, 6, 17, 85, 113, 125, 133, 153, 172, 173
gold_sdr
Which is: SDR( 200 ) 4, 64, 74, 78, 85, 113, 125, 126, 127, 153
[  FAILED  ] SpatialPoolerTest.ExactOutput (3270 ms)

So apparently our C++-fu is not yet 100% deterministic across platforms. @dkeeney how did you pin-point the stuff in your former c++ SP investigations?
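To see how far apart the two SDRs in the failure above are, the active indices can be compared as sets. This is only a quick sketch using the values copied from the log; the variable names `shared`, `only_actual` and `only_gold` are made up for illustration.

```python
# Active column indices copied from the ExactOutput failure log above.
columns = {4, 6, 17, 85, 113, 125, 133, 153, 172, 173}    # actual output in the failing run
gold_sdr = {4, 64, 74, 78, 85, 113, 125, 126, 127, 153}   # value the test expects

shared = sorted(columns & gold_sdr)       # indices active in both SDRs
only_actual = sorted(columns - gold_sdr)  # active only in the actual output
only_gold = sorted(gold_sdr - columns)    # active only in the expected output

print("shared:     ", shared)       # [4, 85, 113, 125, 153]
print("only actual:", only_actual)  # [6, 17, 133, 172, 173]
print("only gold:  ", only_gold)    # [64, 74, 78, 126, 127]
```

Five of the ten active columns agree and five differ on each side.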

dkeeney commented 4 years ago

> how did you pin-point the stuff in your former c++ SP investigations?

With a great deal of difficulty...

I first located the point (the cycle number) when they started to diverge. Then I inserted debug trace statements to try to identify the function in which they diverged in that cycle number (actually it turned out to be in the previous cycle). That meant writing lots of debug trace statements into a log file and performing a diff. Then I had some good luck in that I hit on the cause.
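The "trace to a log file, then diff" step described above could look roughly like the following. This is a minimal sketch, not the tooling that was actually used: the function `first_divergence` and the trace line format are hypothetical, and a real run would read the lines from the two log files.

```python
from itertools import zip_longest

def first_divergence(trace_a, trace_b):
    """Return (line_index, line_a, line_b) for the first mismatch, or None.

    If one trace is shorter, the missing line is reported as None on that side.
    """
    for index, (a, b) in enumerate(zip_longest(trace_a, trace_b)):
        if a != b:
            return index, a, b
    return None

# Hypothetical trace lines from two runs that should have been identical.
trace_run_a = [
    "cycle=0 fn=compute active=3",
    "cycle=1 fn=compute active=5",
    "cycle=2 fn=compute active=7",
]
trace_run_b = [
    "cycle=0 fn=compute active=3",
    "cycle=1 fn=compute active=6",
    "cycle=2 fn=compute active=9",
]

print(first_divergence(trace_run_a, trace_run_b))
```

The first differing line only marks where the divergence becomes visible. As noted above, the cause can sit in an earlier cycle, so the trace usually has to be made finer-grained and the comparison repeated.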

breznak commented 4 years ago

Hmm, this will be a pain; it could be a bug in libstdc, the compiler, ... I'm not sure how far we want to pursue multiplatform deterministic builds (well, identical builds; we could just have "deterministic results per platform"). I suspect the problem is in glibc vs. musl libc, as the same error happens on both amd64 and arm64 with musl.

breznak commented 4 years ago

With #736, all tests (incl. determinism) are passing on CI for musl (its own expected results were added), but the results are still not the same for glibc and musl libc.
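One way to picture "deterministic results per platform" is a table of expected values keyed by libc. The sketch below is an illustration only and does not claim to show how #736 does it (the htm.core unit tests are C++). The names `GOLD_BY_LIBC`, `libc_family` and `expected_columns` are hypothetical, the values are placeholders copied from the failure log earlier in this thread, and attributing the test's original gold values to glibc is an assumption.

```python
import platform

# Placeholder expected outputs per libc family, copied from the failure log.
# Assumption: the original gold values came from a glibc build; the thread
# does not confirm this.
GOLD_BY_LIBC = {
    "glibc": [4, 64, 74, 78, 85, 113, 125, 126, 127, 153],
    "musl": [4, 6, 17, 85, 113, 125, 133, 153, 172, 173],
}

def libc_family(reported_name):
    """Map a libc name as reported by the platform to a key in GOLD_BY_LIBC."""
    # Simplification: anything not reported as glibc is treated as musl.
    return "glibc" if reported_name == "glibc" else "musl"

def expected_columns(reported_name=None):
    """Return the expected active columns for the given (or current) libc."""
    if reported_name is None:
        # platform.libc_ver() returns a (name, version) tuple of strings.
        reported_name, _version = platform.libc_ver()
    return GOLD_BY_LIBC[libc_family(reported_name)]

print(expected_columns("glibc"))
print(expected_columns(""))  # empty name, e.g. when no glibc is detected
```

The trade-off is the one raised above: each platform stays deterministic on its own, but the expected values have to be maintained per libc and the outputs are no longer identical across builds.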