Open yarikoptic opened 13 years ago
Matt, it looks like there is some significant 32bit vs. 64bit weirdness in the LDA code. Do you know what it is?
-John
On 06/03/2011 05:06 PM, yarikoptic wrote:
I will send the entire log by email, but here is the relevant excerpt:
    RunTests: test 9: stdout OK
    12,20c12,20
    < -329633465.333333 -329633465.333333 3 3.0 unknown 0.0000 38
    < -595057498.000000 -860481530.666667 6 6.0 unknown 0.0000 14
    < -613376496.363636 -635359294.400000 11 11.0 unknown 0.0000 32
    < -1159652955.090909 -1705929413.818182 22 22.0 unknown 0.0000 2
    < -1017516755.340909 -875380555.590909 44 44.0 unknown 0.0000 166
    < -1043034866.712644 -1069146422.534884 87 87.0 unknown 0.0000 29
    < -879457718.436782 -715880570.160920 174 174.0 unknown 0.0000 17
    < -1082324919.413793 -1285192120.390805 348 348.0 unknown 0.0000 2
    < -1302478374.419540 -1522631829.425287 696 696.0 unknown 0.0000 143
    ---
    > 10.296875 10.296875 3 3.0 unknown 0.0000 38
    > 10.437155 10.577434 6 6.0 unknown 0.0000 14
    > 10.347227 10.239314 11 11.0 unknown 0.0000 32
    > 10.498632 10.650038 22 22.0 unknown 0.0000 2
    > 10.495566 10.492500 44 44.0 unknown 0.0000 166
    > 10.469184 10.442189 87 87.0 unknown 0.0000 29
    > 10.068007 9.666830 174 174.0 unknown 0.0000 17
    > 9.477440 8.886873 348 348.0 unknown 0.0000 2
    > 9.020482 8.563524 696 696.0 unknown 0.0000 143
    26c26
    < average loss = -1.275e+09
    ---
    > average loss = 8.804
    RunTests: test 9: FAILED: stderr(stderr.tmp) != ref(train-sets/ref/wiki1K.stderr):
This seems to happen only when building on 32-bit; the 64-bit build looks fine:
    RunTests: test 9: stdout OK
    18c18
    < 10.068007 9.666831 174 174.0 unknown 0.0000 17
    ---
    > 10.068007 9.666830 174 174.0 unknown 0.0000 17
    RunTests: test 9: minor (<0.0001) precision differences ignored
    RunTests: test 9: stderr OK
Updated.
Minor differences in floating point numbers should be expected, because we use -ffast-math when compiling.
-John
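For readers unfamiliar with how such small differences get ignored, here is a minimal sketch of a tolerance-based comparison in the spirit of the "minor (<0.0001) precision differences ignored" message. The function name `nearly_equal` and the use of an absolute (rather than relative) tolerance are assumptions made for illustration; this is not the actual RunTests logic.

```cpp
#include <cmath>

// Hypothetical helper, not taken from RunTests: treats two values as equal
// when their absolute difference is below `tol`. The default of 1e-4 mirrors
// the "<0.0001" threshold quoted in the test output; whether the real check
// is absolute or relative is an assumption here.
bool nearly_equal(double a, double b, double tol = 1e-4) {
    return std::fabs(a - b) < tol;
}
```

Under this sketch, the 64-bit pair 9.666831 and 9.666830 compares as equal, while the 32-bit value -329633465.333333 against the reference 10.296875 does not.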
On Wed, Jun 29, 2011 at 7:45 PM, Matt Hoffman mdhoffma@cs.princeton.edu wrote:
Figured it out. (Sorry for the delay.)
In line 367 of lda_core.cc, we need to change
    float kl = -global.lda_mylgamma(global.lda_alpha);
to
    float kl = -(global.lda_mylgamma(global.lda_alpha));
The issue seems to be that on 32-bit machines negating an unsigned int causes behavior that resembles underflow. Why this doesn't give us trouble on 64-bit machines is a bit of a puzzle, but one I'm basically happy to leave a mystery.
The test output still isn't identical to wiki1K.stderr, but it looks reasonable. (This happens on 64-bit machines too.) I'll look into what's going on there.
Matt
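As a general illustration of what negating an unsigned integer does in C++, here is a small sketch. It is not the lda_core.cc code: the thread does not show the return type of `global.lda_mylgamma`, and both function names below are hypothetical. It also uses `std::uint32_t`, which has the same width on 32-bit and 64-bit builds, so it does not by itself explain the platform difference.

```cpp
#include <cstdint>

// Hypothetical illustration only. Unary minus on an unsigned value wraps
// modulo 2^32, so the result is a huge positive number, which is then
// converted to float.
float negate_then_convert(std::uint32_t n) {
    return static_cast<float>(-n);
}

// Converting to float first and negating afterwards gives the expected
// negative value.
float convert_then_negate(std::uint32_t n) {
    return -static_cast<float>(n);
}
```

With an input of 3, the first function yields a value near 4.29e9 and the second yields -3, which shows how the order of negation and conversion can change a result by many orders of magnitude.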