Found by Taylor, who writes:
"I realized the other day that the logit function I put in backend/lite/utils.py is not good near 1/2. For inputs between logistic(-1) = e^{-1}/(1 + e^{-1}) and logistic(+1) = 1/(1 + e^{-1}), it should be evaluated using the identity log(1/(1 - p)) = -log((1 - p)/p) = -log(1 + (1 - p)/p - 1) = -log(1 + (1 - p - p)/p) = -log(1 + (1 - 2p)/p) which can be computed safely using the log1p function by -log1p((1 - 2p)/p). The error of evaluating log(1/(1 - p)) directly can be quite large, while the error of evaluating -log1p((1 - 2p)/p) is bounded by 10eps. E.g., for p = .499999999, naive evaluation of log(1/(1 - p)) is wrong in about half the digits. I sketched some proofs and wrote a few more tests here:
(I chose to evaluate logistic(x) by e^x/(1 + e^x) for negative inputs there, but on reflection I see that it doesn't make a difference: for positive or negative inputs, the relative error of 1/(1 + e^{-x}) is bounded by 8eps + 4eps^2, since e^{-x}/(1 + e^{-x}) <= 2, which is exactly the same bound I computed for e^x/(1 + e^x). Maybe I'll just remove that case and reduce the amount of math to verify.)"
Found by Taylor, who writes: "I realized the other day that the logit function I put in backend/lite/utils.py is not good near 1/2. For inputs between logistic(-1) = e^{-1}/(1 + e^{-1}) and logistic(+1) = 1/(1 + e^{-1}), it should be evaluated using the identity log(1/(1 - p)) = -log((1 - p)/p) = -log(1 + (1 - p)/p - 1) = -log(1 + (1 - p - p)/p) = -log(1 + (1 - 2p)/p) which can be computed safely using the log1p function by -log1p((1 - 2p)/p). The error of evaluating log(1/(1 - p)) directly can be quite large, while the error of evaluating -log1p((1 - 2p)/p) is bounded by 10eps. E.g., for p = .499999999, naive evaluation of log(1/(1 - p)) is wrong in about half the digits. I sketched some proofs and wrote a few more tests here:
http://git.savannah.gnu.org/cgit/mit-scheme.git/tree/src/runtime/arith.scm#n1992
http://git.savannah.gnu.org/cgit/mit-scheme.git/tree/tests/runtime/test-arith.scm#n193
(I chose to evaluate logistic(x) by e^x/(1 + e^x) for negative inputs there, but on reflection I see that it doesn't make a difference: for positive or negative inputs, the relative error of 1/(1 + e^{-x}) is bounded by 8eps + 4eps^2, since e^{-x}/(1 + e^{-x}) <= 2, which is exactly the same bound I computed for e^x/(1 + e^x). Maybe I'll just remove that case and reduce the amount of math to verify.)"