NicMcPhee opened 1 year ago
I came back to this while playing with implementations of two new PushGP problems (Smallest and Median), where it was often the case that while the errors on individual training cases fit in an `i64`, adding up 100 of them wraps around, and I sometimes ended up with negative total errors (when all errors should be ≥ 0).
This is bad on several levels, including the fact that selectors that depend on total error (like `Best` and `Tournament`) then "like" these (very bad) individuals and select them enthusiastically. It doesn't affect `Lexicase` (yay!), but since I use `Best` to pick one to display at each generation, that display isn't super helpful.
As mentioned in the initial issue, the problem is that we use `.sum()` to add up all the errors in several places in `test_results.rs`, and `.sum()` wraps. Saturating seems to be the right answer, but `.sum()` isn't supported on `Saturating` types. There have been discussions in the Rust world about implementing `Sum` for `Saturating` types, but they appear to have gotten stuck on questions of how to handle short-circuiting (which probably isn't a terribly important issue for us).[^1]
[^1]: There's also an interesting question of how to handle things like `1i64 + i64::MAX + (-2i64)`. The OP on the linked thread is really only concerned with unsigned types, so this doesn't come up for them. As we've used `Score` and `Error` so far, they are in fact effectively unsigned types, but I've used signed types like `i64` to hold them. This might suggest that this wasn't a great choice? Maybe there's another issue to be created here?
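As an aside, even without a `Sum` impl, std's `Saturating` wrapper does implement `Add`, so a `fold` gives a saturating total today. A minimal sketch (the function name is mine, not something from the codebase):

```rust
use std::num::Saturating;

// Saturating total via fold: `Saturating<i64>` implements `Add`,
// so each step clamps at i64::MAX instead of wrapping around.
fn saturating_total(errors: &[i64]) -> i64 {
    errors
        .iter()
        .map(|&e| Saturating(e))
        .fold(Saturating(0i64), |acc, e| acc + e)
        .0
}

fn main() {
    // Two near-maximal errors plus a small one clamp at i64::MAX
    // instead of going negative.
    let errors = [i64::MAX / 2, i64::MAX / 2, 100];
    assert_eq!(saturating_total(&errors), i64::MAX);
}
```

The downside is that every call site has to remember to do the wrap-into-`Saturating` dance, which is exactly the kind of thing `.sum()` is supposed to hide.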
I'm not sure how best to handle this. At the moment I have a hack in my version of Smallest that simply checks whether the total is negative and changes it to `::MAX` if it is. This isn't great, though, because (a) it would have to be done on a problem-by-problem basis, and (b) it's (dimly) possible that we could wrap so far that we end up positive again, and a check like this wouldn't be able to tell.
As suggested in the initial issue, one solution might be to decouple the type of a single error from the type of the sum of all the errors. That way we could have a "larger" type for the sum (like `i128` or `apint`) that will hopefully guarantee that we won't have this wrapping problem.
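For concreteness, here's what a widened total could look like if individual errors stay `i64` (a sketch, not the actual `test_results.rs` code): even 100 worst-case `i64` errors are nowhere near overflowing an `i128`.

```rust
// Widen each i64 error to i128 before summing; 100 copies of i64::MAX
// total about 9.2e20, far below i128::MAX (about 1.7e38).
fn total_error(errors: &[i64]) -> i128 {
    errors.iter().map(|&e| i128::from(e)).sum()
}

fn main() {
    let errors = vec![i64::MAX; 100];
    assert_eq!(total_error(&errors), 100 * i128::from(i64::MAX));
}
```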
Another approach would be to implement some sort of saturating sum for error types since after some point we probably don't care how huge errors are (assuming there are individuals with less huge errors in the population).
These notes are based on #185, which I closed as a duplicate of this.
If we want to decouple the types, @JustusFluegel pointed out on Discord that we could default the two types to being the same, since that might often be the case. So something like `TestResults<R, Total = R>`.
It will be interesting to see how badly this disrupts the rest of the code. A lot depends on `TestResults`, but providing a default value for the `Total` generic might reduce the spread of the change.
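A rough sketch of the defaulted generic (the field names here are my invention, not the real `TestResults` layout):

```rust
// Hypothetical shape: per-case results of type R, total of type Total,
// which defaults to R so existing single-type code keeps compiling.
struct TestResults<R, Total = R> {
    results: Vec<R>,
    total: Total,
}

fn main() {
    // Existing usage: one type parameter, Total defaults to R.
    let same: TestResults<i64> = TestResults { results: vec![1, 2], total: 3 };
    // Opt-in widening: i64 per-case errors, i128 total.
    let wide: TestResults<i64, i128> = TestResults { results: vec![1, 2], total: 3 };
    assert_eq!(same.total, 3i64);
    assert_eq!(wide.total, 3i128);
}
```

Since `TestResults<i64>` still means what it meant before, code that never touches the total type shouldn't need to change at all.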
We could have a custom implementation of `Sum` for the appropriate type that would use saturating addition instead of wrapping.
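While the orphan rule stops us from implementing `Sum` for `Saturating<i64>` itself, we can implement it for a type we own, which lets `.sum()` saturate. A sketch with a hypothetical newtype name:

```rust
use std::iter::Sum;

// Hypothetical newtype for a saturating total; the name is an assumption.
#[derive(Debug, PartialEq, Eq)]
struct SaturatingTotal(i64);

// Summing an iterator of i64 into a SaturatingTotal clamps at i64::MAX
// instead of wrapping.
impl Sum<i64> for SaturatingTotal {
    fn sum<I: Iterator<Item = i64>>(iter: I) -> Self {
        SaturatingTotal(iter.fold(0i64, |acc, e| acc.saturating_add(e)))
    }
}

fn main() {
    let errors = vec![i64::MAX, 1, 2];
    // The turbofish-free `.sum()` call picks our impl via the annotation.
    let total: SaturatingTotal = errors.into_iter().sum();
    assert_eq!(total, SaturatingTotal(i64::MAX));
}
```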
If a set of individual errors is large enough, their sum can currently overflow because `.sum()` doesn't do checked addition. Some options include `i128` for the total error if the individual errors are `i64`.

I kinda like the last one, although I suspect using an `i128` would also work, at least for `i64` errors. If we wanted `i128` errors, though, `apint` would be the better choice.