Closed Asa-Hopkins closed 2 years ago
Very interesting! It will take me a while to verify the results, but it looks like you gave it careful attention. Thanks for the nice improvement.
As written, line 28 is producing a NaN, when dist or anglePosCos is zero, for me: var contrib = 1.0f / dist * anglePosCos;
This also happens, when maxSumMag, line 34, is zero.
Note: I think this only happens when populationSensorRadius is zero. Maybe we need another assert to guard against this?
With the following two tests, the proposed changes give the same results as the old code:
Grid is 5x5, with a population sensor radius of 2.0f 5 Indivs at { (1,0), (1,1), (1,2), (1,3), (1,4) } test1: grid.GetPopulationDensityAlongAxis( Coord(2,2), Compass.W ) = 0.5833333f test2: grid.GetPopulationDensityAlongAxis( Coord(2,2), Compass.N ) = 0.5f
As written, line 28 is producing a NaN, when dist or anglePosCos is zero, for me: var contrib = 1.0f / dist * anglePosCos;
Doesn't this PR remove this line? And also both of those variables?
Yes, this PR removes both variables, and the NaN they may produce.
Though, the new code will now produce a NaN, at lines 25 and 26, when we pass the CENTER direction into it. This may happen from two places in getSensor.cpp, line 296 and 300: sensorVal = getPopulationDensityAlongAxis(loc, lastMoveDir); sensorVal = getPopulationDensityAlongAxis(loc, lastMoveDir.rotate90DegCW());
And both versions still have a problem with a zero population sensor radius, which we can enforce at another level.
You're right about it returning NaN when d=4
, I had overlooked that. The behaviour of raySameness
is to return 1.0
in this case, I'll get this fixed so it effectively does the same.
As far as I can tell, this situation never actually happens, I think that lastMoveDir
is only updated when movement actually occurs (as it's updated in Peeps::drainMoveQueue()
), meaning it should never be 4.
At initialisation it's set using Dir::random8()
too, but I could see situations where passing d=4
into these functions could be useful so it's definitely worth making it behave well.
The d=4
case should be handled now, as far as I can see there's no avoiding one square root in that case, however the check for which case to use can be done outside the main loop and as far as I can tell has no real impact on performance.
Thanks for the nice optimization and the thought you've put into this.
The proposed way to handle direction == 4 (symbolically Compass::CENTER) looks robust, but I'm wondering if we should handle it at all. We're talking about two functions that calculate some sort of density along an axis. Each function takes a location and a direction to define an axis. If the direction == Compass::CENTER, then there is no defined axis, and the whole concept of density has no meaning. Would it make sense to just say that it's invalid to call these functions with an undefined axis add an assert() to check for it?
That makes sense to me, I was imagining it could be used as a nearby population count when direction == Compass::CENTER
, but I suppose without any information about direction then it wouldn't be very useful.
These two functions both calculate
cos(t)/r
, where t is the angle betweenoffset
anddir
andr
is the length ofoffset
. This method achieves that by calculating a unit vector in the direction ofdir
, which is then accessed byf
without recalculating each call. The dot product of this vector withoffset
is then justcos(t)*r
, since the other vector is of unit length. Findingcos(t)/r
then just requires dividing byr^2
.On my machine, this reduces the time spend in these functions by around 40%, and this speeds up the entire program by around 25% (with video encoding disabled and other settings default), but obviously this is dependent on how often these signals are being used.
The output of this function is within 10^(-8) of the output of the original, with the difference coming largely from the fact that
raySameness
only works to float precision, whereas it seems intended for this function to work to double precision.The comments mention that this function uses the "positive absolute cosine", which seems to imply that
|cos(t)|
should be being used instead. Doing that fails to reproduce the behaviour of the code being replaced so I've left it as is.