Closed Elucidation closed 1 year ago
Using np.short at 2 bytes, we are down from 290kb to 145kb, can we go smaller and single byte?
Building true heuristic
Built true heuristic grid in 579.28 ms
Dict with 116 keys,
Size: 151,264 bytes, one entry: 1248 dtype = int16
NxM = 26x24 = 624
Expected size ~NxMxZ = 26x24x116 = 72,384 values
For int16 = 2 bytes, so 2 bytes * 72,384 = 144,768 bytes
Measured 151264 vs 144768, diff = 104.49 %
Using np.int8 we are down to 1 byte
orld Shape: (26, 24), 44 robots, 48 item zones, 24 stations
Building true heuristic
Built true heuristic grid in 590.16 ms
Dict with 116 keys,
Size: 78,880 bytes, one entry: 624 dtype = int8
NxM = 26x24 = 624
Expected size ~NxMxZ = 26x24x116 = 72,384 values
For int8 = 1 bytes, so 1 bytes * 72,384 = 72,384 bytes
Measured 78880 vs 72384, diff = 108.97 %
And we've reached ~72kb from our previous 290kb, 4x savings
Strangely, using int8 seems to both slow down generation of the heuristic (by multiples!) and when running queries. Possibly due to numpy BLAS implementation
With np.int32
World Shape: (26, 24), 44 robots, 48 item zones, 24 stations
Building true heuristic
Built true heuristic grid in 197.99 ms
Dict with 116 keys,
Size: 296,032 bytes, one entry: 2496 dtype = int32
NxM = 26x24 = 624
Expected size ~NxMxZ = 26x24x116 = 72,384 values
For int32 = 4 bytes, so 4 bytes * 72,384 = 289,536 bytes
Measured 296032 vs 289536, diff = 102.24 %
---
Testing true heuristic grid on 1000000 queries
Did 1000000 queries into true heuristic grid in 278.76 ms
With np.int8
World Shape: (26, 24), 44 robots, 48 item zones, 24 stations
Building true heuristic
Built true heuristic grid in 569.73 ms
Dict with 116 keys,
Size: 78,880 bytes, one entry: 624 dtype = int8
NxM = 26x24 = 624
Expected size ~NxMxZ = 26x24x116 = 72,384 values
For int8 = 1 bytes, so 1 bytes * 72,384 = 72,384 bytes
Measured 78880 vs 72384, diff = 108.97 %
---
Testing true heuristic grid on 1000000 queries
Did 1000000 queries into true heuristic grid in 471.28 ms
Also realizing even with uint8 a 255 max distance path in a grid limits our grid size to a much smaller number than is allowed, so we can't use int8 anyway.
Also noticing google cloud instance cOS default python int is int64 (8 bytes), whereas on dev machine windows its int32 (4 bytes)
[REVERTED] Premise was wrong: dist grids are integers increasing to max dist
worst case example 200x200 grid max val is 40,000, so need int32 at least (int16 is -32 to 32k).
This does lead to #98 however
Default heuristic grid size is int32 for each entry. Since we're using values of in
range -1-4we don't need a 32-bit int, a signed short should be more than enough.