Open breznak opened 9 years ago

Hey all, it would be awesome if we could speed up the coordinate encoder something like 60x+ (from 180+s down to ~3s).

Use case: I have real-world data with ~600k+ data points, but just processing them with the coordinate encoder takes 180+s. Here's the profile:

[profiler output not preserved]

Limitations: the task has some preconditions, which hopefully could be exploited! The data is 2D, and the ranges on the axes are small (1..127).

What do you think, should I try, or start from somewhere else completely?

@chetan51 @oxtopus guys, do you have some tricks up your sleeves? :wink:
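For context, a profile like the one referenced above could be produced along these lines. This is only a sketch, assuming the standard-library `cProfile` and nupic's `CoordinateEncoder`; the parameter values and point counts are made up for illustration:

```python
import cProfile

import numpy as np

from nupic.encoders.coordinate import CoordinateEncoder

# Illustrative encoder parameters, not the ones from this issue.
encoder = CoordinateEncoder(w=21, n=1024)

def encodeAll(points, radius=1):
  # One encode() call per data point; with ~600k points this dominates runtime.
  for x, y in points:
    encoder.encode((np.array([x, y]), radius))

points = np.random.randint(1, 128, size=(10000, 2))
cProfile.run("encodeAll(points)", sort="cumulative")
```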
Do all these methods reflect a single call stack? Or are they separate branches resulting from a single call to compute?
@breznak What radius are you using? The larger the radius, the longer the encoder takes. Try to keep the radius small by reducing the minimum resolution of your data.
> Do all these methods reflect a single call stack? Or are they separate branches resulting from a single call to compute?
@cogmission from a single "task", that is, encode() gets called ~390,000x.
> What radius are you using? The larger the radius, the longer the encoder takes. Try to keep the radius small by reducing the minimum resolution of your data.
@chetan51 the data is 2 natural numbers in the 1..128 interval, and I want to distinguish every 2 numbers. Setting range=0 (does that value make sense?) already reduces the time to 54s!
Thank you both for the help!
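As an aside, here's a minimal sketch of how the radius enters the picture with nupic's CoordinateEncoder (the w, n, and sample values are assumptions for illustration): the encoder considers every integer neighbor within the radius, so in 2D the per-call work grows roughly as (2*radius + 1)^2.

```python
import numpy as np

from nupic.encoders.coordinate import CoordinateEncoder

encoder = CoordinateEncoder(w=21, n=1024)

# encode() takes a (coordinate, radius) tuple. Every integer neighbor
# within `radius` gets hashed and ranked, so keeping the radius small
# keeps each call cheap.
sdr = encoder.encode((np.array([42, 87]), 2))
print(sdr.sum())  # number of active bits
```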
@breznak When you say range, do you mean radius?
@chetan51 sorry for resurrecting this zombie thread, I'd missed that reply.
> When you say range, do you mean radius?

Yes.
Still, I can squeeze out about 50% of the time by pre-hashing the values instead of computing them on the fly. Do you think we should speed up _hashCoordinate() by using an internal array _hash[x][y] that is randomly initialized during init? It makes initialization take a while for larger arrays, but then _hashCoordinate() is constant time.
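A rough sketch of the pre-hashed table idea for a fixed-size 2D grid (the `_hash[x][y]` name comes from the comment above; the class and everything else are hypothetical, not nupic code):

```python
import numpy as np

class PrehashedCoordinateHasher(object):
  """Hypothetical O(1) replacement for per-call coordinate hashing,
  usable only when the coordinate space is bounded (e.g. 128x128)."""

  def __init__(self, dimX, dimY, seed=42):
    rng = np.random.RandomState(seed)
    # One random 63-bit value per grid cell, generated once up front;
    # this is the part that makes initialization take a while.
    self._hash = rng.randint(0, 2 ** 63, size=(dimX, dimY), dtype=np.int64)

  def hashCoordinate(self, coordinate):
    # Constant-time table lookup instead of a digest per call.
    x, y = coordinate
    return self._hash[x, y]
```

If I recall the nupic implementation correctly, _hashCoordinate() computes an MD5 digest per call and the result seeds the RNG that picks each coordinate's bit and order, so a table like this would replace only the digest step.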
> Do you think we should speed up _hashCoordinate() by using an internal array _hash[x][y] that is randomly initialized during init?
But how many values would you initialize? The Coordinate Encoder currently works for infinite space.
> But how many values would you initialize? The Coordinate Encoder currently works for infinite space.
Ouch, my mistake, in my scenario I use fixed-size coordinate dimensions. Maybe it's worth adding an optional param dimensionsSizes[]? Then the speedup could be applied.
A cleaner optimization might be to just cache computed coordinates, so we would still get speedups for repeating sequences.
> A cleaner optimization might be to just cache computed coordinates, so we would still get speedups for repeating sequences.
Great idea! That way even infinite-space coords would benefit, and since we'd expect some repetitive patterns in sequences, the cache should hit fairly often. And for the fixed-size case we'd set cacheSize=dim(x)*dim(y).
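A sketch of the bounded cache this could use (`cacheSize` follows the naming above; `EncodeCache` is a hypothetical name): a plain LRU built on OrderedDict, so repeated inputs hit and the oldest entries get evicted.

```python
from collections import OrderedDict

class EncodeCache(object):
  """Hypothetical bounded LRU cache mapping encoder inputs to output SDRs."""

  def __init__(self, cacheSize):
    self._cacheSize = cacheSize
    self._cache = OrderedDict()

  def get(self, key):
    try:
      value = self._cache.pop(key)
    except KeyError:
      return None
    self._cache[key] = value  # re-insert so the key becomes most recent
    return value

  def put(self, key, value):
    if key in self._cache:
      self._cache.pop(key)
    elif len(self._cache) >= self._cacheSize:
      self._cache.popitem(last=False)  # drop the least recently used entry
    self._cache[key] = value
```

Keys have to be hashable, so a numpy coordinate would be converted to a tuple before lookup.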
@chetan51 I've implemented the caching (it caches (input, output bitvector) pairs) with very good results! I'm wondering if this could be generalized and all this logic moved to the base encoder, so we could get the gains across all encoders? Anyway, this issue is already addressed by the PR above.
@breznak Great! If you can make it a generalized decorator, then we can reuse it across encoders.
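A sketch of what that generalized decorator might look like, reusing the EncodeCache sketched above (`CachingEncoder` and `_makeKey` are hypothetical names, not nupic API):

```python
class CachingEncoder(object):
  """Hypothetical wrapper adding memoization to any encoder: it delegates
  to the wrapped encoder but remembers recent (input -> SDR) pairs."""

  def __init__(self, encoder, cacheSize=10000):
    self._encoder = encoder
    self._cache = EncodeCache(cacheSize)

  def encode(self, inputData):
    key = self._makeKey(inputData)
    cached = self._cache.get(key)
    if cached is not None:
      return cached  # callers are assumed not to mutate the returned SDR
    output = self._encoder.encode(inputData)
    self._cache.put(key, output.copy())
    return output

  @staticmethod
  def _makeKey(inputData):
    # Inputs must be hashable; the CoordinateEncoder takes a
    # (numpy coordinate, radius) tuple, so convert the array first.
    coordinate, radius = inputData
    return (tuple(coordinate), radius)

  def __getattr__(self, name):
    # Forward everything else (getWidth, getDescription, ...) to the
    # wrapped encoder.
    return getattr(self._encoder, name)
```

Usage would then be e.g. `encoder = CachingEncoder(CoordinateEncoder(w=21, n=1024))`.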