Closed by turmeric-blend 3 years ago
Hi @turmeric-blend, good question. Yes, the multivariate implementation is quite a lot slower than the univariate implementation. (This is partly why, for now at least, they are separate. I have also separated MiniRocket and MiniRocketMultivariate in sktime.)
I am aware of the problem. Unfortunately, I haven't had the chance yet to work out whether there is a straightforward way to 'fix' it. There may be a way to avoid whatever is causing the problem by rearranging the operations a bit, or it might be that there is some issue with how numba handles the additional dimension.
Basically, if you have univariate data, just use the univariate implementation. If you have multivariate data, unfortunately at the moment the speed issue is unavoidable.
There is a GPU implementation coming (this is not my work), which will most likely be a lot faster for most multivariate datasets, at least until I can work out how to fix this issue with the CPU implementation.
ok, thanks for letting me know.
My setup is that I am using a large dataset (10,000+) and I pass the data into the model in batches. I do not cache the data, so I run `transform` every time I pass data into the model, on every epoch. I run this same setup for both `minirocket.py` with input shape `(32768, 99)` and `minirocket_multivariate.py` with input shape `(32768, 1, 99)`, so the number of channels is 1. I find that the `minirocket_multivariate.py` version runs significantly slower on every `transform()` relative to `minirocket.py`. Is there a potential bug in the code?
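For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not taken from the MiniRocket code. The helper `time_transform` and the two `fake_*` functions are hypothetical stand-ins so that the example runs on its own. Since the implementations use numba, the first call may include JIT compilation, so the sketch makes an untimed warm-up call before measuring.

```python
import time
from statistics import median


def time_transform(transform_fn, *args, repeats=5, warmup=1):
    """Return the median wall-clock seconds of transform_fn(*args).

    Warm-up calls are not timed, so one-off costs (for example JIT
    compilation on the first call) do not distort the measurement.
    """
    for _ in range(warmup):
        transform_fn(*args)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        transform_fn(*args)
        timings.append(time.perf_counter() - start)
    return median(timings)


# Hypothetical stand-ins (NOT the real MiniRocket transforms); replace them
# with the transform functions from minirocket.py and
# minirocket_multivariate.py to measure the actual gap.
def fake_univariate(X):
    # X has shape (num_examples, length)
    return [sum(row) for row in X]


def fake_multivariate(X):
    # X has shape (num_examples, num_channels, length)
    return [sum(sum(channel) for channel in example) for example in X]


# Small toy data: the same series, once without and once with a channel axis.
X_uni = [[float(i + j) for j in range(99)] for i in range(256)]  # (256, 99)
X_multi = [[row] for row in X_uni]                               # (256, 1, 99)

t_uni = time_transform(fake_univariate, X_uni)
t_multi = time_transform(fake_multivariate, X_multi)
print(f"univariate: {t_uni:.6f}s  multivariate: {t_multi:.6f}s")
```

To time the real code, pass each module's transform function and its arguments to `time_transform` in place of the stand-ins. The exact call signature is an assumption here, so check it against the repository before running.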