Closed: tommedema closed this issue 2 years ago
Adding a dedicated method would be easy, yes. For now, you can use the faster C version by calling:
sa = subsequence_search(q, trainingSeries, dists_options={'use_c': True})
The master branch now also has a kbest_matches_fast function (its functionality is identical to passing dists_options). It will be part of the next release.
@wannesm that dists_options flag would be perfect (I had actually tried that), but I get this TypeError when I do so (running dtaidistance v2.3.9):
sa = subsequence_search(q, trainingSeries, dists_options={'use_c': True})
TypeError: subsequence_search() got an unexpected keyword argument 'dists_options'
TypeError Traceback (most recent call last)
/var/folders/lm/xhqw06kd341ck445l0rpz_cr0000gn/T/ipykernel_73353/797742654.py in <module>
33 query = queries[0:trainingQueryWindow]['rl0'].groupby([sep['ticker'], sep['dl0']]).head(trainingQueryWindow)[::-1].to_numpy(dtype = float)
34
---> 35 get_ipython().run_line_magic('lprun', '-f getQueryParameters getQueryParameters(query)')
36 # getQueryParameters(query)
~/opt/anaconda3/lib/python3.9/site-packages/IPython/core/interactiveshell.py in run_line_magic(self, magic_name, line, _stack_depth)
2349 kwargs['local_ns'] = self.get_local_scope(stack_depth)
2350 with self.builtin_trap:
-> 2351 result = fn(*args, **kwargs)
2352 return result
2353
~/opt/anaconda3/lib/python3.9/site-packages/decorator.py in fun(*args, **kw)
230 if not kwsyntax:
231 args, kw = fix(args, kw, sig)
--> 232 return caller(func, *(extras + args), **kw)
233 fun.__name__ = func.__name__
234 fun.__doc__ = func.__doc__
~/opt/anaconda3/lib/python3.9/site-packages/IPython/core/magic.py in <lambda>(f, *a, **k)
185 # but it's overkill for just that one bit of state.
186 def magic_deco(arg):
--> 187 call = lambda f, *a, **k: f(*a, **k)
188
189 if callable(arg):
~/opt/anaconda3/lib/python3.9/site-packages/line_profiler/ipython_extension.py in lprun(self, parameter_s)
102 try:
103 try:
--> 104 profile.runctx(arg_str, global_ns, local_ns)
105 message = ""
106 except SystemExit:
~/opt/anaconda3/lib/python3.9/site-packages/line_profiler/line_profiler.py in runctx(self, cmd, globals, locals)
140 self.enable_by_count()
141 try:
--> 142 exec(cmd, globals, locals)
143 finally:
144 self.disable_by_count()
<string> in <module>
/var/folders/lm/xhqw06kd341ck445l0rpz_cr0000gn/T/ipykernel_73353/797742654.py in getQueryParameters(q)
1 def getQueryParameters(q):
----> 2 sa = subsequence_search(q, trainingSeries, dists_options={'use_c': True})
3
4 best = sa.kbest_matches(k = trainingMaxMatchCount)
5
TypeError: subsequence_search() got an unexpected keyword argument 'dists_options'
I really appreciate your update to the master branch.
It seems the version I installed through pip (2.3.9) does not include the dists_options argument, unlike what is currently on the master branch.
Fixed by reinstalling from git: pip install -vvv --upgrade --force-reinstall --no-deps --no-build-isolation --no-binary dtaidistance git+https://github.com/wannesm/dtaidistance.git#egg=dtaidistance
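If the same notebook has to run against both the PyPI release and a build from master, one option is to inspect the function signature before passing the keyword, which avoids the TypeError above. This is only a minimal sketch using the standard library: call_with_supported_kwargs and the two stand-in search functions are hypothetical names for illustration and are not part of dtaidistance.

```python
import inspect


def call_with_supported_kwargs(func, *args, **kwargs):
    """Call func, passing only the keyword arguments its signature accepts.

    Returns (result, dropped), where dropped lists the keyword names that
    were left out because func does not declare them.
    """
    params = inspect.signature(func).parameters
    takes_var_kw = any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )
    by_keyword = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if takes_var_kw:
        # func has **kwargs, so every keyword is accepted.
        supported = dict(kwargs)
    else:
        supported = {
            name: value
            for name, value in kwargs.items()
            if name in params and params[name].kind in by_keyword
        }
    dropped = sorted(set(kwargs) - set(supported))
    return func(*args, **supported), dropped


# Hypothetical stand-ins that only mimic the two signatures discussed above.
def search_old(query, series):
    return "python"


def search_new(query, series, dists_options=None):
    return "c" if (dists_options or {}).get("use_c") else "python"


result, dropped = call_with_supported_kwargs(
    search_old, [0.0, 1.0], [[0.0, 1.0]], dists_options={"use_c": True}
)
if dropped:
    print("Installed version ignores:", dropped)
```

Silently dropping an option can hide the fact that the slow path is being used, so reporting the dropped names (as above) is probably safer than ignoring them.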
I recently switched from a for loop with dtw.distance_fast to using subsequence_search.
Before:
After:
While this improved speed quite a bit, that was mostly because I added the limit of 100 (where before it was storing the distances of all entries). I looked at the source code for subsequence_search and noticed that it is not using the C version of dtw.distance. Is it possible to have it run the C version, i.e. is there a subsequence_search_fast? Currently, using lprun, I can see that 99% of the time is spent on the sa.kbest_matches invocation.
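For reference, the difference between storing every distance and keeping only the best k can be sketched with the standard library alone. The dtw_distance below is a plain-Python dynamic-programming stand-in (no window, no pruning), not dtaidistance's implementation, and kbest_matches here is a hypothetical helper that merely borrows the method's name. It only illustrates that the per-pair distance call is where the time goes, which is why running that inner loop in C matters.

```python
import heapq
import math


def dtw_distance(a, b):
    """Plain DTW with squared point costs and a final square root."""
    inf = math.inf
    # prev[j] holds the best cumulative cost of aligning the processed
    # prefix of a with the first j points of b.
    prev = [0.0] + [inf] * len(b)
    for x in a:
        curr = [inf] * (len(b) + 1)
        for j, y in enumerate(b, start=1):
            cost = (x - y) ** 2
            curr[j] = cost + min(prev[j], prev[j - 1], curr[j - 1])
        prev = curr
    return math.sqrt(prev[-1])


def kbest_matches(query, series, k):
    """Return the k (distance, index) pairs with the smallest distance.

    The generator feeds distances one at a time, so heapq.nsmallest keeps
    at most k candidates instead of a full list of all distances.
    """
    scored = ((dtw_distance(query, s), i) for i, s in enumerate(series))
    return heapq.nsmallest(k, scored)


# Made-up toy data, purely for illustration.
query = [0.0, 1.0, 2.0]
series = [[0.0, 1.0, 2.0], [5.0, 5.0, 5.0], [0.0, 0.0, 1.0, 2.0], [9.0, 9.0]]
for dist, idx in kbest_matches(query, series, k=2):
    print(idx, dist)
```

Bounding the result to k entries saves memory and sorting work, but every candidate still needs one full distance computation, so the speed of dtw_distance itself remains the limiting factor.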