Open mloning opened 3 years ago
Hi @mloning,
Sounds interesting and exciting. We started using sktime for some research tasks and teaching assignments and are satisfied users. So we are willing to provide updates and features to make this process easier. Our current focus is on our own projects (i.e. fast C-versions), and while we try to make it reusable, some functionality might be missing because we didn't yet need it (e.g. we don't have a `predict` method for clustering although that's easy to implement).
W.r.t. your questions:
- `dtaidistance` as a dependency: I don't think there is anything special. We have two important optional dependencies, Cython and Numpy, which you already require (optionally). For clustering we rely on Scipy and PyClustering for some methods (optional and dynamically checked). We also selected a permissive license to make it easy to collaborate.
- [...] `sklearn` or one of our own ML algorithms in the same pipeline. But it typically communicates only through simple data structures. For example, we use DTW to compute distances to prototypes and use the results as features. Or we feed it to `scipy`, `pyclustering` or our own clustering algorithms (e.g. for fleet-based anomaly detection). Currently we use agglomerative clustering (using our own implementation or scipy), medoid clustering (using pyclustering) and, since this month, DBA-k-means (own implementation). We do indeed follow the `sklearn` interface since everybody is already familiar with this API (also inspired by the now moved out sklearn HMM module). Did you have anything else in mind than clustering for classification?

Thanks @wannesm, sounds good! @chrisholder will start working on it over the next few weeks, we'll report back if we have more questions!
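To make the "distances to prototypes as features" idea above concrete, here is a minimal sketch. It uses a small pure-Python DTW as a stand-in so the example is self-contained; it is not dtaidistance's implementation or API, and the names `dtw_distance` and `prototype_features` are hypothetical. In practice the library's own (much faster) distance function would presumably take the place of `dtw_distance`.

```python
# Illustrative stand-in only: a textbook DTW, not dtaidistance's code or API.
from math import inf, sqrt


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) DTW with squared pointwise cost."""
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return sqrt(cost[n][m])


def prototype_features(series, prototypes):
    """Map each series to the vector of its DTW distances to every prototype."""
    return [[dtw_distance(s, p) for p in prototypes] for s in series]


# Two hypothetical prototypes (a peak and a valley) and two longer series.
prototypes = [[0.0, 1.0, 2.0, 1.0, 0.0], [2.0, 1.0, 0.0, 1.0, 2.0]]
series = [[0.0, 0.0, 1.0, 2.0, 1.0, 0.0], [2.0, 1.0, 0.0, 0.0, 1.0, 2.0]]
features = prototype_features(series, prototypes)
```

Each row of `features` has one column per prototype, so it can be handed to any estimator that expects a plain 2D feature table. Series and prototypes do not need equal lengths here, since DTW handles the alignment.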
Hi everyone,
Your package looks really good and we're thinking about interfacing your package in sktime to make use of the time series distance functions you provide, and potentially also the clustering algorithms (see https://github.com/alan-turing-institute/sktime/issues/501).
- [...] `dtaidistance` as a dependency?
- [...] `sklearn` (I see that you largely follow their interface)?
- [...] `fit` and `predict`, 2d numpy array assuming multiple instances of equal-length univariate data?

cc @chrisholder
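As a rough illustration of the `fit`/`predict` shape discussed in this thread, the sketch below assumes the input layout from the question: one row per instance, all rows equal-length univariate series. The class `NearestMedoidAssigner` and its parameters are hypothetical and not part of dtaidistance, sktime or scikit-learn. It shows one way a clustering `predict` could work: take cluster labels from any clustering, keep one medoid per cluster, and assign new series to the nearest medoid. Plain Euclidean distance is used as a placeholder; a DTW function could be passed as `distance` instead.

```python
def euclidean(a, b):
    """Placeholder distance for equal-length series."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


class NearestMedoidAssigner:
    """Hypothetical sklearn-style wrapper around precomputed cluster labels."""

    def __init__(self, distance=euclidean):
        self.distance = distance

    def fit(self, X, labels):
        # X: n_instances rows, each an equal-length univariate series.
        clusters = {}
        for row, label in zip(X, labels):
            clusters.setdefault(label, []).append(row)
        # Medoid = member with the smallest total distance to its cluster.
        self.medoids_ = {
            label: min(rows, key=lambda c: sum(self.distance(c, r) for r in rows))
            for label, rows in clusters.items()
        }
        return self

    def predict(self, X):
        # Assign each new series the label of its nearest medoid.
        return [
            min(self.medoids_, key=lambda lab: self.distance(row, self.medoids_[lab]))
            for row in X
        ]


X_train = [[0, 0, 1], [0, 1, 1], [0, 0, 0], [5, 5, 6], [5, 6, 6], [6, 6, 6]]
labels = [0, 0, 0, 1, 1, 1]
model = NearestMedoidAssigner().fit(X_train, labels)
predictions = model.predict([[0, 1, 0], [6, 5, 6]])
```

Keeping the distance as a plain callable means the wrapper only exchanges simple data structures with the distance library, which matches how the reply describes combining the two.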