`sktime` time series anomalies/changepoint interface upgrade, moving `skchange` to 2nd or 1st party?

fkiraly commented 5 months ago

Great package!

I was pointed to this by one of your collaborators, and wanted to let you know that we are currently reworking the anomalies/changepoints interface - the API had some inconsistencies for a while and we are planning to move it to a more mature state over the next minor release cycles.

It would be great to pool ideas and perhaps work on this together!

@alex-jg3 (sktime core developer) is currently driving the rework.

Relevant issues:

base class rework long thread: https://github.com/sktime/sktime/issues/3214
wishlist for annotation/changepoint algorithms: https://github.com/sktime/sktime/issues/6481 - we are populating this at the moment
handling post-0.30.0 interface changes: https://www.sktime.net/en/stable/changelog.html#core-interface-changes

Input and feedback on the interface designs are much appreciated! Criticism especially, as you are a "consumer" of the interface.

Given the current state of skchange, we could also help:

transition skchange from 3rd party to a 2nd party library, with synced API checks, CI, and indexing of the algorithms through the sktime index, while retaining it as a separate library.
transition to a 1st party, i.e., moving estimators to sktime proper with ownership and maintenance assigned to authors.

What do you think? FYI @alex-jg3

Tveten commented 5 months ago

Hey @fkiraly!

I have actually been meaning to contact you about the points you raise, so it's great that you found your way here on your own through my collaborator.

First, I'd be happy to help on the rework of the annotator/anomalies/changepoints/segmentation interface. What is the best way for me to contribute? Is it mainly by following and contributing to the issues you mention? And get started by following the instructions here https://www.sktime.net/en/latest/get_involved/contributing.html?

I've also been thinking about whether skchange is best suited as a separate library or as a 1st party library down the line (in case you were interested). I currently enjoy the freedom that comes with creating a separate package, especially since it's still at an experimental stage, but I see the advantages of full integreation, definitely. Let me think about it.

Meanwhile, could you elaborate what making skchange a 2nd party library means? What needs to be done in sktime and skchange to make that happen? I know you have adapted several separate packages, e.g. tsfresh, to the sktime framework. Is that an example of a 2nd party integration?

fkiraly commented 5 months ago

Thanks for the reply, @Tveten!

First, I'd be happy to help on the rework of the annotator/anomalies/changepoints/segmentation interface. What is the best way for me to contribute?

Normally, a looser contribution pattern such as via issues etc would be what I would recommend, but the annotation module is "special" in the sense that it has been in an experimental stage for a while and we are moving it to a maturing stage, consolidating interfaces, so it is earlier stage than other modules.

Given this and your dependencies, I think it might be best if we have regular touch points, with @alex-jg3 and possibly users of skchange.

Concretely, I suggest:

join the sktime discord, these are our primary channels for async developer communication: https://discord.com/invite/54ACzaFsn7. There is a dedicated channel in "workstreams", but "dev-chat" is also frequently used.
consider participating in synchronous meetings, I would suggest at least one to coordinate, and try to set upa tighter asynchronous pattern using discord.

For synchronous, we have sync meetings on Mondays 12 UTC (workstream tech meetings) and Fridays 13 UTC (meetups, content varies, presentations or tech planning). Serendipitously, there is also the upcoming roadmap planning meeting 2024-2025 on June 21, i.e., very soon! It might make sense for you to join and suggest priority roadmap items based on your use cases and user base.

On the discord, we can send quick messages to schedule if any other timings are needed.

fkiraly commented 5 months ago

I currently enjoy the freedom that comes with creating a separate package, especially since it's still at an experimental stage, but I see the advantages of full integreation, definitely. Let me think about it.

I would say it's up to you, if you feel you have the capacity to maintain. As you say, the trade-off is between maintenance burden and agility. In any model, we are happy to help. As said, the annotation module is less consolidated than others, so it would be important imo to coordinate, especially when it concerns framework architecture and roadmap.

Meanwhile, could you elaborate what making skchange a 2nd party library means? What needs to be done in sktime and skchange to make that happen? I know you have adapted several separate packages, e.g. tsfresh, to the sktime framework. Is that an example of a 2nd party integration?

Let me be a bit more precise on what I mean with the different models.

The below applies for packages usable via sktime API by other users, i.e., packages or algorithms that are meant to have their own users as opposed to being consumers/users themselves of sktime and algorithms therein.

1st party: contribution is directly to sktime and packages maintained by the sktime team. Governance and maintenance framework is the same, that of sktime. Code-wise, estimators are part of sktime repository, and subject to sktime CI.
2nd party: there is a separate package, e.g., skchange, under separate governance and maintenance, licenses and owners may differ. There is technical integration and active cooperation both ways, e.g., PR to each others' packages. Communication and development is integrated via interfaces, e.g., following dev announcements on discord. Code-wise, estimators live in a separate repository, and CI runs interface conformance tests. Recent examples are prophetverse https://github.com/felipeangelimvieira/prophetverse, which follows the sktime forecaster API, or tsbootstrap https://github.com/astrogilda/tsbootstrap, which follows the scikit-base API while introducing its own base class interface that sktime adapts to transformers.
- later it can be decided to transition to 1st party (bilaterally), or 3rd party (unilaterally), e.g., if maintenance capacities change
3rd party: there is a separate package but no active collaboration. The sktime interface of the 3rd party package is maintained unilaterally, by sktime, and the 3rd party package typically does not make efforts to be interface compatible with sktime unified APIs. Examples of this would be the tsfresh package your mention, or statsmodels, because neither projects have a commitment to compatible interfaces, share development processes, or commit resources to the integration.
- in-principle there is a variant of this where the 3rd party package provides sktime compatible interfaces without active collaboration. We know this happens a lot in closed code bases (where IP management is the obstacle to collaboration), but we have not seen this in open code bases. Code-wise this works by using the contract checking utilities.

fkiraly commented 5 months ago

There are also some typical transitions we've seen over years:

3rd -> 2nd. In a case where the sktime maintained interface is substantially more convenient or accessible (e.g., parameter choice, documentation), the 3rd party may choose to "adopt" the sktime interface and become its maintainer, while retaining a separate package. The typical case would be research grade software or academic software.
2nd -> 1st or 3rd -> 1st. The typical case is where maintenance for the software ceases, e.g., due to small maintainer team and career changes, or corporate projects getting cancelled. In this case a 2nd party maintainer may choose to move the package to sktime (repo or as small onboard lib), or sktime devs may choose to fork an abandoned package into sktime to ensure continued maintenance. An example of this is pykalman, you can read up on the saga here: https://github.com/sktime/sktime/issues/5414, this is a fork made by sktime; an example of collaborative transfer is vmdpy.
2nd -> 3rd. The typical case is a startup in the time series space that uses sktime as an initial platform to gain notoriety and a user base, then divests its development resources from maintaining the sktime interface in favour of its own package or software, which may be open or closed. Examples are interfaces to statsforecast and temporian, both run by companies who initially helped maintain an integration and no longer do.

Tveten commented 5 months ago

Thanks for the thorough reply!

I think a 2nd party solution sounds like a good solution for now, then we can see what happens down the line. So a few concrete steps towards this is:

skchange already runs interface conformance tests, I believe: https://github.com/NorskRegnesentral/skchange/blob/main/skchange/tests/test_all_detectors.py. So maybe not much to do on the skchange part of the code?
Make skchange adaptors in sktime.
I contribute to the ongoing annotator discussions on Github and Discord.
Conform skchange to the new annotator interface.

Something like that? I'm very open to additional suggestions!

fkiraly commented 5 months ago

@Tveten, makes absolute sense.

I would suggest as an additional point, what would be very useful is joining the discussion and/or active development on improvements to the 2nd party developer experience, together with @felipeangelimvieira (prophetverse).

This PR came out of that: https://github.com/sktime/sktime/pull/6588 so now you can use check_estimator, and pytest will run the individual tests separately.

I will open an issue with some of the current improvements around this, and also "indexing", i.e., discoverability via all_estimators and the estimator search GUI on the web: https://www.sktime.net/en/latest/estimator_overview.html

Tveten commented 5 months ago

@fkiraly Fantastic!

fkiraly commented 5 months ago

Here: https://github.com/sktime/sktime/issues/6639

fkiraly commented 3 months ago

FYI, we've now refactored the annotation module tests so they can be used in 3rd and 2nd party packages via check_estimator and parametrize_with_checks, in this PR: https://github.com/sktime/sktime/pull/6843

The change should be available in the next release this week.

What would be great if we could jointly progress API conformance, and also widen the interface so it allows things you want, such as multivariate, panel, etc.

Are you back from holiday?

Tveten commented 3 months ago

Great! I'm back as of today. I'll start working on the API conformance this week.

fkiraly commented 3 months ago

excellent!

The parametrize_with_checks utility should now also work correctly for time series annotators (outliers, changepoints), with the release 0.31.1 a few days ago it connects to the respective test class.

Meanwhile, I'll also be working on some of the suggestions you and others made in the annotation design module, e.g., multivariate, use in pipelines as transformations, etc.

Tveten commented 3 months ago

Hey @fkiraly!

I have now attempted conforming to the new annotator API. I honestly think it was pretty hard. It took me quite a long time to understand the design and what the new requirements for me as an extender are. Which methods do I have to implement and which are optional? How do all the new methods work together? What restrictions do they put on the output types of different detectors' predict method? What do I have to do to add support for a new task like collective anomaly detection?

After studying the new class more, I have the following comments:

I think the current BaseSeriesAnnotator is too massive and complex. Mainly because it mixes all types of tasks and their utility functions (_sparse_points_to_dense, _sparse_segments_to_dense, change_points_to_segments etc.) together in one big class. This makes it hard to understand what functionality depends on what and which format restrictions apply where. For readability, maintainability and ease of extension, I think the different tasks should be separated more clearly.
I think the predict_points() and predict_segments() methods are not necessary. At least at this point in the development process. I'd say other core functionality should be nailed first.

That being said, I like the overall design regards getting rid of the fmt and labels arguments, and using predict for sparse output and transform for dense output. I think the design is on the right track!

Due to the difficulties I had, I ended up coding up my own suggestion of a base class, which you can find here: https://github.com/NorskRegnesentral/skchange/blob/new_detector_base_class/skchange/base.py I called it BaseDetector, but think of it as version of BaseSeriesAnnotator. As long as the BaseSeriesAnnotator is in an experimental state, I think it is safer for skchange to have its own base class, but I want it to be as close to BaseSeriesAnnotator in spirit as possible. When BaseSeriesAnnotator has matured sufficiently, I could make it depend on BaseSeriesAnnotator again.

Here are some of the main differences between BaseDetector and BaseSeriesAnnotator:

BaseDetector is as lightweight as possible.

The main methods are:

    fitting                         - fit(self, X, Y=None)
    detecting, sparse format        - predict(self, X)
    detecting, dense format         - transform(self, X)
    detection scores, dense         - score_transform(self, X)  [optional]
    updating (temporal)             - update(self, X, Y=None)  [optional]

In addition there are two abstract converter methods:

    converting sparse output to dense - sparse_to_dense(y_sparse, index)
    converting dense output to sparse - dense_to_sparse(y_dense)  [optional]

where .transform(X) = sparse_to_dense(self.predict(X)). So for a minimal version of a detector to be implemented, the required methods to implement are _fit, _predict and sparse_to_dense.

Regarding the naming "score_transform" vs. "transform_scores": I think detector.score_transform is closer to the natural language "apply the detector's score transform to data X", while transform_scores means "_scores" in a subscript sense. This is a minor point of personal preference, and I really have no problem using transform_scores if that makes conformance with sktime easier.

Common detector types = subclasses of BaseDetector

For common detector types like anomaly detectors or changepoint detectors, subclasses of the form

class PointAnomalyDetector(BaseDetector)
    def sparse_to_dense(y_sparse, index)
    def dense_to_sparse(y_dense)

define the output formats of .predict and .transform. This fully separates point anomaly detectors from changepoint detectors and any other detector types. This improves readability and maintainability in my opinion. For full examples, see:

Change detection: https://github.com/NorskRegnesentral/skchange/blob/new_detector_base_class/skchange/change_detectors/base.py
Anomaly detection: https://github.com/NorskRegnesentral/skchange/blob/new_detector_base_class/skchange/anomaly_detectors/base.py

A new anomaly detector can then inherit from PointAnomalyDetector, and only need to implement _fit and _predict.

Adding support for a new generic detector -- collective anomalies in my case -- is also simply by following the same recipe; make a class CollectiveAnomalyDetector with sparse_to_dense and dense_to_sparse.

A new exotic detector can also be added by inheriting directly from BaseDetector and implementing _fit, _predict and sparse_to_dense (and other methods optionally).

Output formats

For clarity and maintainability, I have made the decision to decrease the number of options of output types. At least for now. On a high level:

The aim of predict is to give output in the sparsest possible form. This means that a changepoint detector outputs a series of changepoint locations, not intervals with both the start and end of segments, as the latter contains superfluous information.
The transform method on the other hand, I think can be more generous with the amount of information it outputs. Therefore I have decided to make the default transform output of changepoint detectors the segment labels, and not an indicator of the changepoint locations. I think the segment labels are much more useful than an indicator of their locations. I believe these kinds of prioritisations are helpful to not make the classes overly complex, at least until the minimal functionality has been identified and grown mature.
Changepoints are defined as the last element in a segment. This is by far most common in the statistical literature. I don't think it makes sense to allow for per-method flexibility on this definition. It will make the code unnecessarily complicated.
For the dense output of anomaly detectors's transform 0 is used to represent normal instances, while >0 entries are anomalies. Point anomalies only use 1s to represent anomalies, while collective anomalies are labelled by integers from 1, ..., K per anomalous interval.
For the remaining details, see the code :)

Finally

Sorry for the long read and that I just went ahead and made my separate implementation rather than conforming. I hope some of the ideas are useful for the BaseAnnotator design.

All feedback is very welcome!

fkiraly commented 2 months ago

@Tveten, I had a look at your proposal, interesting!

I very much like that it is leaner! This is an idea we have not explored - a hierarchy of base classes, from which primarily change point detectors and

though I see a key issue regarding stability:

you inherit from BaseTransformer, this feels inherently dangerous, as you are inheriting some of its private logic as well - how are you certain this will not break if private logic in BaseTransformer changes

Further, can I ask for an explanation: how do you account for the case where the user may want to use the same object to return a segmentation vs the changepoints?

Tveten commented 2 months ago

Fix in both sktime and skchange to let the dependency of BaseDetector be on BaseEstimator:

Tveten commented 2 months ago

Just to make sure we speak about the same thing:

changepoints = list-like of changepoint locations
segmentation = list-like of intervals with labels, where the labels can reoccur for different intervals and the intervals are non-overlapping and cover the input data's index.

With this definition, a list of changepoints is equivalent to a segmentation where the inclusive end-point of the intervals is the changepoint and each interval has a unique label. However, the changepoint representation is sparser when applicable.

Since all the methods in skchange are changepoint detectors and doesn't do any grouping of the resulting segments, I have chosen to drop the interval-based segmentation representation for now. Just to keep things as simple as possible. However, as kind of a middle ground, the transform method of changepoint detectors return segment labels. The transform of changepoint detectors could also have returned a dense 0-1-indicator of changepoint locations, but I think the segment labels are much more useful.

Currently my design thinking is that .predict should be "as sparse a representaton as possible", while .transform should be as "dense as possible" in the sense of containing a lot of information. This is also why I have chosen to give each collective anomaly separate labels, rather than labelling them all as "1" like point anomaly detectors. For example.

If you really want the interval-based segmentation at some point, and maybe it's more relevant for the annotators in sktime, I think the best solution would be to make a BaseSegmentor or something that defines the output format. Then each concrete detector needs to decide whether it fits best as ChangepointDetector or BaseSegmentor, where I would always go with the sparsest choice. I guess you could also make an adaptor that converts any ChangepointDetector into a Segmentor.

fkiraly commented 2 months ago

Currently my design thinking is that .predict should be "as sparse a representaton as possible", while .transform should be as "dense as possible" in the sense of containing a lot of information.

Agreed with the first, for the second I'd specifically say it should be the "most reasonable expectation" for a return when the condition is for that return to be a univariate time series and have the same index as the input.

Just to make sure we speak about the same thing: changepoints = list-like of changepoint locations segmentation = list-like of intervals with labels, where the labels can reoccur for different intervals and the intervals are non-overlapping and cover the input data's index.

Yes, segmentations can also be label-less or overlapping though. If you get a segmentation from consecutive points, they will always cover the range from first to past point exhaustively and non-overlapping, but in my conceptual model that is not a necessary property for general segmentations.

If you really want the interval-based segmentation at some point, and maybe it's more relevant for the annotators in sktime, I think the best solution would be to make a BaseSegmentor or something that defines the output format. Then each concrete detector needs to decide whether it fits best as ChangepointDetector or BaseSegmentor, where I would always go with the sparsest choice. I guess you could also make an adaptor that converts any ChangepointDetector into a Segmentor.

Yes, of course, this is sth I've been thinking about, but I recoil from this solution for two reasons:

avoiding proliferation of base classes
some types of algorithms can do multiple of these natively - as you say, a change point detector is naturally a segmenter, and many scientific papers seem to use these interchangeably, or conceptualize change point detectors as segmenters primarily even.

The situation also reminds me where we had different API designs for univariate, multivariate forecasters; or different base classes even for single-time-series transformations and panel (collection-of-) time series transformations, that did not age well as it led to many adapters that turned out unnecessary, and case distinctions in composites.

Tveten commented 2 months ago

I see all your points.

Regarding the point-based ChangepointDetector vs. interval-based Segmentor, I guess the choice/trade-off is:

More base classes to separate different tasks/concerns entirely.
One base class, but mix more tasks/concerns together in one big base class.

From the skchange perspective, the number of different tasks isn't that big, so I don't see an issue with one base class per task. At least yet.

From the sktime perspective, where the annotator is slightly more general, I can see how it might be better to fit all the tasks into a single base class. The cost is to keep track of all the private "adaptor" methods like _sparse_points_to_dense, _sparse_segments_to_dense etc., and which restrictions they put on the output types in the end. From a developer/extender perspective, I found this quite involved to keep track of, but maybe I'm overcomplicating and simply like to keep matters isolated.

One base class split I would consider though, is to distinguish anomaly from change detection. It could turn out hard to maintain a design where the output format of point/collective anomaly detectors could influence the output format of changepoint detectors or segmentors and vice versa. It is just simpler for me to think about what the output format of change detectors or segmentors should be, without taking account of what the format for anomaly detectors should be simultaneously.

fkiraly commented 1 month ago

Here is another "compromise" option:

there could be a joint base class for all detectors, but multiple sub-base-classes with clear extension pattern. That is, there would be two or three extension templates depending on the subtasks, e.g., changepoint, anomaly, segment.

Tveten commented 1 month ago

Isn't that what I've implemented now? Without the explicit extension templates.

fkiraly commented 2 days ago

yes, indeed. And we have since then decided to move to joint base class, see https://github.com/sktime/sktime/issues/7323 - so this issue is closed very nicely!

NorskRegnesentral / skchange