sktime / sktime

A unified framework for machine learning with time series
https://www.sktime.net
BSD 3-Clause "New" or "Revised" License
7.75k stars 1.32k forks source link

[ENH] variational mode decomposition transformer #5137

Closed fkiraly closed 8 months ago

fkiraly commented 1 year ago

Variational mode decomposition is a popular transformer in signal processing and neural information processing.

@danelyttinen has made a PR that adds it here https://github.com/sktime/sktime/pull/5129, by interfacing the vmdpy package.

This issue is to discuss how we best add VMD as a method.

One option is interfacing vmdpy, though the last release was 2020, and it is unclear whether it is still maintained.

The package seems abandoned by its original developer @vrcarva (is it? update appreciated), although there's an active issue tracker which highlights interest in this functionality.

I'm pinging anyone who's recently posted there and see how we should best proceed - obviously, @vrcarva's opinion would be most appreciated.

Ping @oswinguai, @sat82sahu, @cai-chengyi, @0wenwu, @soso-maitha, @jiafuei, @evelign, @jeffersonchi, @chooron, @deace, @zhangzhaoweii, @jhogg11, @zxtplayer, @sinvaldomoreno, @shivaniarbat

How to proceed?

Given that there is the package, no longer maintained (?), and multiple forks with bugfixes and improvements, how should we proceed?

An option would be to start moving things to sktime and maintain it here with those who have made contributions - there seems to be a critical mass of people? If there's enough interest and commitment from the community to help maintaining and possibly extending it.

We have an ownership model, so the algorithm would be owned by its developers. Under the current model, the first contributor owns it - that would be @danelyttinen for PR #5129 - but credit is always given, and it would be no problem to extend this to a larger group of people, possibly including @vrcarva.

Another option would be to leave it in vmdpy and develop it there, although given the last years history that might be unlikely?

any alternatives to vmdpy?

Besides vmdpy and its forks, I couldn't find alternative optionn, but that might also be worth considering.

Further, there's always the possibility of de-novo implementation, which might also be used to try address speed and memory issues from scratch.

Some notes

I note some of you have also written your own packages or forks of vmdpy, I see at least three or four - although @vrcarva has reacted, there have been no updates to the original package with the upgrades and fixes.

There also seems to be a general memory leak and speed issue?

It also seems that @danelyttinen's contribution implements @jhogg1's request from https://github.com/vrcarva/vmdpy/issues/4, as it first decomposes and then forecasts, avoiding time leakage.

vrcarva commented 1 year ago

Hey @fkiraly, thanks for getting in touch.

Unfortunately, I haven't been able to maintain vmdpy and this is unlikely to change in the near future. Moving it to sktime sounds like a good idea. @DaneLyttinen's has made nice contributions and it would be great if he could help with this.

Regarding the other issues, it might be worth it to check if any of these are dealt with Matlab's VMD. There's also an R package, but it seems it's also based on the original toolbox by Zosso.

fkiraly commented 1 year ago

Ok, so what I will do then, @vrcarva, with your permission and endorsement:

Deace commented 1 year ago

I noticed that @jiafuei has released an alternative implementation of VMD that emphasizes lower memory consumption and addresses issues with handling odd-length signals. I am wondering if this might be helpful for you. links:https://github.com/jiafuei/vmd-rs

DaneLyttinen commented 1 year ago

Interesting, I do like Rust as well, personally, for its speed and safe operations. But since this is a Python-based framework, I am not sure this would be the way Sktime would like to proceed, however, I do see there is a Python version but still would require the Rustc compiler, from my understanding.

link: https://github.com/jiafuei/vmdrs-py

@fkiraly could you advise your opinion on these alternatives?

fkiraly commented 1 year ago

@fkiraly could you advise your opinion on these alternatives?

Hmmm, that's indeed an interesting question.

What comes closest is the MrSQM classifier that is maintained in an external dependency, due to the C sources in it, to avoid them in sktime proper (that would make build etc much more involved). The logic lives in its own package, mrsqm, and sktime maintains, together with the authors of that package, the interface in sktime, which acts as a mini-package interfacing that package.

See the discussion, with links to further relevant threads, here: https://github.com/sktime/sktime/issues/4338

For vmdrs-py, I see two main options:

  1. vmdrs-py maintains an sktime compatible interface. That would require following the steps in the third party extender guide: https://www.sktime.net/en/latest/developer_guide/add_estimators.html The short version is (a) implement class in vmdrs-py (b) import check_estimator in CI and ensure it runs on the class. This would of course require a CI element where sktime is imported for the checks, but no dependency on sktime of the package.

  2. we could also create an interface to vmdrs-py in sktime and make it searchable. There's no harm in having multiple implementations of the same algorithm, especially if it's a popular algorithm. If we do this, we need to think about how we test rust-based estimators. I assume we would need to set up another CI element for rust-based estimators if we do this?

vrcarva commented 1 year ago

Ok, so what I will do then, @vrcarva, with your permission and endorsement:

  • I will pull vmdpy in its current state into sktime, into this PR [ENH] VMD (variational mode decomposition) transformer based on vmdpy #5129 of @DaneLyttinen, as a separate file in its original form.
  • I will make both @DaneLyttinen and @vrcarva owners of that code and will give credits to you, @vrcarva, and vmdpy, where appropriate. Further maintainers can be added if you agree, that happens by PR.
  • the bugfixes and improvements by the other contributors can then be applied to the vmdpy file after merge, by PR to sktime, to be reviewed by @DaneLyttinen and @vrcarva

This sounds good to me

fkiraly commented 1 year ago

(automatically closed by merging VMD in #5129 - but reopening to continue discussion until completion)

fkiraly commented 8 months ago

I suppose discussion has finished now