JuliaDynamics / TimeseriesSurrogates.jl

A Julia package for generating timeseries surrogates
https://juliadynamics.github.io/TimeseriesSurrogates.jl/stable/

Surrogates for irregular time series #12

Open felixcremer opened 5 years ago

felixcremer commented 5 years ago

According to [1], iaaft and aaft do not work for irregular time series, because these methods are based on the Fourier transform. The authors propose an alternative based on the Lomb-Scargle periodogram. Are surrogates for irregular time series within the scope of this package? Are there algorithms for generating surrogates from irregularly sampled data?

[1] Andreas Schmitz and Thomas Schreiber, "Testing for nonlinearity in unevenly sampled time series", Phys. Rev. E 59, 4044. https://journals.aps.org/pre/pdf/10.1103/PhysRevE.59.4044
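For readers unfamiliar with the Lomb-Scargle periodogram mentioned above, here is a minimal sketch of the classical (unnormalized) estimator in plain Python. It is only an illustration of the periodogram itself, not the surrogate algorithm from [1] and not code from this package; the function name `lomb_scargle` and the test signal are assumptions made for the example.

```python
import math
import random

def lomb_scargle(times, values, omegas):
    """Classical Lomb-Scargle power at each angular frequency in `omegas`."""
    mean = sum(values) / len(values)
    y = [v - mean for v in values]
    power = []
    for w in omegas:
        # The time offset tau makes the sine and cosine terms orthogonal
        # on the given (possibly irregular) sampling times.
        s2 = sum(math.sin(2 * w * t) for t in times)
        c2 = sum(math.cos(2 * w * t) for t in times)
        tau = math.atan2(s2, c2) / (2 * w)
        c = [math.cos(w * (t - tau)) for t in times]
        s = [math.sin(w * (t - tau)) for t in times]
        yc = sum(a * b for a, b in zip(y, c))
        ys = sum(a * b for a, b in zip(y, s))
        cc = sum(a * a for a in c)
        ss = sum(a * a for a in s)
        p = 0.0
        if cc > 0:
            p += yc * yc / cc
        if ss > 0:
            p += ys * ys / ss
        power.append(0.5 * p)
    return power

# Hypothetical example: a sine wave with frequency 0.1 observed at
# 200 irregular times drawn uniformly from [0, 100].
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(200))
values = [math.sin(2 * math.pi * 0.1 * t) for t in times]
freqs = [0.01 * k for k in range(1, 51)]
power = lomb_scargle(times, values, [2 * math.pi * f for f in freqs])
peak_freq = freqs[power.index(max(power))]
print(peak_freq)
```

No interpolation onto a regular grid is needed: the periodogram is evaluated directly on the irregular sampling times.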

kahaaga commented 5 years ago

Yes, that is correct. Random shuffle surrogates (which preserve only the amplitude distribution of the data) or random block shuffle surrogates (which preserve short-term correlations but break long-term correlations) would work on irregularly spaced data, but anything more complicated than that needs deeper consideration. If using aaft or iaaft, what one would do in practice is interpolate the data to a regular grid before applying them. However, as Schmitz and Schreiber state in the paper you linked, doing so introduces some bias.
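To make the two shuffle-based approaches concrete, here is a rough sketch in plain Python. It is not the package's implementation; `random_shuffle` and `block_shuffle` are hypothetical names, and the sketch assumes that only the values are permuted while the sampling times stay where they are.

```python
import random

def random_shuffle(values, rng):
    """Permute all values; only the amplitude distribution is preserved."""
    s = list(values)
    rng.shuffle(s)
    return s

def block_shuffle(values, n_blocks, rng):
    """Cut the series into contiguous blocks and permute the blocks.

    Ordering inside each block is kept, so short-term structure survives,
    while structure across block boundaries is destroyed.
    """
    values = list(values)
    n = len(values)
    bounds = [i * n // n_blocks for i in range(n_blocks + 1)]
    blocks = [values[bounds[i]:bounds[i + 1]] for i in range(n_blocks)]
    rng.shuffle(blocks)
    return [v for block in blocks for v in block]

rng = random.Random(42)
values = [float(i) for i in range(20)]
shuffled = random_shuffle(values, rng)
blocked = block_shuffle(values, 4, rng)
print(shuffled)
print(blocked)
```

Neither function looks at the time axis at all, which is why this kind of surrogate carries over to irregular sampling without a Fourier transform.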

Having an implementation of their method would be a great idea, and definitely within the scope of the package.

From a brief glance at the paper, it doesn't look like too much work to implement. But maybe you already have a rough implementation ready?

felixcremer commented 5 years ago

I don't have an implementation.

kahaaga commented 5 years ago

This certainly looks useful, though. Thanks for bringing the method to my attention! I'm busy with other things at the moment, but might get some time in a few weeks to implement it.

In the meantime, I'll keep the issue open in case someone else wants to chime in.

Datseris commented 4 years ago

I've started using TimeseriesSurrogates.jl and this method is useful for me as well.

felixcremer commented 4 years ago

I am wondering whether we could provide the Fourier-transform-based surrogate methods for irregular time series by using the non-equidistant Fourier transform from https://github.com/tknopp/NFFT.jl. This is a version of the Fourier transform that works on irregular time series. If you think this is a good idea, I could try to prepare a PR for that.
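As a rough illustration of the idea, the sketch below evaluates a non-uniform discrete Fourier transform directly, i.e. the plain sum over irregular sampling times that fast non-equispaced transforms are designed to speed up. It does not use or mirror the NFFT.jl API; the function name `ndft` and the test signal are assumptions made for this example.

```python
import cmath
import math
import random

def ndft(times, values, freqs):
    """Direct O(N*M) evaluation of sum_j x_j * exp(-2*pi*i*f*t_j)."""
    return [
        sum(v * cmath.exp(-2j * math.pi * f * t) for t, v in zip(times, values))
        for f in freqs
    ]

# Hypothetical example: a cosine with frequency 0.1 sampled at
# 200 irregular times; the transform is large at 0.1 and small at 0.3.
rng = random.Random(0)
times = sorted(rng.uniform(0.0, 100.0) for _ in range(200))
values = [math.cos(2 * math.pi * 0.1 * t) for t in times]
on_tone, off_tone = ndft(times, values, [0.1, 0.3])
print(abs(on_tone), abs(off_tone))
```

The forward transform is straightforward; the harder part for surrogates is going back from modified coefficients to a time series on the same irregular times.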

Datseris commented 4 years ago

I think conceptually it makes sense. But why not go for the method directly suggested in [1]? Is it easier to use the NFFT?

But in any case, I think this is worth a PR no matter which of the two approaches you follow. Be aware that we are currently changing the API of TimeseriesSurrogates.jl.

kahaaga commented 4 years ago

> I am wondering whether we could provide the Fourier-transform-based surrogate methods for irregular time series by using the non-equidistant Fourier transform from https://github.com/tknopp/NFFT.jl. This is a version of the Fourier transform that works on irregular time series. If you think this is a good idea, I could try to prepare a PR for that.

Hi Felix,

I also think this is well worth a PR, regardless of the method in [1]. If you can prepare a working implementation, please go ahead!

> Be aware that we are currently changing the API of TimeseriesSurrogates.jl.

No problem, we can just adjust the code to the new API once the implementation is ready!

Datseris commented 4 years ago

Hi @felixcremer, we are planning to release 1.0 of TimeseriesSurrogates.jl soon. Do you have any working code that you can push for a PR, so that we can help you get it in?

We are also thinking of writing a JOSS paper about TimeseriesSurrogates a short while after the release, maybe you can also get in if you contribute a method here.

felixcremer commented 4 years ago

I have a nearly working version of the RandomFourier method based on nufft from the FastTransforms package. I am going to tidy it up today and open a pull request so that we can push this further this week. I will also have a look at the Lomb-Scargle method later this week.

Are there any tests to make sure that the surrogates we construct have the correct properties? I looked at the tests here, but they only check the length of the surrogate.

> We are also thinking of writing a JOSS paper about TimeseriesSurrogates a short while after the release, maybe you can also get in if you contribute a method here.

This sounds very interesting.

Datseris commented 4 years ago

In fact, testing is an issue where we, like you, are a bit unsure. We do not know exactly how to do "proper testing", since many of the "hypothesis testing" things that are discussed in the papers are rather subjective and not really something you could write unit tests for.

One thing we should consider is testing for sufficient numerical similarity between methods that retain autocorrelations.

Note that @kahaaga and I generally test the methods with the systems shown in the papers before merging, and we don't merge if we can't replicate a paper. So in a way you are "safe" if you are using the methods here, even if they don't have unit tests.

Datseris commented 4 years ago

p.s.: 1.0 is out, but we will wait a bit on the paper to have some more methods merged in.

kahaaga commented 4 years ago

> Are there any tests to make sure that the surrogates we construct have the correct properties? I looked at the tests here, but they only check the length of the surrogate.

For RandomFourier I would just "test" it by checking that the autocorrelation function for the surrogate and the original data are roughly equivalent (since that is the property the surrogates are supposed to preserve). You can use surroplot(x, s) for that, where x is the original time series and s is a surrogate realization. As you can see in the documentation, that's what we use to show that the methods work "well enough".

For now, you can just check this for a few well-behaved example time series (stationary, with at least some form of periodicity) and we can add the examples to the "example applications" docs page.
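A numeric counterpart to that visual check could look roughly like the sketch below, written in plain Python. The helper names `autocorrelation` and `max_acf_difference` are hypothetical and not part of the package, and the estimator assumes regularly sampled data. As stand-ins for real surrogates, the example uses a time-reversed copy (which preserves this autocorrelation estimate exactly) and a random shuffle (which destroys it).

```python
import math
import random

def autocorrelation(x, max_lag):
    """Sample autocorrelation for lags 0..max_lag (regular sampling assumed)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    var = sum(v * v for v in d)
    return [
        sum(d[t] * d[t + k] for t in range(n - k)) / var
        for k in range(max_lag + 1)
    ]

def max_acf_difference(x, s, max_lag):
    """Largest absolute difference between the two autocorrelation functions."""
    ax = autocorrelation(x, max_lag)
    a_s = autocorrelation(s, max_lag)
    return max(abs(a - b) for a, b in zip(ax, a_s))

rng = random.Random(3)
# Stationary, periodic example series with a little noise.
x = [math.sin(2 * math.pi * t / 20) + 0.1 * rng.gauss(0.0, 1.0) for t in range(200)]

reversed_x = x[::-1]   # keeps the autocorrelation function
shuffled_x = list(x)
rng.shuffle(shuffled_x)  # keeps only the amplitude distribution

diff_reversed = max_acf_difference(x, reversed_x, 30)
diff_shuffled = max_acf_difference(x, shuffled_x, 30)
print(diff_reversed, diff_shuffled)
```

A test for an autocorrelation-preserving method could then assert that the difference stays below some tolerance; what tolerance is appropriate would have to be decided per method.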

> I have a nearly working version of the RandomFourier method based on nufft from the FastTransforms package. I am going to tidy it up today and will open a pull request so that we can push this further this week.

Nice! Looking forward to seeing what you've come up with!

Don't worry about having it in a perfect state, as long as the algorithm does roughly what it promises. Once you make the PR, we can make any necessary adjustments to the code.

If we get a working implementation of this going for RandomFourier, then I can quickly extend AAFT, IAAFT and the truncated versions of all the Fourier-based methods to irregularly sampled data as well, which would be a huge improvement.

But let's focus on RandomFourier first, and take it from there. We'll have to think a bit about how the API should handle irregularly sampled data.

felixcremer commented 4 years ago

> I have a nearly working version of the RandomFourier method based on nufft from the FastTransforms package.

Unfortunately, I was too optimistic about that. I have a rough draft ready, but I can't force the inverse non-uniform Fourier transform to return a real-valued time series. It also doesn't have the nice symmetry properties of the regular transform, so we can't shuffle only one half of the frequencies and then mirror them to get a symmetric transform that yields a real-valued time series.
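For context, the symmetry in question is the one that holds on a regular grid: the inverse DFT is real-valued when the coefficients satisfy X[N-k] = conj(X[k]). The sketch below, in plain Python with a naive DFT, randomizes phases once with that mirroring and once without. The function names are hypothetical, and this shows only the regular-grid case, not the non-uniform transform discussed here.

```python
import cmath
import math
import random

def dft(x):
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * k * j / n) for j in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * j / n) for k in range(n)) / n
            for j in range(n)]

def randomize_phases(X, rng, mirror):
    """Keep amplitudes, draw new phases; optionally enforce X[n-k] = conj(X[k])."""
    n = len(X)
    Y = list(X)  # k = 0 (and the Nyquist term for even n) stay untouched
    for k in range(1, (n - 1) // 2 + 1):
        Y[k] = abs(X[k]) * cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))
        if mirror:
            Y[n - k] = Y[k].conjugate()
        else:
            Y[n - k] = abs(X[n - k]) * cmath.exp(1j * rng.uniform(0.0, 2 * math.pi))
    return Y

rng = random.Random(1)
x = [math.sin(2 * math.pi * j / 8) + 0.3 * rng.gauss(0.0, 1.0) for j in range(64)]
X = dft(x)

mirrored = idft(randomize_phases(X, rng, mirror=True))
independent = idft(randomize_phases(X, rng, mirror=False))

max_imag_mirrored = max(abs(v.imag) for v in mirrored)
max_imag_independent = max(abs(v.imag) for v in independent)
print(max_imag_mirrored, max_imag_independent)
```

With mirroring, the imaginary parts vanish up to rounding error; with independent phases they do not, and the result is no longer a valid real-valued surrogate.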

I also had a look at IAAFT. There we could use a nufft to get the periodogram, but in the end this is conceptually similar to the Lomb-Scargle-based algorithm described by Schmitz and Schreiber, so I started implementing that one instead.

Datseris commented 4 years ago

hehe, tough life is tough right? :D

kahaaga commented 4 years ago

> Unfortunately, I was too optimistic about that. I have a rough draft ready, but I can't force the inverse non-uniform Fourier transform to return a real-valued time series. It also doesn't have the nice symmetry properties of the regular transform, so we can't shuffle only one half of the frequencies and then mirror them to get a symmetric transform that yields a real-valued time series.
>
> I also had a look at IAAFT. There we could use a nufft to get the periodogram, but in the end this is conceptually similar to the Lomb-Scargle-based algorithm described by Schmitz and Schreiber, so I started implementing that one instead.

We can have a look at that at a later stage. For now, it is just amazing that we can use irregularly sampled data at all. Nice job!