helske / Rlibeemd

Ensemble Empirical Mode Decomposition (EEMD) and Its Complete Variant (CEEMDAN)

ceemdan(): default num_imfs=0 does not necessarily result in maximum number of IMFs #3

Open ghost opened 6 years ago

ghost commented 6 years ago

I came across an issue with the num_imfs parameter of ceemdan(). According to the documentation, the default value num_imfs=0 corresponds to a "maximal number of IMFs". However, depending on other parameter settings (e.g. noise_strength), this is not necessarily the case: the residual can still have two non-edge extrema and could therefore be decomposed further. In that case (for N>3), the maximal number of IMFs, meaning that the residual has at most one non-edge extremum, appears to be num_imfs=ceiling(log2(N)), the smallest integer not less than log2(N), rather than num_imfs=emd_num_imfs(N), which seems to return floor(log2(N)). Which of the two corresponds to the maximal number of IMFs depends on other parameter settings such as noise_strength. Maybe the number of non-edge extrema of the residual should be checked to decide between floor(log2(N)) and ceiling(log2(N)) for num_imfs.

I attach an example: an NDVI (Normalised Difference Vegetation Index) time series of length 340 (example_NDVI.txt), decomposed in R using ceemdan() (ceemdan_issue.txt). With noise_strength=0.3, the default setting num_imfs=0 results in a residual with two non-edge extrema.
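For the series length of 340 used in this example, the two candidate values can be computed directly. This is only a small base R sketch of the arithmetic; that the default corresponds to floor(log2(N)) is the observation made in this report and is not verified by the snippet.

```r
# Candidate values of num_imfs for a series of length N (base R only).
N <- 340
floor_imfs <- floor(log2(N))      # value the default appears to use
ceiling_imfs <- ceiling(log2(N))  # value suggested in this report

cat("log2(N) =", log2(N), "\n")
cat("floor:", floor_imfs, " ceiling:", ceiling_imfs, "\n")
```

Since 2^8 = 256 < 340 < 512 = 2^9, the two values are 8 and 9, so the two choices differ by exactly one component.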

I hope I am not missing anything or getting something wrong, and that this information is helpful. If it is a bug, a fix would be appreciated.

This is my sessionInfo():

R version 3.4.4 (2018-03-15)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

packageVersion("Rlibeemd")
[1] ‘1.4.0’

Best, Katharina

luukko commented 6 years ago

Hi Katharina. I'm the author of libeemd, which Rlibeemd uses under the hood. The floor(log2(N)) value is taken from the literature, but I'm not surprised that sometimes one more would be appropriate. The tricky thing is that libeemd allocates memory for the output array at the beginning of the computation, so it's not easy to make the number of output IMFs conditional on properties of the residual. What we could do is allocate memory for ceil(log2(N)) IMFs, and not return the last one if it is zero. I currently have little time to work on libeemd, so some help would be appreciated.
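The "allocate one more and drop it if it is zero" idea can be illustrated on the R side with a toy matrix. This is only a sketch and not libeemd code: the helper name drop_trailing_zero_component is hypothetical, and it assumes the components are stored as the columns of a numeric matrix.

```r
# Hypothetical helper (not part of libeemd or Rlibeemd): drop the last
# column of a decomposition matrix if it is identically zero.
# Assumption: each column holds one component of the decomposition.
drop_trailing_zero_component <- function(imfs) {
  stopifnot(is.matrix(imfs), ncol(imfs) >= 1)
  last <- ncol(imfs)
  if (last > 1 && all(imfs[, last] == 0)) {
    imfs <- imfs[, -last, drop = FALSE]
  }
  imfs
}

# Toy input: two non-zero columns followed by an all-zero one.
toy <- cbind(c(1, -1, 1, -1), c(0.5, 0.4, 0.3, 0.2), rep(0, 4))
trimmed <- drop_trailing_zero_component(toy)
cat("columns before:", ncol(toy), " after:", ncol(trimmed), "\n")
```

An exact comparison with zero is used here because the extra slot would simply stay unfilled; a tolerance might be preferable if the last component can be numerically tiny rather than exactly zero.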

Meanwhile, I think you could work around this issue by checking the number of interior extrema in the residual and decomposing it further if needed.
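A minimal base R sketch of such a check is shown here. The function names count_interior_extrema and needs_further_decomposition are hypothetical helpers, not part of Rlibeemd, and the criterion (more than one non-edge extremum means the residual can be decomposed further) is the one described in the report above.

```r
# Hypothetical helpers (base R only, not part of Rlibeemd).
# Count strict local extrema of x, excluding the two end points.
# Flat segments (zero differences) are skipped, which is a simplification.
count_interior_extrema <- function(x) {
  d <- sign(diff(x))
  d <- d[d != 0]                 # ignore plateaus
  if (length(d) < 2) return(0L)
  sum(diff(d) != 0)              # each change of slope sign is one extremum
}

# A residual with more than one interior extremum could be decomposed further.
needs_further_decomposition <- function(residual) {
  count_interior_extrema(residual) > 1
}

monotone  <- 1:10               # no interior extremum
one_hump  <- c(1, 3, 5, 4, 2)   # a single maximum
two_turns <- c(1, 3, 2, 4, 5)   # one maximum and one minimum

cat(count_interior_extrema(monotone),
    count_interior_extrema(one_hump),
    count_interior_extrema(two_turns), "\n")
```

In practice the function would be applied to the residual of the ceemdan() result, assuming the residual is the last series of the returned object; if it returns TRUE, the residual could be passed through a further decomposition step or ceemdan() could be rerun with num_imfs increased by one.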