laszukdawid / PyEMD

Python implementation of Empirical Mode Decompoisition (EMD) method
https://pyemd.readthedocs.io/
Apache License 2.0
857 stars 222 forks source link

Why are the results obtained by reconstructing IMFs in EEMD so different from the original data? #119

Closed ki-ljl closed 2 years ago

ki-ljl commented 2 years ago

I tried to use EEMD for the electrical load data, but when I did the reconstruction I found the error was huge. However, this problem does not exist in EMD and CEEMDAN. Here is my code:

eemd = EEMD()
c = eemd(load)
imfs, res = c[:-1], c[-1]
x = np.sum(c, axis=0)
print(x)
print(load)

The output looks like this:

[387.19269426 380.57238899 382.93042187 ... 381.1966113  369.0875664
 382.90185408]
[193.987 187.12  189.514 ... 173.28  161.151 174.878]

Is this because of noise? Is there any way to get the real reconstructed data? Thanks in advance for everyone's help!

Haloukaidi commented 2 years ago
def eemd(self, S: np.ndarray, T: Optional[np.ndarray] = None, max_imf: int = -1) -> np.ndarray:
    if T is None:
        T = get_timeline(len(S), S.dtype)

    #scale = self.noise_width * np.abs(np.max(S) - np.min(S))

    self._S = S/np.std(S)
    self._T = T
    self._N = len(S)
    self._scale = self.noise_width
    self.max_imf = max_imf

    # For trial number of iterations perform EMD on a signal
    # with added white noise
    if self.parallel:
        pool = Pool(processes=self.processes)
        all_IMFs = pool.map(self._trial_update, range(self.trials))
        pool.close()

    else:  # Not parallel
        all_IMFs = map(self._trial_update, range(self.trials))

    self._all_imfs = defaultdict(list)
    for (imfs, trend) in all_IMFs:
        if trend is not None:
            self._all_imfs[-1].append(trend)

        for imf_num, imf in enumerate(imfs):
            self._all_imfs[imf_num].append(imf)

    self._all_imfs = dict(self._all_imfs)
    if -1 in self._all_imfs:
        self._all_imfs[len(self._all_imfs)] = self._all_imfs.pop(-1)

    for imf_num in self._all_imfs.keys():
        self._all_imfs[imf_num] = np.array(self._all_imfs[imf_num])

    self.E_IMF = self.ensemble_mean() * np.std(S)
    self.residue = S - np.sum(self.E_IMF, axis=0)

    return self.E_IMF

I think it's the noise, too. Try this, and remeber change self.noise_width to 0.01-0.2(default : 0.2)

ki-ljl commented 2 years ago

Thanks for your reply, but it doesn't work with my data.

Haloukaidi commented 2 years ago
eemd = EEMD()
eemd.eemd(load)
eemd_imfs, eemd_res = eemd.get_imfs_and_residue()
recons= np.sum(eemd_imfs, axis=0) + eemd_res 
print(recons)

I am so sorry for replying without looking at the question carefully. I thought you had the same problem as I had before. I think the problem is that when you write code this way, eemd returns only IMFs, not Res.

laszukdawid commented 2 years ago

I'm going to refer you back to textbooks where it's explained the EEMD does. The won't reconstruct signal by summing all results from EEMD.

EEMD is performing hundreds times EMD on a signal plus some noise and then sums them component-wise. The hope was that sum of all noise decompositions would result in zero. But it rarely does.

https://pyemd.readthedocs.io/en/latest/eemd.html

On Tue, 17 May 2022, at 21:40, Haloukaidi wrote:

eemd = EEMD() eemd.eemd(load) eemd_imfs, eemd_res = eemd.get_imfs_and_residue() recons= np.sum(eemd_imfs, axis=0) + eemd_res print(recons)

I am so sorry for replying without looking at the question carefully. I thought you had the same problem as I had before. I think the problem is that when you write code this way, eemd returns only IMFs, not Res.

— Reply to this email directly, view it on GitHub https://github.com/laszukdawid/PyEMD/issues/119#issuecomment-1129558887, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACXNLK6KLP5KFDEPU6FISN3VKRYCFANCNFSM5WDO7LHQ. You are receiving this because you are subscribed to this thread.Message ID: @.***>

M73ACat commented 2 years ago

This answer should solve your problem and I will try to explain this issue further.

I'm going to refer you back to textbooks where it's explained the EEMD does. The won't reconstruct signal by summing all results from EEMD. EEMD is performing hundreds times EMD on a signal plus some noise and then sums them component-wise. The hope was that sum of all noise decompositions would result in zero. But it rarely does. https://pyemd.readthedocs.io/en/latest/eemd.html

First of all, it is true that the presence of noise causes this thing called reconstruction error. the general process of EEMD is like this: add noise to original data, decompose it using EMD, add new noise, EMD, and so on hundreds times. Although EEMD attempts to eliminate the added noise by averaging, the results sometimes are not as good as they should be. Therefore, some improved methods such as CEEMD and CEEMDAN emerged, which can achieve better results and fewer reconstruction errors compared to the original EEMD method.

As for how to get the real reconstructed data, it comes to the question of how to use the EEMD decomposition results. Most existing papers mainly use the EEMD decomposition results in two ways: 1. Only select the most representative IMF component (manual selection, correlation coefficient, other evaluation indexes, etc.); 2. Select several IMF components according to certain indexes. Adding up all the IMF components seems to have little practical value and is only used in some papers to evaluate the denoising ability of the EEMD series methods.

However, in this case, the difference between the original signal and the reconstructed signal is surprisingly large. When I try to reproduce your operation, I found that your data is large too. Considering that the default noise amplitude is 0.05* (max(S) - min(S)), it introduces the noise with a huge noise amplitude. However, the default ensemble time 100 cannot deal with such large noise so it appears large error between your original data and the reconstructed data.

Here is a case from my area. It shows the original data decomposition result, the amplified (50 times) signal result and the amplified signal result with 1000 ensemble times respectively. It is clear that the reconstructed result is improved by increasing ensemble times. Therefore, maybe you can scale the raw data appropriately or increase the ensemble times to reduce the error. normal signal Fig. 1 The original data decomposition result with 100 trails 50 times signal Fig. 2 The 50 times signal decomposition result with 100 trails 50 times signal with 1000 trials Fig. 3 The 50 times signal decomposition result with 1000 trails

ki-ljl commented 2 years ago

Thanks for your suggestion, I will try this.