laszukdawid / PyEMD

Python implementation of Empirical Mode Decomposition (EMD) method
https://pyemd.readthedocs.io/
Apache License 2.0
867 stars, 224 forks

Spending endless time on running the EEMD example #83

Closed onion5376 closed 3 years ago

onion5376 commented 3 years ago

I tried to run the EEMD example script from url, but after 10 hours on my PC it was still running, with no results and no errors. The code I used is listed below:

from PyEMD import EEMD
import numpy as np
import pylab as plt

# Define signal
t = np.linspace(0, 1, 200)

sin = lambda x,p: np.sin(2*np.pi*x*t+p)
S = 3*sin(18,0.2)*(t-0.2)**2
S += 5*sin(11,2.7)
S += 3*sin(14,1.6)
S += 1*np.sin(4*2*np.pi*(t-0.8)**2)
S += t**2.1 -t

# Assign EEMD to `eemd` variable
eemd = EEMD()

# Say we want detect extrema using parabolic method
emd = eemd.EMD
emd.extrema_detection="parabol"

# Execute EEMD on S
eIMFs = eemd.eemd(S, t)
nIMFs = eIMFs.shape[0]

# Plot results
plt.figure(figsize=(12,9))
plt.subplot(nIMFs+1, 1, 1)
plt.plot(t, S, 'r')

for n in range(nIMFs):
    plt.subplot(nIMFs+1, 1, n+2)
    plt.plot(t, eIMFs[n], 'g')
    plt.ylabel("eIMF %i" %(n+1))
    plt.locator_params(axis='y', nbins=5)

plt.xlabel("Time [s]")
plt.tight_layout()
plt.savefig('eemd_example', dpi=120)
plt.show()

I am puzzled by this endless run and have no idea how to solve the problem. Can you give me some help? Thanks.

The details of the platform used are listed below:

laszukdawid commented 3 years ago

Hey @onion5376,

Interesting that it fails on the example. I'll take a look into this.

Long story short, there is a huge number of proto-IMFs for each noise-assisted signal, and decomposing them takes ages. A quick way to verify this is to limit the number of eIMFs. Try:

eIMFs = eemd.eemd(S, t, max_imf=2)

It should finish relatively quickly. Once you can see results, feel free to increase max_imf a little at a time and check whether the results are acceptable.

It's unlikely that your specs have anything to do with this, but I really do appreciate that you've written them out, as it usually helps. I expect the culprit is either 1) me not having run this example in a long time, 2) PyEMD's dependencies having changed something, or 3) your computer not having enough "something" for this semi-demanding task. Option 3 is unlikely, though.

laszukdawid commented 3 years ago

OK, I just tested this example on my laptop and it took about 5 s to compute and display the graphs.

Let me know what the result is when you limit the number of IMFs. Also, if that takes too long, feel free to try simple EMD first with

from PyEMD import EMD
emd = EMD()
imfs = emd(S, max_imf=1)

and again gradually increase max_imf. In my case there were 3 IMFs in total, so beyond max_imf=3 there won't be any difference.
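The "increase max_imf gradually" advice can be sketched as a small timing loop. This is only an illustration: `sweep_max_imf` and `fake_decompose` are hypothetical names, not part of PyEMD, and the stand-in simply pretends the signal contains 3 components so the snippet runs without the library.

```python
import time


def sweep_max_imf(decompose, limits=(1, 2, 3, 4, 5)):
    """Call decompose(k) for growing k and time each run.

    `decompose` is any callable that takes the max_imf limit and returns a
    sequence of components. The sweep stops once the count stops growing.
    """
    results = []
    previous_count = None
    for k in limits:
        start = time.perf_counter()
        components = decompose(k)
        elapsed = time.perf_counter() - start
        count = len(components)
        results.append((k, count, elapsed))
        print(f"max_imf={k}: {count} components in {elapsed:.3f}s")
        if count == previous_count:
            break  # raising the limit no longer changes the result
        previous_count = count
    return results


# Stand-in for a real decomposition (assumption: 3 components in total).
def fake_decompose(max_imf):
    return [[0.0]] * min(max_imf, 3)


timings = sweep_max_imf(fake_decompose)
```

With the objects from the thread, the call would presumably look like `sweep_max_imf(lambda k: eemd.eemd(S, t, max_imf=k))`. A run that takes far longer than the previous one is the signal to stop and investigate.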

onion5376 commented 3 years ago

Thanks @laszukdawid. Before running the EEMD script, I was able to run the simple EMD example successfully on my PC. Following your instructions, I tried limiting the "max_imf" parameter. With max_imf set to 2, I ran the EEMD script above again. It still ran endlessly, and CPU usage immediately climbed to 100% on all 8 CPU cores. I could not figure out the problem.

PC hardware:
CPU: i7-4810MQ, 2.8 GHz, 8 cores
RAM: 16 GB
PC name: Dell M4800

laszukdawid commented 3 years ago

OK, I think I know where the problem is. As expected, your computer specs are great and you should be able to run any highly demanding computation. I'm actually guessing that's why it doesn't work. Well, not exactly you, but something on your computer. The problem is likely that EEMD tries to use all of your CPU cores, but not all of them are available. Let's first try disabling parallelization and then limiting the number of processes.

1) Disable parallelism.

eemd = EEMD(parallel=False)
eIMFs = eemd(S, t, max_imf=2)

2) Limit number of processes

eemd = EEMD(processes=2)  # default: parallel=True
eIMFs = eemd(S, t, max_imf=2)

I'd imagine that 4 would be the optimal number of processes to run, but try it step by step first.
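One way to arrive at a conservative process count instead of hard-coding it is sketched below. `pick_processes` is a hypothetical helper, and using half of the logical CPUs is an assumption chosen to match the "4 out of 8" guess above, not a rule taken from PyEMD.

```python
import os


def pick_processes(logical_cpus=None, fraction=0.5):
    """Return a conservative worker count: a fraction of the logical CPUs, at least 1."""
    if logical_cpus is None:
        logical_cpus = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, int(logical_cpus * fraction))


print(pick_processes(8))  # 8 logical CPUs with fraction 0.5 gives 4
```

The result could then be passed on as in the snippet above, for example `EEMD(processes=pick_processes())`.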

Also, if that doesn't work, you can start with simpler cases and build it up. Start with a low number of trials (the ensemble is made out of trials) and a tiny bit of noise, noise_width. As you can see in the code (the docs should have the same values), the defaults are trials=100 and noise_width=0.05. Start with trials=1 and noise_width=0.0005. That's something like

eemd = EEMD(trials=1, noise_width=0.0005)
eIMFs = eemd(S, t, max_imf=2)
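To see what trials and noise_width trade off, here is a rough standard-library sketch of the ensemble idea only: noisy copies of a signal are averaged, and the leftover noise shrinks as the number of trials grows. It does not perform any decomposition and is not PyEMD's implementation. `ensemble_mean` and `rms_error` are hypothetical helpers, and scaling the noise by the signal's range is an assumption.

```python
import math
import random


def ensemble_mean(signal, trials, noise_width, seed=0):
    """Average `trials` noisy copies of `signal` (illustration only)."""
    rng = random.Random(seed)
    # Assumption: noise standard deviation = noise_width * signal range.
    scale = noise_width * (max(signal) - min(signal))
    total = [0.0] * len(signal)
    for _ in range(trials):
        for i, value in enumerate(signal):
            total[i] += value + rng.gauss(0.0, scale)
    return [x / trials for x in total]


def rms_error(a, b):
    """Root-mean-square difference between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


signal = [math.sin(2 * math.pi * 5 * i / 200) for i in range(200)]
one = rms_error(ensemble_mean(signal, trials=1, noise_width=0.05), signal)
many = rms_error(ensemble_mean(signal, trials=100, noise_width=0.05), signal)
print(f"residual noise, 1 trial: {one:.4f}; 100 trials: {many:.4f}")
```

Each trial costs one full decomposition in the real method, so trials=1 with very small noise is the cheapest configuration to test first, at the price of losing the averaging benefit.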

Let me know how that goes :)

onion5376 commented 3 years ago

Following your suggestions, it seems that parallelism may be what causes the script to run endlessly.

(1) Disable parallelism, eemd = EEMD(parallel=False): with parallelization disabled, it ran successfully in just a few seconds, regardless of the 'max_imf' parameter.

(2) Enable parallelism, eemd = EEMD(processes=2): with parallelization enabled and processes set to 2, it seems to have the same problem as before.

laszukdawid commented 3 years ago

In that case I might need to change the default behaviour so that it doesn't use the parallel setting. It'd be easier to first get it running and then consider how to make it faster, rather than to first run it and wonder why it doesn't work.

In your particular case I think the problem is shared between the PyEMD implementation, multiprocessing in Python, and whatever other processes you run alongside PyEMD. I'm currently doing a bit of cleaning here, so I might take a look into this. However, don't expect much ;-)

To me it seems we're done with this issue, as you have a solution. I'm closing this ticket, but feel free to reopen it if you find that it didn't work.