jaidevd / pyhht

Python toolbox for the Hilbert-Huang transform

more meaningful representation of orthogonality #51

Closed ggrrll closed 3 years ago

ggrrll commented 6 years ago

Hi,

I was thinking that, in order to provide a more meaningful orthogonality measure, it might be useful to normalise the orthogonality (io()) by the number of IMF pair combinations over which it is computed (you could use scipy's comb, for instance).
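Concretely, the proposal could be sketched like this (just an illustration; `normalized_io` is a hypothetical helper name, not part of pyhht):

```python
from scipy.special import comb

def normalized_io(io, n_imfs):
    # Hypothetical helper: divide the index of orthogonality by the
    # number of distinct IMF pairs, C(N, 2), as proposed above.
    return io / comb(n_imfs, 2)

# e.g. an IO of 0.12 from 4 IMFs -> 0.12 / C(4, 2) = 0.12 / 6 = 0.02
```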

what do you think? -- maybe we should ask a broader audience ...

jaidevd commented 6 years ago

Hi @ggrrll

Orthogonality is defined only for a pair of vectors, so the index of orthogonality is computed by summing the inner products of all pairs of IMFs; the combinations are always taken two at a time without replacement. So this would amount to simply dividing the IO by 2.

Moreover, the index of orthogonality is not an absolute measure, it is relative - lower values are better. For example, consider a signal made up of two waves of 5 and 10 Hz respectively, with the higher frequency wave having half the amplitude of the lower frequency wave:

import numpy as np
import matplotlib.pyplot as plt

t = np.linspace(0, 1, 1000)
mode1 = np.cos(2 * np.pi * 5 * t)
mode2 = 0.5 * np.cos(2 * np.pi * 10 * t)
signal = t + mode1 + mode2
plt.plot(t, signal)
plt.show()

[figure: plot of the composite signal]

Now the ideal decomposition would give us the two modes back, like so: [figure: the two original modes recovered as IMFs]

Suppose instead we construct a decomposition with four elements in total (instead of the two in the ideal case), made by halving the amplitudes of the original modes. This is a bad decomposition, since it does not separate frequencies; it only divides amplitudes. Like so: [figure: the four amplitude-halved components]

The good decomposition gives an IO of around 0.0005, and the bad one gives an IO of ~0.16.

So relatively, the second one is bad. That's all that we need as an indicator of how orthogonal the resulting IMFs are.
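For reference, the comparison above can be reproduced with a small sketch (an illustrative re-implementation of the index, not the exact pyhht code):

```python
import numpy as np

def index_of_orthogonality(imfs, signal):
    # Illustrative sketch: sum of the absolute pairwise inner products
    # of the IMFs, normalised by the squared norm of the original signal.
    total = 0.0
    n = len(imfs)
    for i in range(n):
        for j in range(i + 1, n):
            total += abs(np.dot(imfs[i], imfs[j]))
    return total / np.sum(signal ** 2)

t = np.linspace(0, 1, 1000)
mode1 = np.cos(2 * np.pi * 5 * t)
mode2 = 0.5 * np.cos(2 * np.pi * 10 * t)
signal = t + mode1 + mode2

good = [mode1, mode2]  # ideal decomposition (the trend t would be the residue)
bad = [0.5 * mode1, 0.5 * mode1, 0.5 * mode2, 0.5 * mode2]  # amplitude-split

print(index_of_orthogonality(good, signal))  # small (~5e-4)
print(index_of_orthogonality(bad, signal))   # much larger (~0.16)
```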

Hope that helps,

ggrrll commented 6 years ago

Hi @jaidevd, thanks for the answer. The second part helps me understand what's going on.

For the first part: since io is indeed defined as a sum over all pairs of vectors, I proposed normalising by C(N, 2) to account for the fact that different signals might have different numbers of IMFs. That way we would have a measure that makes sense to compare across different signals...

(Not sure I was clear enough... does it make sense to you?)

Actually, considering that each pair is currently counted twice, you should indeed also divide by 2, as you suggested (to avoid this you could use itertools.combinations).
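For instance (just a sketch, not the pyhht implementation), iterating over unordered pairs removes the double counting so no final division by 2 is needed:

```python
from itertools import combinations
import numpy as np

def io_unique_pairs(imfs, signal):
    # itertools.combinations visits each unordered pair exactly once,
    # so the sum needs no division by 2 (illustrative sketch only).
    total = sum(abs(np.dot(ci, cj)) for ci, cj in combinations(imfs, 2))
    return total / np.sum(signal ** 2)

t = np.linspace(0, 1, 1000)
imfs = [np.cos(2 * np.pi * 5 * t), 0.5 * np.cos(2 * np.pi * 10 * t)]
signal = sum(imfs)
print(io_unique_pairs(imfs, signal))  # near zero for orthogonal modes
```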

jaidevd commented 6 years ago

Hi,

Even if different signals have different numbers of IMFs, IO is calculated only for one isolated set of IMFs belonging to the same signal. Taking the example from above, suppose I call the two sets of IMFs I1 and I2. The first has two IMFs, and the second has four. So I should divide IO(I1) by C(2, 2) = 1 and IO(I2) by C(4, 2) = 6. That would give us values of 0.0005 and ~0.027. Is this what you are suggesting?

If that is the case, then it actually distorts the values, since the IO becomes inversely proportional to the number of pairwise combinations of the IMFs. That means, all other things being equal, more IMFs will lead to a smaller IO, which doesn't make sense logically.

Your second point is correct. Since we are counting each pair twice, we have to divide by 2, which is happening in the current implementation here: https://github.com/jaidevd/pyhht/blob/dev/pyhht/emd.py#L204

jaidevd commented 6 years ago

We cannot have the index of orthogonality inversely proportional to the number of IMFs, because then, I can simply go on increasing the number of IMFs to make the IO small. Simply adding more IMFs is no guarantee of orthogonality.
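To make this concrete (a sketch that assumes the C(N, 2) normalisation discussed above; io_unnormalized is an illustrative re-implementation, not the pyhht code): splitting one mode into k identical copies leaves the decomposition exactly as bad for every k, yet the normalised value keeps shrinking as k grows:

```python
import math
import numpy as np

t = np.linspace(0, 1, 1000)
mode1 = np.cos(2 * np.pi * 5 * t)
mode2 = 0.5 * np.cos(2 * np.pi * 10 * t)
signal = t + mode1 + mode2

def io_unnormalized(imfs, signal):
    # Illustrative IO: sum of absolute pairwise inner products of the
    # IMFs over the squared norm of the original signal.
    n = len(imfs)
    total = sum(abs(np.dot(imfs[i], imfs[j]))
                for i in range(n) for j in range(i + 1, n))
    return total / np.sum(signal ** 2)

raw, normalized = [], []
for k in (2, 4, 8, 16):
    # A deliberately bad decomposition: mode1 split into k identical
    # copies of amplitude 1/k, plus mode2. It is equally bad for every k.
    imfs = [mode1 / k] * k + [mode2]
    io = io_unnormalized(imfs, signal)
    raw.append(io)
    normalized.append(io / math.comb(k + 1, 2))

# The raw IO stays large (the decomposition never improves), while the
# C(N, 2)-normalised value keeps shrinking as k grows.
print(raw)
print(normalized)
```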

ggrrll commented 6 years ago

If that is the case, then it is actually disturbing the values, since then the IO becomes inversely proportional to the number of pairwise combinations of the IMFs. Which means that, all other things being fixed, more IMFs will lead to smaller IO. That doesn't make sense logically.

Well, if the definition given in the docs is what is actually being computed, then io clearly already scales with the number of modes... that's why I was thinking of introducing a reasonable normalisation... - we could of course discuss whether this particular one is reasonable (and indeed, it would be nice if this could be raised at a higher level, with academics in the field who are much more expert than me :) )

Attached are two plots showing the effect of the normalisation on the data I am working with (the number of 'signals' is quite limited, 42, so much more statistics and theoretical analysis are clearly needed to assess this suggestion).

[plots: io_vs_nimfs, normalized_io_vs_nimfs]

jaidevd commented 6 years ago

Well, if the definition given in the doc is what is actually being computed, then clearly io is already scaling with the N of modes

How did you infer this? According to that formula, the sum of inner products of the IMFs is divided only by the squared modulus of the original signal.

I'll be more than happy to discuss this with others. Do you have anyone in mind who can help us out?

Thanks,

ggrrll commented 6 years ago

How did you infer this?

Due to the sum over all pairs -- as you also said in your first comment. This would be an 'extrinsic' scaling... Now, whether there is an 'intrinsic' scaling of ||C_i C_j|| with i and j... I have no idea...

Do you have anyone in mind who can help us out?

No, unfortunately no idea... I am totally new to this technique (I also posted a similar question here on SE-CV, in case...)

My first guess would be the authors of the relevant papers on HHT...

jaidevd commented 6 years ago

Summing over pairs simply ensures that C(N, 2) different numbers are being summed to produce the IO. There is no scaling or normalization effect.

Can you access the original paper and see if there's something to this effect?

ggrrll commented 6 years ago

yeah, so that's why IO(N) 'scales' with C(N,2) ;) (pardon the physicist's language)

I am asking a similar question on the 'Signal Processing' SE, which seems a more appropriate community than 'Cross Validated' for this topic.

jaidevd commented 6 years ago

Can you share the link, please?

ggrrll commented 6 years ago

sure, sorry I forgot -- here it is

jaidevd commented 3 years ago

Closing for lack of activity.