PyWavelets / pywt

PyWavelets - Wavelet Transforms in Python
http://pywavelets.readthedocs.org
MIT License

How to deal with NAN #617

Closed lemon234071 closed 2 years ago

lemon234071 commented 2 years ago

Hey, thanks for the amazing lib. I'm trying to use it to denoise my signal, but there are some np.nan values in my signal data, like [1, 2, 3, np.nan, 4, 5, ...]

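import pywt
coeffs = pywt.wavedec(data, "db4")
# threshold the detail coefficients (coeffs[0] is the approximation)
for i in range(1, len(coeffs)):
    coeffs[i] = pywt.threshold(coeffs[i], threshold_value)
meta = pywt.waverec(coeffs, "db4")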

The output contains more np.nan values than the input. Are there any methods to reduce this impact, i.e. to avoid adding new np.nan values to the signal? (E.g. methods to process the raw signal or the reconstructed signal.)

rgommers commented 2 years ago

There is no built-in functionality in PyWavelets. You just need to filter out the nan's, e.g. by replacing them with the interpolated value between neighboring non-nan values. In general this is a hard problem, so the choice and implementation of solution will depend a bit on your use case.
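As a minimal sketch of the interpolation idea above, the NaN samples can be filled with NumPy before the signal is handed to the transform. The helper name `fill_nan_linear` is hypothetical (it is not part of PyWavelets), and linear interpolation is only one possible choice; NaNs at the edges of the signal simply take the nearest valid value.

```python
import numpy as np

def fill_nan_linear(signal):
    """Return a copy of `signal` with NaNs replaced by linear interpolation.

    Hypothetical helper for illustration; not part of PyWavelets.
    """
    x = np.asarray(signal, dtype=float)
    nan_mask = np.isnan(x)
    if nan_mask.all():
        raise ValueError("signal contains only NaN values")
    idx = np.arange(x.size)
    filled = x.copy()
    # np.interp evaluates the piecewise-linear curve through the valid
    # samples at the positions of the missing ones; positions outside the
    # valid range get the first or last valid value.
    filled[nan_mask] = np.interp(idx[nan_mask], idx[~nan_mask], x[~nan_mask])
    return filled

data = np.array([1, 2, 3, np.nan, 4, 5], dtype=float)
clean = fill_nan_linear(data)
print(clean)
```

The filled array can then be passed to pywt.wavedec in place of the raw signal. Whether linear interpolation is appropriate depends on the use case, e.g. on how long the runs of missing samples are.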

lemon234071 commented 2 years ago

Thanks. I also wonder which part (operation or formula) causes the NaN in the reconstructed signal? I will decide whether to filter out the NaN values or pad them depending on the root cause, since the resulting signal plots look very different for the two approaches.

rgommers commented 2 years ago

I just wonder which part (operation or formula) causes the NaN in the reconstructed signal?

It's hard to point at a single line of code or formula. Going from the time to the frequency domain is in general ill-specified in the presence of nan's.

lemon234071 commented 2 years ago

Okay, I see, thank you very much.

grlee77 commented 2 years ago

I just wonder which part (operation or formula) causes the NaN in the reconstructed signal?

The basic cause is that the forward and inverse transforms involve convolutions with sets of wavelet filters. For the db4 case, the filters have 8 coefficients, and at each level any NaN value spreads out by an amount equal to the filter footprint! This occurs again during reconstruction, so you would expect to see many more NaNs in the reconstructed signal than in your source. This is why @rgommers' suggestion to somehow replace NaN values prior to applying the transforms seems like the best bet.
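The spreading effect can be illustrated with plain NumPy. The sketch below is not PyWavelets' actual implementation: it uses a stand-in 8-tap averaging filter (an assumption; it has the same length as a db4 filter but not its coefficients) and a single convolve-then-downsample step, which is roughly what one decomposition level does.

```python
import numpy as np

# Stand-in 8-tap filter (assumption): same length as a db4 filter,
# but NOT the real db4 coefficients.
taps = np.ones(8) / 8.0

signal = np.arange(32, dtype=float)
signal[10] = np.nan  # a single missing sample

# Every output sample whose 8-sample window covers index 10 becomes NaN.
filtered = np.convolve(signal, taps, mode="full")

# Keep every second sample, as a decimated transform does at each level.
downsampled = filtered[1::2]

n_in = int(np.isnan(signal).sum())
n_filtered = int(np.isnan(filtered).sum())
n_down = int(np.isnan(downsampled).sum())
print(n_in, n_filtered, n_down)  # 1 8 4
```

One NaN in the input contaminates 8 filtered samples, of which 4 survive downsampling. Each of those is in turn spread across a filter footprint at the next level and again during reconstruction, which is why the reconstructed signal ends up with many more NaNs than the source.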