PyWavelets / pywt

PyWavelets - Wavelet Transforms in Python
http://pywavelets.readthedocs.org
MIT License
1.97k stars 460 forks source link

MAINT: add input validation to `scales` argument to `cwt` #703

Closed cyschneck closed 3 months ago

cyschneck commented 4 months ago

Overview

Warning for when using invalid scale range that includes zero which is easy to accidentally do when setting up scales dynamically and would be nice to fail gracefully.

Bug and Steps to Reproduce

When running an invalid scale range (that includes zero), pywt throws an unrelated IndexError

time, sst_data = pywt.data.nino()
scales = [0, 1, 2]
wavelet_coeffs, freqs = pywt.cwt(data=sst_data,
                                 scales=scales,
                                 wavelet="morl")

Throws:

pywt/_cwt.py:154, in cwt(data, scales, wavelet, sampling_period, method, axis)
    152 if j[-1] >= int_psi.size:
    153     j = np.extract(j < int_psi.size, j)
--> 154 int_psi_scale = int_psi[j][::-1]
    156 if method == 'conv':
    157     if data.ndim == 1:

IndexError: index -9223372036854775808 is out of bounds for axis 0 with size 1024

This error actually originates on pywt/_cwt.py:150 when attempting to divide by the scale. If the scale range includes zero then it will attempt to divide by zero. However, this error is hidden by casting with astype.

For example, the example below should throw a divide by zero issue, but doesn't. It instead returns an empty array. As a result, trying to index by the empty array on line 154 will throw IndexError: index -9223372036854775808 is out of bounds for axis 0 with size 1024

For example:

j = np.arange(0 * (28-0)) / (0 * 1)
j = j.astype(int)
print(j[::-1])
--> Returns: []

Solution

Throw relevant ValueError when attempting to use a scale that includes zero

cyschneck commented 3 months ago

With the unconditional conversion for scales

scales = np.asarray(scales)

I added a ValueError if the scale was not given as a array_like (list/np.array), which enforces the array_like expected unit type from the docs

# convert array_like scales to a np.array
if not isinstance(scales, list) and not isinstance(scales, np.ndarray):
    raise ValueError("scales should be an array_like, list or np.array")
scales = np.asarray(scales)
rgommers commented 3 months ago

I added a ValueError if the scale was not given as a array_like (list/np.array), which enforces the array_like expected unit type from the docs

Note that array-like includes scalars (and anything else that np.asarray can convert to a numpy array), so this last change is not correct and would be a backwards-compatibility break.

The problem from the second commit was:

FAILED ..\tests\test_cwt_wavelets.py::test_cwt_small_scales - TypeError: iteration over a 0-d array
FAILED ..\tests\test_matlab_compatibility_cwt.py::test_accuracy_precomputed_cwt - TypeError: iteration over a 0-d array

So probably the right solution is to use np.atleast_1d(scales).

rgommers commented 3 months ago

I pushed a rebase with that suggestion - let's see if we can get this merged:)