nfsi-canada / OBStools

Tools for processing broadband ocean-bottom seismic data
https://nfsi-canada.github.io/OBStools/
MIT License
52 stars 35 forks source link

utils.calc_windowed_fft() does not return same number of windows as scipy.signal.spectrogram() #35

Closed WayneCrawford closed 2 years ago

WayneCrawford commented 3 years ago

When I run atacr.DayNoise.QC_daily_spectra on a stream containing 1728001-sample traces, I get the following error:

Traceback (most recent call last):
  File "run_corrections.py", line 97, in <module>
    main()
  File "run_corrections.py", line 42, in main
    out_streams['ATACR w/o pre-rotation'] = run_atacr(stream)
  File "run_corrections.py", line 72, in run_atacr
    daynoise.average_daily_spectra(fig_average=True)
  File "/Users/crawford/_Work/Programming/OBStools/obstools/atacr/classes.py", line 664, in average_daily_spectra
    np.mean(ftZ[self.goodwins, :]*np.conj(ftZ[self.goodwins, :]),
IndexError: boolean index did not match indexed array along dimension 0; dimension is 341 but corresponding boolean dimension is 342

I see that self.goodwins is based on the length of "t" returned by spectrogram() and that ftZ is calculated using utils.calc_windowed_fft()

This happens with multi-day data, which of course isn't logical for DayNoise(), but I suspect it would happen with daily data with a "high" sampling rate (20+sps). I can get rid of the problem by reducing the Trace length, but it appears in any case that spectrogram() and utils.sliding_window() don't calculate exactly the same windows, so the windows calculated for calc_windowed_fft() do not correspond exactly to those returned by spectrogram().

I leave it to you to decide if this is important or not.

Regards Wayne Crawford

paudetseis commented 2 years ago

Thank you for raising this issue. This was fixed in a recent PR #36 where the windowed fft is now calculated using the scipy.signal.stft function, and the frequency axis is defined using the scipy.fft.rfftfreq function, thus ensuring the windows are the same as those obtained using scipy.signal.spectrogram.

paudetseis commented 2 years ago

The code has now been further simplified to use only the scipy.signal.stft function and avoid un-necessary calls to scipy.signal.spectrogram and the rfftfreq function. To avoid handling special cases, the full spectra (positive and negative frequency axis) are calculated and retained. The saved quantities now require more disk space, but the code is simpler and more robust. Version 0.1.3 will soon be released and packaged for PIPy.