STFT: streaming performance, output shapes

OverLordGoldDragon / ssqueezepy

Synchrosqueezing, wavelet transforms, and time-frequency analysis in Python

MIT License


STFT: streaming performance, output shapes #81

Closed falseywinchnet closed 1 year ago

falseywinchnet commented 1 year ago

Hi, for my project https://github.com/falseywinchnet/streamcleaner I had previously been using librosa's stft. However, once I was able to make the ssqueezepy stft behave similarly by appending 128 samples and then slicing all but the last 127 samples from the istft output, I switched over to ssqueezepy, believing that for my purposes the caveat raised in https://github.com/librosa/librosa/issues/1279 was an important one to consider.

However, librosa's stft only consumes ~2% CPU, while ssqueezepy on a similar workload takes up 25%. I'm concerned this is due to my crude attempt at making the two behave similarly; with that workaround, the STFT representation is very close to librosa's in terms of MSE.

Is there a proper way to make the stft of ssqueezepy generate a 257x376 complex representation from 48,000 samples and return 48,000 samples and perform with similar compute requirements?

falseywinchnet commented 1 year ago

It was as I feared: it was due to me adding an extra frame. I removed the extra samples and my algorithm appears to work exactly as it did before the change, perhaps even better. However, it does generate a slightly different size of STFT.

falseywinchnet commented 1 year ago

No, I was wrong. It still doesn't work right. That is to say, I didn't bother checking whether it dropped samples; I assumed the reduction in CPU use was due to a corrected algorithm. However, it now returns 47999 samples, which causes my program to quietly hang.

falseywinchnet commented 1 year ago

Let's just make this much simpler: dear John, is there a simple change I can make to librosa's handling as a post-processing step on their STFT which will deliver the changes recommended in https://github.com/librosa/librosa/issues/1279 ?

OverLordGoldDragon commented 1 year ago

Hello,

> is there a simple change I can make to librosa's handling as a post-processing step on their STFT which will deliver the changes recommended in https://github.com/librosa/librosa/issues/1279

By modifying librosa.stft or as a postprocessing step, yes; both codes may need a one-sample adjustment to handle different configs.

> extra frame.

Note that librosa's output size is suboptimal.

> ssqueezepy on a similar workload takes up 25%

The default behavior is multiprocessing for speedup; it can be turned off via os.environ['SSQ_PARALLEL'] = '0'.

> Is there a proper way to make the stft of ssqueezepy

For inquiries like this, it helps to have minimal reproducing code showing obtained vs. desired behavior.

falseywinchnet commented 1 year ago

Well, at this time I'm also being strongly cautioned that I need to start looking into what is called "online" STFT, where analysis (the STFT part) is done into a ring buffer and synthesis (the istft part) is done via overlap-add, potentially on different chunk sizes, so my algorithm can become real-time.

However, I wanted to do the best job I could with simple one-second frames of 48000 samples, using an FFT size of 512 and a hop length of 128, before I attempted to work on such a massive change. My first obstacle was that no matter whose alternate STFT implementation I used, it returned 47999 samples instead of 48000 when I attempted to generate a similar STFT and invert it.
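For reference, the ring-buffer analysis plus overlap-add synthesis described above can be sketched roughly as follows. This is a minimal illustration and not code from streamcleaner, librosa, or ssqueezepy: the function name `stream_ola`, the `process` hook, and the choice of a periodic square-root Hann window for both analysis and synthesis are all assumptions made for the example. With n_fft=512 and hop=128 the output is the input delayed by n_fft - hop samples.

```python
import numpy as np

def stream_ola(blocks, n_fft=512, hop=128, process=lambda spec: spec):
    """Hypothetical helper: yield `hop` output samples per `hop`-sample input block."""
    assert n_fft % hop == 0 and n_fft // hop >= 2
    n = np.arange(n_fft)
    # Periodic sqrt-Hann for analysis and synthesis; their product (a Hann
    # window) overlap-adds to the constant n_fft / (2 * hop).
    win = np.sqrt(0.5 - 0.5 * np.cos(2 * np.pi * n / n_fft))
    scale = n_fft / (2 * hop)
    in_buf = np.zeros(n_fft)   # most recent n_fft input samples
    out_buf = np.zeros(n_fft)  # overlap-add accumulator
    for block in blocks:
        in_buf = np.concatenate([in_buf[hop:], block])     # slide input window
        spec = np.fft.rfft(in_buf * win)                   # analysis
        frame = np.fft.irfft(process(spec), n_fft) * win   # synthesis
        out_buf += frame                                   # overlap-add
        yield out_buf[:hop] / scale                        # oldest hop samples are complete
        out_buf = np.concatenate([out_buf[hop:], np.zeros(hop)])

# One second at 48 kHz, fed 128 samples at a time, with no spectral processing.
x = np.random.default_rng(0).standard_normal(48000)
y = np.concatenate(list(stream_ola(x.reshape(-1, 128))))
delay = 512 - 128  # output lags the input by n_fft - hop samples
```

Here `y` has the same length as `x`, and `y[delay:]` matches `x[:-delay]` up to floating-point error, so nothing is dropped; the cost is a fixed latency rather than a missing sample.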

OverLordGoldDragon commented 1 year ago

There's a 1-sample ambiguity due to integer rounding, hence N= must be specified in istft:

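import numpy as np
from ssqueezepy import stft, istft
x = np.ones(48000)
kw = dict(n_fft=512, hop_len=128)
# pass N so istft returns exactly len(x) samples
assert len(x) == len(istft(stft(x, **kw), **kw, N=len(x)))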
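To illustrate why an inverse transform cannot recover the length on its own, here is a small arithmetic sketch. It assumes the centered-frame convention that librosa documents for center=True, where the frame count is 1 + N // hop_len; ssqueezepy's own frame-count formula differs and is not reproduced here. The helper name `centered_frame_count` is made up for the example.

```python
def centered_frame_count(n_samples, hop_len):
    # Assumed convention: frames centered at 0, hop, 2*hop, ... (librosa center=True)
    return 1 + n_samples // hop_len

n_fft, hop_len = 512, 128
n_bins = n_fft // 2 + 1                          # one-sided spectrum
n_frames = centered_frame_count(48000, hop_len)

# Every input length that produces the same number of frames
candidates = [n for n in range(47800, 48300)
              if centered_frame_count(n, hop_len) == n_frames]
```

Under this convention `n_bins` is 257 and `n_frames` is 376, matching the 257x376 shape quoted earlier in the thread, while `candidates` spans 48000 through 48127. The floor division discards the remainder, so the frame count alone leaves the original length undetermined, and the inverse needs it passed in explicitly.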

However, I just noticed there's a precision issue with float32 and time-localized windows (for which the default window=None often qualifies), since having one fewer frame in ssqueezepy involves dividing by the window tail. This only affects a small portion of the rightmost boundary, but it should still be documented or warned about.
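A rough single-frame sketch of the effect, under stated assumptions: this is not ssqueezepy's inversion code, the Gaussian window and its width of 40 samples are arbitrary choices for illustration, and rounding the spectrum to complex64 stands in for a float32 computation. Dividing the reconstructed frame by a window whose tail is around 1e-9 amplifies whatever rounding error the coefficients carry.

```python
import numpy as np

n = 512
t = np.arange(n)
# Time-localized window (assumed shape): roughly 1e-9 at the frame edges
window = np.exp(-0.5 * ((t - n / 2) / 40.0) ** 2)
x = np.random.default_rng(0).standard_normal(n)

spec64 = np.fft.rfft(x * window)
spec32 = spec64.astype(np.complex64)  # round coefficients to single precision

# Invert each spectrum and undo the window by division
rec64 = np.fft.irfft(spec64, n) / window
rec32 = np.fft.irfft(spec32.astype(np.complex128), n) / window

err64 = np.abs(rec64 - x)
err32 = np.abs(rec32 - x)
edge = slice(n - 4, n)                  # rightmost samples, window tail
center = slice(n // 2 - 2, n // 2 + 2)  # window peak
print(err32[center].max(), err32[edge].max(), err64[edge].max())
```

The single-precision error is negligible near the window peak and grows by many orders of magnitude at the right edge, where the divisor is tiny; the double-precision error at the same samples stays far smaller.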

OverLordGoldDragon commented 1 year ago

From a feature standpoint this also makes librosa's oversampling preferable for some configurations, yet still less so for others; both libraries determine the number of STFT frames independently of window shape, which isn't optimal, but not really a big deal either. Maybe a future TODO.

OverLordGoldDragon commented 1 year ago

Appears resolved.