Closed krisvanneste closed 2 years ago
Hi Kris and thanks for the feedback!
What you propose is very interesting.
I'm going to create a branch so that we can test this idea.
Claudio, Thanks for looking into this!
Kris
I'm trying right now with this:
def _smooth_spectrum(spec):
"""Smooth spectrum in a log_freq space."""
freq = spec.get_freq()
f = interp1d(freq, spec.data, fill_value='extrapolate')
_freq_log = np.log10(freq)
# minimum frequency interval in log10
log_df = _freq_log[-1] - _freq_log[-2]
# frequencies in logarithmic spacing
freq_log = 10**(np.arange(_freq_log[0], _freq_log[-1]+log_df, log_df))
spec.freq_log = freq_log
spec.data_log = f(freq_log)
# number of points for smoothing
log_bandwidth = _freq_log[-1] - _freq_log[0]
npts = max(1, int(round(log_bandwidth / log_df)))
spec.data_log = smooth(spec.data_log, window_len=npts)
f = interp1d(freq_log, spec.data_log, fill_value='extrapolate')
spec.data = f(freq)
But npts
is of the same order of len(freq_log)
, which makes the spectrum way over smoothed...
Maybe I misunderstood what you mean for "bandwidth in log space" (log_bandwidth
here)?
Claudio,
log_bandwidth
should be an argument of the _smooth_spectrum function (replacing npts
). I use a default value of 0.05, which seems to give good results, but it should be configurable by the end user.
Kris
I just pushed the branch https://github.com/SeismicSource/sourcespec/tree/smooth_spectrum
The main difference with your code is that I set the number of points in logspace
to be equal to the points of the original spectrum: I found that your approach was generating way too many spectral points, which has a heavy cost for grid search inversion.
Otherwise, I think that introducing this new parameter (which I named spectral_smooth_width_decades
) is a clearer approach. 😉
Could you please test this branch and tell me your opinion?
Claudio,
Thanks for updating your code. I can't run sourcespec to evaluate the impact on the magnitude estimation right now, but I will try to evaluate the spectral smoothing by tomorrow, if that's OK.
Kris
No worry! Take your time!
Just pushed a new commit https://github.com/SeismicSource/sourcespec/commit/16949b12a0887adce532fba37df5c8ea204fd8d3
Here I resample the smoothed log-spectrum using a sampling rate that is 1/5 of the smooth window size.
In my tests, it is a sufficient number of points to correctly represent the smoothed spectrum.
Let me know what do you think.
Claudio,
I compared your new smoothing function with mine and with your previous one in the attached plot.
With the same bandwidth (smooth_width_decades
), the result of the new smoothing function is very close to mine (labeled "robspy" in the plot). With the default value of 0.2, it is close to your original function. Anyway, the new function can be configured much better than the original one, which smoothed a great deal more than you would expect with a smoothing window (npts
) of only 5 points.
The only potential problem I see is that in your spec object, the linear spectrum (spec.freq, spec.data) has a different resolution compared to the logarithmic spectrum (spec.freq_log, spd.data_log). I hope this doesn't cause any problems later on in the process.
Kris
Hi Kris and thanks for this test!
It looks like I prefer smoothing more than you 😉. So it's good to have a parameter for that!
The linear spectrum is not used anymore, except for plotting. But I think I'll replace spec.data
with spec.data_log
also for plotting, so that one can visually judge the effect of the downsampling.
I'll do this latest modification later today and then merge the smooth_spectrum
branch.
Thanks again!
Claudio,
OK, thanks for addressing this! And yes, maybe I need to smooth a little more...
Kris
I think there is a problem in the _smooth_spectrum function. When the spectrum is converted from linear space to log space, the spectrum is first decimated to 50 points (the default number of values generated by np.logspace) before the actual smoothing is applied.
To avoid loosing nformation, it would be much better to generate the freq_log sequence with the smallest log_df spacing in the linear series, as follows:
log_df = np.diff(np.log10(freq[-2:]))[0]
freq_log = 10**(np.arange(np.log10(freq[0]), np.log10(freq[-1]) + log_df, log_df))
When you do that, however, the 'npts' argument doesn't make much sense anymore. It would be better to define the amount of smoothing in terms of bandwidth in log space. The number of points can then be computed from the bandwidth as follows:
npts = max(1, int(round(bandwidth / log_df)))
https://github.com/SeismicSource/sourcespec/blob/3ad78e9b3e00c64676fefb676cfec7e5f259dcc1/sourcespec/ssp_build_spectra.py#L216