Possible issue with mwcs_step

asyates commented 2 years ago

Maybe I'm missing something obvious, but I noticed some strange behaviour when adjusting values of mwcs_step while using the mwcs() function in a separate script (pulled from MSNoise, from move2obspy.py).

For two waveforms, I was noticing that the approx coherence at a given lag time seemed to change just as a result of changing the mwcs_step value (with a fixed window length)... which seemed surprising. I've attached a figure showing different plots of coherence with different step values (window length fixed at 4s). You can see that it looks like, with smaller step size, the initial high coherence around zero lag time is 'shifting' to the right. By the time the step is 0.25s, it is entirely within the positive lag times and at 0.1s step size, if it's the same, appears to be at ~30s lag time.

I haven't taken a look at the code in more detail yet (about to leave the office), and it's possible there is something I am missing (i.e. something I haven't accounted for when I've pulled it from MSNoise)... but figured worth putting here in the mean time.

Approx code I am using below, with mwcs() function as in MSNoise except for changing returned output to return each array individually rather than single numpy array):

st1 = read(refdir1+fname1)
st2 = read(refdir1+fname2)

mwcs_low=1.1
mwcs_high=1.9
goal_sampling_rate=25
maxlag=120
mwcs_wlen=4
mwcs_steps=[4,2,1,0.5,0.25,0.1]

samprate = 1.0/goal_sampling_rate
lagtimes = np.arange(-1*maxlag, maxlag+samprate, samprate)

fig, ax = plt.subplots(len(mwcs_steps)+1, 1)

ax[0].plot(lagtimes, st1[0].data)
ax[0].plot(lagtimes, st2[0].data)

for i, step in enumerate(mwcs_steps):
    [time_axis, delta_t, delta_err, delta_mcoh] = mwcs(st1[0].data, st2[0].data, mwcs_low, mwcs_high, goal_sampling_rate, -maxlag, mwcs_wlen, step)

    ax[i+1].scatter(time_axis, delta_mcoh, color='blue', label='mwcs step = '+str(step))
    ax[i+1].set_ylabel('coherence')
    ax[i+1].legend()

for a in ax:
    a.set_xlim(-maxlag,maxlag)

plt.show()

mwcs_step_possible_issue_120s

asyates commented 2 years ago

Just having a quick look, believe the issue is related to the sampling freq (25 Hz) * mwcs_step not being an integer. Seems fine for steps 1,2,4, but goes wrong at lower steps where the multiplication is no longer an integer.

I believe it is because we are slicing the waveforms based on an integer converted value of samp_freq * mwcs_step, i.e:

        minind += int(step * df)
        maxind += int(step * df)

so, for a step of 0.5 seconds, the value of step * df will be 12.5, which will then be rounded down to 12. So the step is essentially 0.48s instead. At the same time though, we are incrementing the time axis by the original step of 0.5, i.e.:

        time_axis.append(tmin + window_length / 2. + count * step)

which would then produce a shift, which becomes more extreme the more steps you have (as the mismatch between the time we are slicing and the time_axis appended time is becoming larger.

So, I guess the solution is to append the time_axis to correspond to the integer value of mwcs_step * df instead.

ThomasLecocq commented 1 month ago

is this corrected ? https://github.com/ROBelgium/MSNoise/pull/253 ?

asyates commented 1 month ago

Looks like no, same issue when mwcs_step * sampling rate is non-integer.

See example xarray output below for data with sampling interval 0.01 s (100 Hz), with requested step of 0.025s. taxis starts at -3.9s (maxlag=4), but extends to 5.85 s as the maxind is incrementing by 2 samples i.e. 0.02s step (rounded down from 2.5) due to this line:

minind += int(f.mwcs_step * goal_sampling_rate)

Dimensions: (times: 2333, taxis: 391, keys: 3) Coordinates: * times (times) datetime64[ns] 2024-03-27T23:45:10 ... 2024-03-28T06:13:50 * taxis (taxis) float64 -3.9 -3.875 -3.85 -3.825 ... 5.775 5.8 5.825 5.85 I can make the same fix (more or less) as before to correct the taxis output to be representative of the true step e.g. this is the output with the fix: In [1]: run read_mwcs_output.py Dimensions: (times: 2333, taxis: 391, keys: 3) Coordinates: * times (times) datetime64[ns] 2024-03-27T23:45:10 ... 2024-03-28T06:13:50 * taxis (taxis) float64 -3.9 -3.88 -3.86 -3.84 -3.82 ... 3.84 3.86 3.88 3.9 So with a step of 0.02s instead. Perhaps doing this, and outputting a warning during MWCS processing is the best solution (e.g. to indicate different step used as user-defined not resolvable with current sampling rate). If sounds good, can make a pull request.

ROBelgium / MSNoise

Possible issue with mwcs_step #252