Closed skjerns closed 2 years ago
Hi @skjerns,
Thanks for your feedback :)
Downsampling:
This is extremely worrying and not expected at all. How are you downsampling the data? Are you passing your data as an MNE Raw object or a np.array? Could you perhaps try to sequentially disable the thresholds (e.g. thresh={'rel_pow': None, 'corr': 0.65, 'rms': 1.5}) and run the algorithm with the original and downsampled data? I want to identify whether one threshold in particular leads to this issue, i.e. by not properly taking the sampling frequency into account.
Masking
Is this due to different thresholds being computed for different segments?
Yes, the RMS threshold is calculated on the entire masked signal (e.g. N2+N3 vs N2 only vs N3 only) and thus it is expected that the supra-threshold indices (= potential spindles) would not be the same. Setting the rms threshold to None should however give you the same number of spindles because the two other thresholds are not relative to the masked data.
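To make that concrete, here is a toy numpy sketch (not YASA's actual implementation; the arrays and the mean + 1.5 * SD rule are illustrative assumptions) of why a data-relative threshold yields non-additive counts across masks:

```python
import numpy as np

# Hypothetical moving-RMS values for two sleep stages (made-up numbers).
rms_n2 = np.array([1.0, 2.0, 3.0])
rms_n3 = np.array([10.0, 20.0, 30.0])
rms_all = np.concatenate([rms_n2, rms_n3])  # N2+N3 mask

def n_events(x, mult=1.5):
    """Count samples above a data-relative threshold (mean + mult * SD)."""
    return int(np.count_nonzero(x > x.mean() + mult * x.std()))

# Per-stage thresholds find nothing, but the combined mask finds one event,
# because the threshold itself depends on which samples are included.
print(n_events(rms_n2), n_events(rms_n3), n_events(rms_all))  # 0 0 1
```

The same logic explains why N2+N3 counts need not equal N2 counts plus N3 counts whenever any threshold is computed relative to the masked data.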
Thanks, Raphael
Thanks for the quick reply! That's very helpful and good to know :)
Here are the results for the calculation with different thresholds. I'm using a raw object with a bipolar channel Pz-M2.
sfreq relp corr rms n_events
{'256, 0.2, 0.65, 1.5': 68,
'256, 0.2, 0.65, None': 72,
'256, 0.2, None, 1.5': 513,
'256, 0.2, None, None': 69,
'256, None, 0.65, 1.5': 519,
'256, None, 0.65, None': 144,
'256, None, None, 1.5': 781,
'128, 0.2, 0.65, 1.5': 37,
'128, 0.2, 0.65, None': 37,
'128, 0.2, None, 1.5': 66,
'128, 0.2, None, None': 68,
'128, None, 0.65, 1.5': 123,
'128, None, 0.65, None': 142,
'128, None, None, 1.5': 757}
I'm using the following code:
params = itertools.product([256, 128], [0.2, None], [0.65, None], [1.5, None])
res = {}
for sfreq, rel_pow, corr, rms in tqdm(list(params)):
    raw_res = raw.copy().pick('Pz-M2')
    raw_res.resample(sfreq)
    hypno = yasa.hypno_upsample_to_data(hypnogram, sf_hypno=1/30, data=raw_res)
    try:
        spindles = yasa.spindles_detect(raw_res,
                                        hypno=hypno,
                                        include=3,
                                        thresh={'rel_pow': rel_pow, 'corr': corr, 'rms': rms},
                                        verbose='DEBUG')
        res[f'{sfreq}, {str(rel_pow):>4}, {str(corr):>4}, {str(rms):>4}'] = len(spindles.summary())
    except Exception:
        # skip invalid combinations (e.g. all three thresholds disabled)
        continue
FYI hypnogram and spectrogram of the participant
Thanks for running the checks so quickly. I need to do a deep dive on this, but it is quite worrying. One observation from your test is that when using only a single threshold (regardless of which one), the number of spindles is roughly the same with 256 or 128 Hz. Therefore, it is only when combining the thresholds that the sampling rate starts to make a difference 🤔 I have a gut feeling that it may be caused by these lines, but I have to check:
One last thing, do you get the same discrepancy when comparing 100 Hz vs 200 Hz (instead of 128 vs 256)?
Thanks, Raphael
Might be going in the right direction... The discrepancy is still there when resampling to 250, 200, or 100 Hz:
sfreq relp corr rms n_events
{'250, 0.2, 0.65, 1.5': 67,
'250, 0.2, 0.65, None': 71,
'250, 0.2, None, 1.5': 509,
'250, 0.2, None, None': 70,
'250, None, 0.65, 1.5': 516,
'250, None, 0.65, None': 144,
'250, None, None, 1.5': 782,
'200, 0.2, 0.65, 1.5': 67,
'200, 0.2, 0.65, None': 71,
'200, 0.2, None, 1.5': 507,
'200, 0.2, None, None': 69,
'200, None, 0.65, 1.5': 513,
'200, None, 0.65, None': 141,
'200, None, None, 1.5': 782,
'100, 0.2, 0.65, 1.5': 36,
'100, 0.2, 0.65, None': 36,
'100, 0.2, None, 1.5': 67,
'100, 0.2, None, None': 69,
'100, None, 0.65, 1.5': 115,
'100, None, 0.65, None': 137,
'100, None, None, 1.5': 759}
However, I found something interesting when using different sampling frequencies. There seem to be two attractor states at ~67 and ~510.
Let me know if it would be helpful to receive the original data, I can probably arrange to share it with you privately
sfreq rel_pow corr rms n_events
'70, 0.2, None, 1.5': 63,
'75, 0.2, None, 1.5': 64,
'80, 0.2, None, 1.5': 63,
'85, 0.2, None, 1.5': 65,
'90, 0.2, None, 1.5': 502,
'95, 0.2, None, 1.5': 521,
'100, 0.2, None, 1.5' : 67,
'105, 0.2, None, 1.5' : 66,
'110, 0.2, None, 1.5' : 510,
'115, 0.2, None, 1.5' : 517,
'120, 0.2, None, 1.5' : 66,
'125, 0.2, None, 1.5' : 67,
'130, 0.2, None, 1.5' : 67,
'135, 0.2, None, 1.5' : 67,
'140, 0.2, None, 1.5' : 67,
'145, 0.2, None, 1.5' : 67,
'150, 0.2, None, 1.5' : 67,
'155, 0.2, None, 1.5' : 67,
'160, 0.2, None, 1.5' : 67,
'165, 0.2, None, 1.5' : 67,
'170, 0.2, None, 1.5' : 67,
'175, 0.2, None, 1.5' : 67,
'180, 0.2, None, 1.5' : 67,
'185, 0.2, None, 1.5' : 67,
'190, 0.2, None, 1.5' : 67,
'195, 0.2, None, 1.5' : 67,
'200, 0.2, None, 1.5' : 507,
'205, 0.2, None, 1.5' : 515,
'210, 0.2, None, 1.5' : 511,
'215, 0.2, None, 1.5' : 517,
'220, 0.2, None, 1.5' : 67,
'225, 0.2, None, 1.5' : 67,
'230, 0.2, None, 1.5' : 67,
'235, 0.2, None, 1.5' : 67,
'240, 0.2, None, 1.5' : 67,
'245, 0.2, None, 1.5' : 67,
'250, 0.2, None, 1.5' : 509,
'255, 0.2, None, 1.5' : 516,
'256, 0.2, None, 1.5' : 513}
Found the bug! I am so glad you caught this... The issue was indeed with the convolution. Using:
w = int(0.1 * sf)
idx_sum = np.convolve(idx_sum, np.ones(w), mode='same') / w
instead of
w = int(0.1 * sf)
idx_sum = np.convolve(idx_sum, np.ones(w) / w, mode='same')
solves the bug and gives similar results regardless of the sampling rate. You can see an example in the notebook in https://github.com/raphaelvallat/yasa/pull/55
I need to do more tests on this and I'll try to release a new version ASAP. My understanding is that the two attractors you describe correspond to how many thresholds had to be exceeded: ~510 events = at least one of the two methods above threshold, ~67 events = both methods above threshold.
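As a rough sketch of that interpretation (toy boolean arrays, not real detector output):

```python
import numpy as np

# Hypothetical per-sample decisions for two active thresholds (made-up values).
above_corr = np.array([0, 1, 1, 1, 0, 1, 1, 0, 0, 1])
above_rms  = np.array([0, 0, 1, 1, 0, 0, 1, 1, 0, 0])

idx_sum = above_corr + above_rms   # how many thresholds each sample passes

union = np.count_nonzero(idx_sum >= 1)         # at least one threshold: liberal
intersection = np.count_nonzero(idx_sum == 2)  # both thresholds: strict

print(union, intersection)  # 7 3
```

A floating-point wobble in the smoothed idx_sum can effectively flip a strict comparison between these two regimes, which would explain why the counts cluster at two very different values.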
I have run more tests, and indeed the error is caused by a floating-point error when using np.ones(w) / w instead of dividing by w after the convolution... such a sneaky bug! See below:
bad = np.convolve(idx_sum, np.ones(w) / w, mode='same')
good = np.convolve(idx_sum, np.ones(w), mode='same') / w
np.allclose(bad, good) # Returns True
# Now check the number of samples above soft threshold
np.where(bad > 2)[0].size # 19533
np.where(good > 2)[0].size # 12528
# Another way to fix is just to add a very small number to 2
np.where(bad > 2.00000001)[0].size # 12528
I'll work on a new release ASAP. This was the only function that used np.convolve, so the bug only concerns the yasa.spindles_detect function.
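For readers hitting similar issues, here is a minimal generic-numpy sketch (not YASA's code) of why the ordering matters: dividing the all-ones kernel before convolving makes every tap inexact (e.g. 0.1 + 0.2 > 0.3 in binary floating point), so a plateau that should sit exactly at an integer can land a hair above it and pass a strict > comparison. Dividing the integer-valued convolution afterwards keeps those plateaus exact.

```python
import numpy as np

w = 7
x = np.full(50, 2.0)  # a signal that sits exactly at the soft threshold

bad = np.convolve(x, np.ones(w) / w, mode='same')   # kernel taps are inexact
good = np.convolve(x, np.ones(w), mode='same') / w  # 14.0 / 7 == 2.0 exactly

# Both are numerically "equal" up to tolerance...
assert np.allclose(bad, good)
# ...but only the 'good' ordering guarantees the plateau is exactly 2.0,
# so a strict comparison like (values > 2) cannot spuriously fire there.
print(np.count_nonzero(good > 2))  # 0
```

Depending on w, the 'bad' ordering can leave some plateau samples a few ULPs above 2, which is exactly the kind of discrepancy shown in the 19533 vs 12528 counts above.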
New release of YASA: https://github.com/raphaelvallat/yasa/releases/tag/v0.6.1
Perfect, thanks! Seems to be fixed now indeed :)
On a related note: would you expect the results to change significantly when applying different high-pass (HP) filters? Shouldn't the spindles be detected in the spindle range only?
I've seen that setting the HP cutoff to higher values yields far more spindle events. How should I interpret this? I do understand that in a natural setting you'll never see anything higher than a 0.5 Hz high-pass (especially in N3, as in this example); I'm just trying to understand the factors that affect the detected spindles.
Found in N3
HP LP n_events
{'0.0, 100': 39,
'0.1, 100': 39,
'0.2, 100': 39,
'0.3, 100': 39,
'0.4, 100': 40,
'0.5, 100': 39,
'0.6, 100': 39,
'0.7, 100': 38,
'0.8, 100': 38,
'0.9, 100': 39,
'1.0, 100': 40,
'1.1, 100': 43,
'1.2, 100': 45,
'1.3, 100': 50,
'1.4, 100': 54,
'1.5, 100': 61,
'1.6, 100': 64,
'1.7, 100': 67,
'1.8, 100': 68,
'1.9, 100': 74,
'2.0, 100': 80,
'2.1, 100': 95,
'2.2, 100': 107,
'2.3, 100': 116,
'2.4, 100': 131,
'2.5, 100': 148,
'2.6, 100': 158,
'2.7, 100': 172,
'2.8, 100': 185,
'2.9, 100': 203}
Found in N2
HP LP n_events
{'0.0, 100': 813,
'0.1, 100': 813,
'0.2, 100': 814,
'0.3, 100': 813,
'0.4, 100': 813,
'0.5, 100': 814,
'0.6, 100': 812,
'0.7, 100': 816,
'0.8, 100': 815,
'0.9, 100': 816,
'1.0, 100': 824,
'1.1, 100': 832,
'1.2, 100': 840,
'1.3, 100': 846,
'1.4, 100': 853,
'1.5, 100': 863,
'1.6, 100': 871,
'1.7, 100': 886,
'1.8, 100': 892,
'1.9, 100': 900,
'2.0, 100': 916,
'2.1, 100': 932,
'2.2, 100': 950,
'2.3, 100': 964,
'2.4, 100': 973,
'2.5, 100': 986,
'2.6, 100': 1006,
'2.7, 100': 1018,
'2.8, 100': 1033,
'2.9, 100': 1038}
Hi @skjerns,
First, the data is filtered between the frequencies defined in freq_broad (default = 1-30 Hz).
Then, the sigma power relative to freq_broad is calculated and used as a detection threshold; high-pass filtering may therefore change (increase) the relative power value here.
Next, the moving correlation threshold also uses the broadband signal. Here again, I would expect that a highpass filter > 1 Hz would increase the correlation between the sigma-filtered and the broadband signal.
By contrast, the moving RMS threshold does not use the broadband signal, so the high-pass filter should have no impact there.
In summary, two of the three thresholds rely on the broadband signal. I think that removing the low frequencies from the broadband signal may increase the relative sigma power and the moving correlation, and therefore result in a more liberal detection. I expect the high-pass filter to have little or no impact on the moving RMS threshold.
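The relative-power effect can be sketched with simple band-power arithmetic (toy synthetic signal and band edges chosen for illustration; not YASA's exact computation, which works on short windows): removing low-frequency power from the broadband denominator necessarily raises the sigma ratio.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(42)
fs = 256
t = np.arange(0, 30, 1 / fs)
# Toy N3-like signal: strong slow wave + weak spindle-band oscillation + noise.
x = (50 * np.sin(2 * np.pi * 1.0 * t)
     + 5 * np.sin(2 * np.pi * 13 * t)
     + rng.standard_normal(t.size))

f, psd = signal.welch(x, fs, nperseg=4 * fs)

def bandpower(fmin, fmax):
    """Approximate band power by summing PSD bins in [fmin, fmax]."""
    sel = (f >= fmin) & (f <= fmax)
    return psd[sel].sum()

sigma = bandpower(11, 16)
rel_broad = sigma / bandpower(1, 30)  # broadband includes the slow wave
rel_hp = sigma / bandpower(3, 30)     # "high-passed" broadband excludes it

# Dropping low-frequency power from the denominator inflates relative sigma power.
print(rel_hp > rel_broad)  # True
```

The same mechanism makes a rel_pow threshold of 0.2 easier to reach after an aggressive high-pass, consistent with the monotonic rise in event counts in the tables above.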
Hope this makes sense!
Thanks, Raphael
Thanks for creating such a cool tool!
I have a comprehension question: intuitively, downsampling from 256 to 128 Hz should not reduce the spindle count significantly. However, in my case it goes from 69 detected spindles down to 38 (here limited to N3).
Is this to be expected? Which results should be trusted?
[Figures: detection results at 256 Hz and at 128 Hz]
Additionally, the spindle counts do not add up when analyzing N2+N3 together versus analyzing N2 and N3 separately. Is this due to different thresholds being computed for different segments? Or due to spindles being found on the edge between two windows?
Example: