Closed scott-huberty closed 1 year ago
BTW this is related to #53
Also - the only other time we call `variability_across_epochs` is in step 1: `flag_outlier_chs`, and this method doesn't even use `marks_array2flags`, which further suggests to me that what I pointed out in `flag_epoch_ic_sd1` is a bug:
In the Matlab Lossless pipeline ICs are not flagged in the same way that channels are, you are correct. Marking time intervals based on IC time series information should be as follows... given an 'IC x time x epoch' matrix, calculate the SD on the time dimension, which returns an 'IC x epoch' matrix. The criteria function then uses this 2D matrix to determine which epochs have too many ICs that have unusual SD values.
thank you, @jadesjardins !!!
I'm going to push some fixes to #53
BTW, I think we have been meaning to ping you to discuss the pipeline (i.e. we are unsure what to do in the Python version about flagging ICs, because we are not running the final ICA 3 times). Maybe we can discuss more in our call tomorrow.
In the Matlab Lossless pipeline ICs are not flagged in the same way that channels are, you are correct. Marking time intervals based on IC time series information should be as follows... given an 'IC x time x epoch' matrix, calculate the SD on the time dimension, which returns an 'IC x epoch' matrix. The criteria function then uses this 2D matrix to determine which epochs have too many ICs that have unusual SD values.
Seems compatible with our doc: https://github.com/lina-usc/pylossless/blob/main/docs/source/_images/pipeline_step_12.png
Let us know, @jadesjardins, if you see incompatibilities between what the pipeline is meant to do and what is documented on this page: https://pylossless.readthedocs.io/en/latest/implementation.html
These visual representations of each pipeline step are now the official "specs" for the pipeline (i.e., the code should be changed to follow the specs, and never the other way around, unless we purposefully decide to change the specs and update these visual representations accordingly).
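As a rough illustration of the IC-based epoch flagging described in the quoted comment (SD over the time dimension, then a per-epoch criterion), here is a minimal NumPy sketch. The function name `flag_epochs_by_ic_sd`, the median + k*IQR outlier rule, and the `outlier_k` / `flag_crit` parameters are assumptions made for illustration; this is not the pylossless or MATLAB Lossless implementation.

```python
import numpy as np


def flag_epochs_by_ic_sd(data, outlier_k=3.0, flag_crit=0.2):
    """Flag epochs in which too many ICs have an unusually high SD.

    data: array of shape (n_ics, n_times, n_epochs), i.e. 'IC x time x epoch'.
    outlier_k, flag_crit: hypothetical thresholds chosen for this sketch.
    Returns a boolean array of shape (n_epochs,).
    """
    # SD over the time dimension -> 'IC x epoch' matrix.
    sd = data.std(axis=1)

    # Per-IC upper bound across epochs (assumed rule: median + k * IQR).
    med = np.median(sd, axis=1, keepdims=True)
    q25, q75 = np.percentile(sd, [25, 75], axis=1, keepdims=True)
    upper = med + outlier_k * (q75 - q25)

    # True where an IC is "unlike itself" in a given epoch.
    outliers = sd > upper

    # Fraction of outlying ICs per epoch; the epochs dimension is preserved.
    return outliers.mean(axis=0) > flag_crit


# Synthetic data: 10 ICs, 100 samples per epoch, 40 epochs.
rng = np.random.default_rng(0)
data = rng.standard_normal((10, 100, 40))
data[:, :, 5] *= 10.0  # make epoch 5 noisy for every IC

flagged = flag_epochs_by_ic_sd(data)
print(flagged.shape)
print(np.flatnonzero(flagged))
```

The key point is that the reduction over ICs happens last, so the result has one entry per epoch.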
The purpose of both `pipeline.flag_ch_sd` and `pipeline.flag_epoch_ic_sd1` is to flag `epochs` where too many channels/ICs are "unlike themselves", i.e. have standard deviations that are above a threshold. Yet....
In `flag_ch_sd`: we calculate the standard deviation across the time dimension (line 887). That leaves us with an array of size n_channels, n_epochs, which we pass into `marks_array2flags` (line 897).
https://github.com/lina-usc/pylossless/blob/4c3ae7108da82a1985f4548394d472dbbfb7db37/pylossless/pipeline.py#L883-L897
In `flag_epoch_ic_sd1`: we call `variability_across_epochs`, and inside that function we calculate the standard deviation across the epoch dimension:
https://github.com/lina-usc/pylossless/blob/4c3ae7108da82a1985f4548394d472dbbfb7db37/pylossless/pipeline.py#L1046-L1060
https://github.com/lina-usc/pylossless/blob/4c3ae7108da82a1985f4548394d472dbbfb7db37/pylossless/pipeline.py#L358-L359
That leaves us with an array of size n_ics, n_times, which we pass into `marks_array2flags` (line 1058).
That doesn't make sense to me. As I brought up in #45, if we want to flag `epochs`, we need to preserve the epochs dimension.
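To make the shape argument concrete, here is a small NumPy sketch of the two reductions. The array layout `(n_ics, n_times, n_epochs)` and the variable names are assumptions for illustration only; the actual pylossless code may order or name the dimensions differently.

```python
import numpy as np

rng = np.random.default_rng(0)
n_ics, n_times, n_epochs = 4, 50, 6

# Assumed layout for this sketch: (n_ics, n_times, n_epochs).
data = rng.standard_normal((n_ics, n_times, n_epochs))

# SD across the time dimension keeps the epochs dimension.
sd_over_time = data.std(axis=1)

# SD across the epoch dimension drops it, so epochs can no longer be flagged.
sd_over_epochs = data.std(axis=2)

print(sd_over_time.shape)    # (4, 6)  -> n_ics, n_epochs
print(sd_over_epochs.shape)  # (4, 50) -> n_ics, n_times
```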
The current code in `variability_across_epochs` would suggest that we want to flag `ics` the same way that we do for channels (the channel is above a quantile threshold in too many epochs.... I don't think the MATLAB Lossless pipeline flags ICs this way; it only flags them based off a reliable IC label that is non-brain... But maybe @jadesjardins remembers better).
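The difference between flagging epochs and flagging channels/ICs comes down to which axis of the outlier matrix is reduced. A toy sketch, where the mask values and the `flag_crit` threshold are made up for illustration:

```python
import numpy as np

# Hypothetical outlier mask, shape (n_signals, n_epochs):
# True where a channel/IC has an unusual SD in that epoch.
outliers = np.array([
    [False, True, False, True],
    [False, True, False, False],
    [False, True, True, True],
])
flag_crit = 0.5  # hypothetical fraction threshold

# Flag epochs: fraction of outlying signals within each epoch.
flagged_epochs = outliers.mean(axis=0) > flag_crit

# Flag signals: fraction of epochs in which each signal is outlying.
flagged_signals = outliers.mean(axis=1) > flag_crit

print(flagged_epochs.tolist())   # [False, True, False, True]
print(flagged_signals.tolist())  # [False, False, True]
```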