lina-usc / pylossless

🧠 EEG Processing pipeline that annotates continuous data
https://pylossless.readthedocs.io/en/latest/
MIT License

Automatically flag ICs based on their ICLabel #110

Open scott-huberty opened 1 year ago

scott-huberty commented 1 year ago

As discussed in Slack:

MATLAB lossless automatically rejected ICs based on their ICLabel classification if the ICLabel confidence metric was > 30%. If the metric was < 30%, the component's label was changed to other.

Scott: "Did MATLAB lossless reject ICs purely based on their ICLabel, or did it require replication across the 3 final runs?"

@Andesha: "We flag the replication, but did not reject them based on that."

@Andesha: "If it was 30% or less in its decision, we set it manually to 'other'." "To be clear: 30% or less in ANY of its decisions." "Other was always left in by default." https://github.com/BUCANL/EEG-IP-L/blob/master/code/scripts/s05_concat_data.htb#L532-L554

We can re-implement this in pyLossless.
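For reference, the MATLAB behavior described above could be sketched in Python roughly as follows. This is only a minimal sketch, not pyLossless code: the function name `matlab_style_flag`, the label strings, and the `(label, confidence)` input format are assumptions made for illustration, and it reads "30% or less" as applying to the confidence of the winning label.

```python
# Hypothetical sketch of the MATLAB lossless rule described above.
# Label names and data layout are assumptions, not the pyLossless API.
KEPT_LABELS = {"brain", "other"}  # assumed: never rejected automatically


def matlab_style_flag(ic_labels, threshold=0.30):
    """Return (final_labels, flagged_indices) for (label, confidence) pairs."""
    final_labels = []
    flagged = []
    for idx, (label, confidence) in enumerate(ic_labels):
        if confidence <= threshold:
            # Low confidence: the label is overwritten with "other" and kept.
            final_labels.append("other")
            continue
        final_labels.append(label)
        if label not in KEPT_LABELS:
            # Confident artifact label: the component is flagged for rejection.
            flagged.append(idx)
    return final_labels, flagged


labels, flagged = matlab_style_flag(
    [("brain", 0.95), ("muscle", 0.62), ("eog", 0.28), ("other", 0.45)]
)
print(labels)   # ['brain', 'muscle', 'other', 'other']
print(flagged)  # [1]
```

Note that the third component loses its original `eog` label in the output, which is the part of the behavior under discussion.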

Also, it would be nice to show the confidence value (between 0 and 100%) of a component's label in the Dash hover label.
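One way the hover text could be built is sketched below. The helper name `format_ic_hover` and the component name `ICA003` are hypothetical, and the sketch assumes the confidence is stored as a probability between 0 and 1; the resulting string could then be handed to whatever hover mechanism the dashboard uses.

```python
def format_ic_hover(ic_name, label, confidence):
    """Build hover text such as 'ICA003: muscle (62%)'.

    Hypothetical helper: assumes `confidence` is a probability in [0, 1].
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be a probability between 0 and 1")
    # The ':.0%' format spec multiplies by 100 and appends a percent sign.
    return f"{ic_name}: {label} ({confidence:.0%})"


print(format_ic_hover("ICA003", "muscle", 0.62))  # ICA003: muscle (62%)
```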

scott-huberty commented 1 year ago

I'm -0.5 on the pipeline changing labels assigned by ICLabel to other, though. Instead, I'd suggest:

  1. If the confidence in a label is < 30%, we don't reject it, but we leave its label the same as ICLabel designated it.

and maybe we can let the user set the threshold (default 30%) in the config.
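A user-settable threshold might look something like the sketch below. The config section `ica` and the key `ic_confidence_threshold` are hypothetical names chosen for illustration, not existing pyLossless config fields.

```python
# Hypothetical config lookup; the key names are assumptions for illustration.
DEFAULT_IC_CONFIDENCE_THRESHOLD = 0.30


def get_ic_threshold(config):
    """Read the IC confidence threshold from a config dict, with a default."""
    value = config.get("ica", {}).get(
        "ic_confidence_threshold", DEFAULT_IC_CONFIDENCE_THRESHOLD
    )
    value = float(value)
    if not 0.0 <= value <= 1.0:
        raise ValueError("ic_confidence_threshold must be between 0 and 1")
    return value


print(get_ic_threshold({}))                                          # 0.3
print(get_ic_threshold({"ica": {"ic_confidence_threshold": 0.5}}))   # 0.5
```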

EDIT: If the user wants to change the label, that's a different discussion. I am talking about our pipeline automatically changing a label from channel/muscle/brain to other.

christian-oreilly commented 1 year ago

I see no strong arguments to depart from the previous behavior. What has been proposed or what you are proposing seems a bit like a matter of taste and arbitrary if it is not validated and backed by some experimental data...

> If the confidence in a label is < 30%, we don't reject it, but we leave its label the same as ICLabel designated it.

Whether or not this is a good idea could probably be checked in EEG-IP by looking at the proportion of "others" that were not rejected at QC. It would be even better if we could track down the "others" that were set that way because of the 30% rule (if this information is preserved). I am not sure this investigation is worth the effort, and I would not consider it a high priority at this time, but without some check of the sort, the proposed changes seem a bit arbitrary to me.
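The check itself would be simple arithmetic, along the lines of the sketch below. The record layout (`label`, `rejected_at_qc`) is a hypothetical format, and the four example records are made-up toy values, not EEG-IP results.

```python
def other_kept_proportion(records):
    """Proportion of ICs labeled 'other' that were NOT rejected at QC.

    Returns None when no component is labeled 'other'.
    """
    others = [rec for rec in records if rec["label"] == "other"]
    if not others:
        return None
    kept = sum(1 for rec in others if not rec["rejected_at_qc"])
    return kept / len(others)


# Toy records for illustration only (not real data).
toy_records = [
    {"label": "other", "rejected_at_qc": False},
    {"label": "other", "rejected_at_qc": True},
    {"label": "brain", "rejected_at_qc": False},
    {"label": "other", "rejected_at_qc": False},
    {"label": "other", "rejected_at_qc": False},
]
print(other_kept_proportion(toy_records))  # 0.75
```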

scott-huberty commented 1 year ago

I also think this would be more effort than it is worth.

I see no strong argument for changing a label.

The important part is that if the confidence is < 30%, the pipeline doesn't flag the IC.

But why override ICLabel and change the label to other, instead of letting the user see the label that ICLabel actually gave?

Andesha commented 1 year ago

You would still be able to see the breakdown of what ICLabel was thinking on one of the QC figures... The motivation is more: "don't reject anything unless you're very sure, and don't call it an artifact unless you're very sure".

If you were to run the same data back through ICLabel you would get the original labels, it's a deterministic process.

scott-huberty commented 1 year ago

> You would still be able to see the breakdown of what ICLabel was thinking on one of the QC figures...

But the QC dashboard just reads the labels from a text file and colors accordingly. If we change a label to other before saving, the IC will have the other color in the dashboard.

> The motivation is more: "don't reject anything unless you're very sure, and don't call it an artifact unless you're very sure".

I agree. Then let's not programmatically change labels unless we are very sure that ICLabel was incorrect. Given that their algorithm has been peer reviewed twice now, are we sure that our judgment is better than theirs?

> If you were to run the same data back through ICLabel you would get the original labels, it's a deterministic process.

True, but this is harder to find and requires more work from the user.

AFAIK, what pylossless cares about is whether the confidence is < 30%. If so, don't add the IC to flags['ics']. We don't need to change any labels to do this. So why do it? It's less transparent (and not consistent with our own design philosophy).
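Something like the sketch below would do it. The function name `flag_ics`, the label strings, and the `{ic_name: (label, confidence)}` layout are assumptions for illustration rather than the actual pyLossless data structures; only the `flags['ics']` key comes from the discussion above.

```python
def flag_ics(ic_labels, threshold=0.30, keep_labels=("brain", "other")):
    """Flag confident artifact ICs without modifying any label.

    `ic_labels` maps an IC name to a (label, confidence) pair (assumed layout).
    """
    flags = {"ics": []}
    for ic_name, (label, confidence) in ic_labels.items():
        # Confidence below the threshold: leave the IC unflagged, label intact.
        if confidence < threshold:
            continue
        if label not in keep_labels:
            flags["ics"].append(ic_name)
    return flags


ic_labels = {
    "ICA000": ("brain", 0.95),
    "ICA001": ("muscle", 0.62),
    "ICA002": ("eog", 0.28),
}
print(flag_ics(ic_labels))  # {'ics': ['ICA001']}
print(ic_labels["ICA002"])  # ('eog', 0.28), the original label is still visible
```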

EDIT: I'm honestly surprised that I'm the odd one out here! 😂