Open hoechenberger opened 10 months ago
thank you for opening this. I think the docstring is confusing but technically not wrong:
None (default) will pick all channels
does imply that bad channels will be included. but the following sentence is the one that I think creates a misleading impression:
Note that channels in info['bads'] will be included if their names or indices are explicitly provided.
This is meant to say "if you're passing an explicit list of indices or strings to picks
, then note that we don't automatically exclude channels in .info['bads']
in that case" or in other words: "warning, you might accidentally overrule .info['bads']
if you're not careful about which channel names/indices you pass in to picks
"
That is definitely a mouthful and not easy to get from the docstring as it is currently written. So yes, we should try to make it less confusing / ambiguous.
But it also brings up for me a larger issue: as you know the default value of picks=None
gets interpreted differently in different parts of the codebase --- sometimes None
means "good data channels", sometimes "good and bad data channels", sometimes "good channels", and sometimes "good and bad channels" (AKA all channels). Also I think there are flavors like "good data channels except not ref-meg".
I think it would be a big improvement to both the user and developer experiences if the default in all of these cases was a string representing the flavor: i.e., in funcs/methods where we want the default to be "good data" channels, the default should be something like picks="good data"
instead of picks=None
. I.e., if you had seen that the default were picks="all data"
, and known from the docstring (or linked glossary entry?) how to interpret that, i.e., that "all" means "good and bad", then I think this also would have reduced the likelihood of confusion/mistake.
I think it would be a big improvement to both the user and developer experiences if the default in all of these cases was a string representing the flavor
+1 on getting rid of picks=None
in favor of something like picks="data"
. But another option that I expect will be easier to implement and work with existing parts of our code and use cases would be to use the pair of picks
and exclude
to achive "all data" for example as:
picks="data", exclude=()
and "good data" as:
picks="data", exclude="bads"
In other words, we can split up the bad inclusion/exclusion using the exclude
kwarg instead of packing it into picks
as "good data"
. exclude=None
or exclude="bads"
is already used a bunch of places in the codebase:
$ git grep "exclude=\"bads\"" mne | wc -l
170
$ git grep "exclude=None" mne | wc -l
23
IIRC this is what we did recently with the Spectrum-including-bads PRs. Getting these uses to work nicely / as expected with picks="all data"
when we already have exclude
later in the signature I expect to be a lot messier than 1) adding exclude=...
where we currently don't have it and 2) avoiding picks=None
everywhere (replaced by explicit picks=<str>, exclude=None | "bads
use).
... the other advantage of spltting in this way is things like picks="eeg", exclude=...
can work and we don't have to implement stuff like "good eeg"
and "all eeg"
etc.
+1 for @larsoner's proposal, on the assumption that we actually have an exclude
param in all the relevant places (I think we do but haven't checked). And even if not, it feels cleaner to just add exclude
where it's missing (in addition to changing the pick default to "all"
or "data"
) rather than adding handling for a bunch of new strings like "good data" to picks
I'm also +1 for tackling this with picks
and exclude
. Regarding the initial comment, I think we actually used to exclude all bads when picks=None
. When did this change?
I think we actually used to exclude all bads when picks=None. When did this change?
At least from a few years back when we added picks=<str>
support I think it could mean "all channels", "all good channels", "all data channels", or "all good data channels" and which one it meant varied from function to function.
I like @larsoner's proposal
@drammock .get_data() unfortunately doesn't have an exclude
parameter
on the assumption that we actually have an exclude param in all the relevant places (I think we do but haven't checked). And even if not, it feels cleaner to just add exclude where it's missing
.get_data() unfortunately doesn't have an exclude parameter
Indeed there are probably a number of places we'll want to add it. It should be easy enough to add them as we eradicate picks=None
-- we're going to be modifying those signatures anyway so it's easy to check for exclude
at the same time.
Proposed documentation enhancement
Currently, the doctoring for the
picks
parameter of.get_data()
reads:See e.g. https://mne.tools/dev/generated/mne.Epochs.html#mne.Epochs.get_data
I always assumed that bad channels are not included by default (
None
), as I didn't provide their names or indices. But that doesn't seem to be the case.The doctoring should be updated to explicitly state that by default, bads will be included.
This also suggests that we should take a look at our tutorials that make use of
.get_data()
and update them to passpicks="data"
or similar.We may also need to take a look at the decoding steps of MNE-BIDS-Pipeline, which use
.get_data()
, too.Edit: In most (all?) tutorials we run
inst.pick(..., exclude="bads")
, which avoids the issue mentioned above.