lina-usc / pylossless

🧠 EEG Processing pipeline that annotates continuous data
https://pylossless.readthedocs.io/en/latest/
MIT License
20 stars 8 forks source link

Maint fixes #164

Closed scott-huberty closed 1 month ago

scott-huberty commented 1 month ago

This fixes a few things and we'll see if the CI's tell us whether we need to fix a few more.

  1. As of yesterday, torch 2.4 was released, and a new FutureWarning was being triggered in torch by MNE-ICAlabel. I've submitted a fix to MNE-ICALabel already, but in the meantime I'm just filtering out the warning in our tests. Once MNE-ICALabel .7 is released we can remove the pytest.mark.filterwarnings
  2. One of our tests was rightfully throwing a warning because we were downsampling our test file with a non-integer sampling frequency to an integer sampling frequency. I'm honestly surprised our tests weren't failing because of this before! (EDIT: this error was being filtered in our pytest.ini file, but for some reason it is not being respected on the CI's).
  3. Removed some manual installs of openneuro-py and torch since those packages are now listed in requirements_testing.txt
  4. We currently have a pytest.mark.xfail on test_TopoViz, because it was failing due to some issue with chrome driver. The failing test also causes the CI's to hang for a really long time. In the mean time I'm marking this test to be skipped, until we can fix it.
christian-oreilly commented 1 month ago

@scott-huberty I am not sure what to think of _check_sfreq(). It is the first time that I notice this function. I suppose it got added during some bug fix? It seems like a MNE issue rather than a PyLossless issue... We are not used to working with floating point sampling frequency because EGI works with integer sfreq, but many manufacturers give a calibrated sfreq which will normally not match exactly to an integer and can vary slightly from one amp to the next. Normally, the mapping between annotations (in seconds) and events (in samples) for epoching should work as long as the time specified in annotations fit with integers multiple of the sampling frequency and numerical error issues (e.g., due to roundoff) should probably be dealt by MNE rather than us since they chose to support floating point sfreq. Just my two cents. You may be aware of subtlety motivating the need for _check_sfreq() that I am not.

scott-huberty commented 1 month ago

@christian-oreilly It's been a long time but If I recall we worked on this together in a pair programming session during my PhD. What you said above mostly matches up with my understanding of the problem. here are the related issues/PR's

Here is my understanding of what got us here:

  1. The test file we used in test_simulated has a non-integer sfreq
  2. This causes interesting behavior when mapping epoch indices to annotation times. Basically if epoch indice 2 is bad.. after creating an annotation for that and then epoching the raw object, epochs 1-3 will actually be dropped.

I think we discussed, but never implemented, adding an option to the pipeline config or .run signature, like force_integer_sfreq, that the user could change to False if they are okay with the behavior described above.

scott-huberty commented 1 month ago

As far as MNE, I suppose you are right that an issue could be submitted there, but I'm not sure how high priority it will be for them.

christian-oreilly commented 1 month ago

Yeah, I remembered the general issue that this can cause, I just did not remember the _check_sfreq() function and the fact that we solved this by systematically resampling to an integer frequency... which is a lossy operation. Anyway, we can revisit this down the line if someone raises an issue related to the use of the pipeline with floating-point frequencies. For now, to the extent that this approach fixes the issues for the tests, I think it is low priority and fine as-is.

scott-huberty commented 1 month ago

Sounds good 🙏 @christian-oreilly