Closed Andesha closed 1 year ago
LGTM - @christian-oreilly can you review? You've got a better eye on the setup file.
EDIT: thanks @Andesha !
Leaving a note that I left a comment here as it could be easy to miss
After some discussion with @christian-oreilly -
we think we can make some changes that drop the hard dependencies for Iprogress
and Ipywidgets
entirely, and remove any need for that notch_filter if-statement that is tripping things up.
@Andesha If you give me push privileges to your fork I can push some commits directly.
@scott-huberty I think you should be allowed... if not, do you know where I can change it ?
@Andesha Maybe you're right - I'll try to push a commit and see.
If not you would need to add me as a collaborator to your fork. but i'll test pushing a commit first.
invite should be sent
Wow, there are a lot more issues with the readme than I realized.
mne.io.raw.set_montage
for the example file used in the readme, because the tutorial does not instruct the user to provide a montage to the config. and mne needs a montage when it calls raw.set_montage
. @Andesha I'm surprised the pipeline ran for you using this file.Unless I'm missing something so obvious, the problem is this file used a biosemi system yet the montage in that file does not fit any of the biosemi montages provided by MNE. And I don't have access to a custom montage file...
The EGI example is outdated. MNE bids supports reading EGI data and we don't need to use a convoluted work around with tempfile
.
We should only maintain our tutorials in docs/examples
.. Because these are built off jupyter notebooks and we can utilize CI to make sure that they continue to run without errors.
@christian-oreilly - I believe you wrote this tutorial at the beginning of our dev. Given that I can't even get it to work.
I suggest removing the open neuro example, and starting with a fresh example (ideally something already in mne.datasets.sample
or mne.datasets.misc
), and moving it to docs/examples
.
I can help do that in a separate PR. The most important part of this PR IMO is the setup fixes that @Andesha made and i'd like to prioritize getting that merged.
A couple changes -
We changed our import of tqdm
in pipeline.py
so that we don't need the ipywidgets
, Iprogress
dependencies.
I don't think think the pipeline uses scikit-learn
so I removed the dependency.
I tested the pipeline on my local (without the aforementioned packages installed in my environment), and everything worked.
I removed the tutorial from the README because the file used is a bad example. I'll start a fresh example in docs/examples.
I realized I put EDFlib
instead of EDFlib-Python
in the requirements and setup. I corrected this. EDFlib
is a broken package and not the one we want.
Everything else LGTM! Thank you @Andesha !!
I think we are good to merge.
Well, before merging this, I think we need to discuss the README. I am not so keen about you just deleting everything in there. Remember that the README is what is rendered as the homepage of the GitHub repo. Minimal instructions on installing and using the pipeline should be there. More advanced cases can be linked to the doc...
Also, did you move this doc somewhere else or you just deleted it? Yes, indeed, I wrote this. It was working fine, but it is no surprise that it would not work now since this was not part of a CI and the pipeline evolved significantly since then. It should be fixed (and maybe part of it moved elsewhere if we want to keep the readme shorter), but not just deleted.
(When I say "it should be fixed", of course I volunteer to do that since I wrote this originally so I'll have an easier time fixing it.)
Well, before merging this, I think we need to discuss the README
@christian-oreilly the current tutorial exists in docs/examples and can be seen here.
I don't think it is such a big deal to not have a usage tutorial on the GitHub repository page if it exists on the packages website. There is a lot of precedent for this: MNE-Python / MNE-bids / mne-bids-pipeline etc etc.
It's much easier to maintain these tutorials when they can be integrated to CI. This way, failing tutorials don't sit for months until someone points it out to us.
Tutorials in the docs
can be integrated in the CI. Tutorials in the readme can't be.
@christian-oreilly the current tutorial exists in docs/examples and can be seen here.
This looks very nice, @scott-huberty Thanks for looking into that.
I don't think it is such a big deal to not have a usage tutorial on the GitHub repository page if it exists on the packages website. There is a lot of precedent for this: MNE-Python / MNE-bids / mne-bids-pipeline etc etc.
I also don't think it is required. But you more than that has been removed, e.g., the installation bit (which I definitely think should remain). Also, although I definitely don't think a "tutorial" is needed on the front page, an example (as short as possible) showing how to use the package is generally welcome. This can be a 3-4 lines examples. And this example can be added with the test for CI validation. I think making it ridiculously easy to install and run a basic example is what should be targeted; the documentation is for the long explanations with all the gory details once the users are hooked (by the simplicity of the first approach).
No prob, let's circle back on a new Pr to fix that tutorial.
I'll return basic install instructions tomorrow, then let's merge?
P.s. Run the pipeline on a given file in 4 lines of code? I like your style, let's make that the target ;)
Yeah.. something like 1 - import 2 - get default config 3 - run pipeline given a paht 4 - start QC on the output from 3
Yeah.. something like
1 - import
2 - get default config
3 - run pipeline given a paht
4 - start QC on the output from
I don't know if doing that in 4 lines is realistic right now, feel free to push something up though (I would love to be wrong!).
Btw I think we always require the user to assign a montage to the config. we can probably handle this better, in some cases the channel locs may already be in the info structure of the raw
object. I'll bring it up in another ticket.
I don't know if doing that in 4 lines is realistic right now, feel free to push something up though (I would love to be wrong!).
This is a target, not a current status. The reason why the example was long is because the code was not there yet.
Btw I think we always require the user to assign a montage to the config. we can probably handle this better, in some cases the channel locs may already be in the info structure of the
raw
object. I'll bring it up in another ticket.
I am not sure this statement is accurate. I think that in the case of a BIDS dataset, the info about the montage is loaded along with the raw. There is no reason to have anything about the montage in the config in such a case.
This is a target, not a current status.
got it - then we are on the same page!
I am not sure this statement is accurate.
You are right, I was mistaken. We first look for a montage in raw.
I returned the pip install instructions. @christian-oreilly Are you okay with merging?
- I don't think think the pipeline uses
scikit-learn
so I removed the dependency.
We need it for the fastica
call I believe...
- I don't think think the pipeline uses
scikit-learn
so I removed the dependency.We need it for the
fastica
call I believe...
You are right - Sorry for undoing that.
@Andesha @scott-huberty Not a big issue (i.e., probably not necessary to act on this at this time), but this is not really a dependency of pyLossless, but rather a dependency of MNE (and it is part of the requirements for MNE: https://github.com/mne-tools/mne-python/blob/main/requirements.txt). I think it is better in general to let packages we use solve their own dependecies than percolating these into the requirements of pylossless.
@christian-oreilly yeah... but not everything in that requirements.txt
gets installed with pip install mne
. For example pandas
is in there but it is not a dependency of mne.
I think this is one of those cases where MNE uses the package for some functions but they don't include it in their own dependencies (to reduce technical debt). If you don't have it installed when you call their MNE function that uses it they return an error to tell you to install.
We can file a ticket with them but i doubt they will budge. I don't think sci-kit is a dependency of mne.
Sure. I still don't think we should trickle all sub-dependencies directly in pylossless, just the packages we use directly. The packages that gets call by other packages should be managed by these packages. E.g., fastica
is not a hard dependency of the pipeline; it depends on what is specified in the config file. Alternatively, if you like the idea to have a very inclusive install strategy, maybe using a pip install pylossless[all]
and pip install pylossless[minimal]
?
fastica
is not a hard dependency of the pipeline;
That's a very good point. But then again we specify fastica
in the default config.
EDIT: for the first ICA (used to flag epochs).
Yes, indeed. As I said, it is not a big issue. Maybe we can keep this dependency for this specific case because of what we set as default (or change the default config). I just wanted to make the point that I am not sure that, in general, we should take upon ourselves to manage dependencies from external packages, first because it is not our responsibility, second because these external packages may change their dependencies with time which means that we then need to keep up with future changes in dependencies of external packages.
That being said, I am not so clear about the up-to-date best practices in package install in python. Up to recently, I was not distinguishing the requirements.txt vs what packages get installed... and arguments I read for distinguishing these did not convinced me yet. Anyway, you are right that MNE does not install these by default, but only these: https://github.com/mne-tools/mne-python/blob/main/requirements_base.txt
Maybe we can keep this dependency for this specific case because of what we set as default (or change the default config)
yeah, I agree, keep it for now. I think the important thing to keep in mind is that @Andesha tried to run the pipeline with the provided instructions and config and couldn't.
. I just wanted to make the point that I am not sure that, in general, we should take upon ourselves to manage dependencies from external packages,
I agree!
That being said, I am not so clear about the up-to-date best practices in package install in python. Up to recently, I was not distinguishing the requirements.txt vs what packages get installed... and arguments I read for distinguishing these did not convinced me yet. Anyway, you are right that MNE does not install these by default, but only these: https://github.com/mne-tools/mne-python/blob/main/requirements_base.txt
me neither... I think @Andesha gave us his understanding (something like requirements.txt for developers, setup.py for managing hard dependencies).
This PR includes a few different fixes:
requirements.txt
to include all used packagessetup.py
to load dependencies as in #34pip install -e .
within thepylossless
directory after cloning now produces a working/runnable environmentThanks!