nipreps / dmriprep

dMRIPrep is a robust and easy-to-use pipeline for preprocessing of diverse dMRI data. The transparent workflow dispenses with manual intervention, thereby ensuring the reproducibility of the results.
https://www.nipreps.org/dmriprep
Apache License 2.0

ENH: Data-driven b0 identification tool #107

Closed oesteban closed 4 years ago

oesteban commented 4 years ago

Extracted from #29. There is a second portion of that PR that should be considered in the context of the number-of-shells estimator.
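
As a rough sketch of the idea being proposed (b=0 volumes carry much higher signal than diffusion-weighted volumes, so per-volume signal statistics can flag them without trusting the b-value table), something along these lines would work. Function name, default threshold, and details are illustrative and not necessarily the merged implementation in dmriprep/utils/vectors.py:

```python
# Minimal sketch of a data-driven b=0 detector (illustrative only).
# The per-volume median signal within a brain mask is z-scored and
# thresholded; bright outlier volumes are flagged as likely b=0.
import nibabel as nb
import numpy as np


def b0mask_from_data(dwi_file, mask_file, z_thres=3.0):
    """Return a boolean array flagging likely b=0 volumes in a 4D dMRI series."""
    data = np.asanyarray(nb.load(dwi_file).dataobj)              # (x, y, z, t)
    brainmask = np.asanyarray(nb.load(mask_file).dataobj) > 0.5
    # Median (not mean) signal per volume, for robustness to outlier voxels
    signal = np.median(data[brainmask], axis=0)                  # (t,)
    zscores = (signal - signal.mean()) / signal.std()
    return zscores > z_thres
```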

pull-assistant[bot] commented 4 years ago
Score: 0.44

Best reviewed: with all changes


Optimal code review plan (4 commits squashed)

Squashed 4 commits:

- enh: add a data-driven *b = 0* estimator
- enh: focus the PR in only one function, add test
- fix(test): the high-b array had too large scale, reducing to preempt f...
- enh: use median for robuster calculation

Powered by Pull Assistant. Last update ad814d9 ... 71973b4. Read the comment docs.

pep8speaks commented 4 years ago

Hello @oesteban, Thank you for updating!

Cheers! There are no style issues detected in this Pull Request. 🍻 To test for issues locally, `pip install flake8` and then run `flake8 dmriprep`.

Comment last updated at 2020-04-28 00:33:20 UTC
oesteban commented 4 years ago

> Would it be possible to write a test for this with a multishell dataset where the B0s can often have variable gradient values (e.g. 0, 5, 10, 50)?

It would if the intent were to optimize the threshold of the algorithm to ensure it works with a dataset like that. The proposal tests just whether the manipulation of arrays is appropriate (dimensions, etc., i.e., the formal parameters). And that is 100% more tests than there were before...

Happy to take on more tests in a separate PR, with more scientific content as you suggest.
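
To illustrate what such a purely formal test might look like (hypothetical names, assuming a signal-based detector like the sketch earlier in this thread), it only needs a tiny synthetic dataset and checks on shapes, dtype, and flagged indices:

```python
# Hypothetical shape/flag test (illustrative; not the PR's actual test).
# Builds a tiny 4D dataset with two obviously bright "b=0" volumes and
# checks only the formal behaviour of the detector sketched above.
import nibabel as nb
import numpy as np


def test_b0mask_from_data(tmp_path):
    dwi = np.full((5, 5, 5, 12), 300.0)
    dwi[..., (0, 6)] = 1300.0                    # two bright "b=0" volumes
    mask = np.ones((5, 5, 5), dtype=np.uint8)

    dwi_file = tmp_path / "dwi.nii.gz"
    mask_file = tmp_path / "mask.nii.gz"
    nb.Nifti1Image(dwi, np.eye(4)).to_filename(str(dwi_file))
    nb.Nifti1Image(mask, np.eye(4)).to_filename(str(mask_file))

    # Lower threshold than the sketch's default: with 2 of 12 volumes
    # being bright, the z-score of the b=0 group is only about 2.2.
    b0mask = b0mask_from_data(str(dwi_file), str(mask_file), z_thres=2.0)
    assert b0mask.shape == (12,)                 # one flag per volume
    assert b0mask.dtype == bool
    assert list(np.nonzero(b0mask)[0]) == [0, 6]
```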

dPys commented 4 years ago

> Would it be possible to write a test for this with a multishell dataset where the B0s can often have variable gradient values (e.g. 0, 5, 10, 50)?
>
> It would if the intent were to optimize the threshold of the algorithm to ensure it works with a dataset like that. The proposal tests just whether the manipulation of arrays is appropriate (dimensions, etc., i.e., the formal parameters). And that is 100% more tests than there were before...
>
> Happy to take on more tests in a separate PR, with more scientific content as you suggest.

"And that is 100% more tests than there were before..." Very true.

What about a case where there are nearly as many B0's as DWI volumes? It would be very unusual to encounter, but it's certainly possible and could skew a purely z-score-based identification.

oesteban commented 4 years ago

> but it's certainly possible and could skew a purely z-score-based identification

I'd say that's certain. I'm just rescuing your valuable code here :D - I think this function can be useful to flag anomalies. That is to say, some false positives may be allowable.
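
For the record, the skew is easy to quantify under a simple two-level signal model: with k b=0 volumes out of n, the z-score of the b=0 group is sqrt((n - k) / k), so it collapses toward 1 as the proportion of b=0 volumes grows (hypothetical numbers below):

```python
# Quick numerical illustration of the skew (hypothetical numbers): the
# z-score of a group of k identical bright volumes among n shrinks as
# sqrt((n - k) / k), regardless of how bright the b=0 signal actually is.
import numpy as np

for k in (2, 8, 16):
    signal = np.r_[np.full(k, 1300.0), np.full(32 - k, 300.0)]
    z_b0 = (signal[0] - signal.mean()) / signal.std()
    print(f"{k:2d}/32 b=0 volumes -> z-score of the b=0 group = {z_b0:.2f}")
# prints approximately 3.87, 1.73 and 1.00 -- a fixed threshold like 3.0
# would miss every b=0 volume once they make up a quarter of the series.
```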

codecov[bot] commented 4 years ago

Codecov Report

Merging #107 into master will increase coverage by 0.27%. The diff coverage is 88.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #107      +/-   ##
==========================================
+ Coverage   51.74%   52.01%   +0.27%     
==========================================
  Files          20       20              
  Lines        1206     1215       +9     
  Branches      160      160              
==========================================
+ Hits          624      632       +8     
- Misses        570      571       +1     
  Partials       12       12              
| Impacted Files | Coverage Δ |
| --- | --- |
| dmriprep/utils/vectors.py | 93.10% <88.88%> (-0.23%) ⬇️ |

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update b9c20c4...71973b4.

oesteban commented 4 years ago

I'll go ahead and merge this as a base for future developments. At this point of the project, I feel it is more important to get @dPys' work merged, even if the code is not deeply vetted, than to stick with rigorous code review and perhaps block progress.