synchon closed this pull request 5 months ago
PR Preview Action v1.4.7: Preview removed because the pull request was closed. 2024-03-13 08:50 UTC
I think I have an issue with this PR.
1) How is the `BOLD_mask` kept even before the PR? I checked the `fMRIPrepConfoundRemover` and all we do is add it to the `extra_input`. But how is this kept? Is this actually working?
2) Not allowing the preprocessor to "create new data types" is a bit strong. Indeed, there's no difference between "new data types" and "helper data types".
> How is the `BOLD_mask` kept even before the PR? I checked the `fMRIPrepConfoundRemover` and all we do is add it to the `extra_input`. But how is this kept? Is this actually working?
Since the `input` is not copied for `extra_input`, it's kept; but if we copy it like we do for markers, it won't be kept.
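This hinges on Python's reference semantics: if `extra_input` is the same dict object as the stored data, mutations stick; once you copy, they don't. A minimal standalone illustration (toy dicts, not junifer's actual data object):

```python
import copy

# A toy "junifer data object": one data type plus its metadata.
data = {"BOLD": {"path": "/tmp/bold.nii.gz"}}

# Case 1: no copy -- extra_input IS the data object, just another name.
extra_input = data
extra_input["BOLD_mask"] = {"path": "/tmp/mask.nii.gz"}
assert "BOLD_mask" in data  # the addition is "kept"

# Case 2: deep-copy first, as markers do.
data2 = {"BOLD": {"path": "/tmp/bold.nii.gz"}}
extra_input2 = copy.deepcopy(data2)
extra_input2["BOLD_mask"] = {"path": "/tmp/mask.nii.gz"}
assert "BOLD_mask" not in data2  # the addition is lost
```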
> Not allowing the preprocessor to "create new data types" is a bit strong. Indeed there's no difference between "new data types" and "helper data types".
The reason is that `BOLD` or `T1w` would be a "data type" for me, but not `BOLD_mask`. But we are on the same page with our understanding.
> How is the `BOLD_mask` kept even before the PR? I checked the `fMRIPrepConfoundRemover` and all we do is add it to the `extra_input`. But how is this kept? Is this actually working?
>
> Since the `input` is not copied for `extra_input`, it's kept; but if we copy it like we do for markers, it won't be kept.
We could have the `extra_input` be returned from the `preprocess` method to make it explicit, though. What do you think?
Check this out:

This is how we "add" the `BOLD_mask`:

But this is how we treat the input in the `_fit_transform` function:
So basically we have `out`, which is `input`, which is also `extra_input`. We then `pop` the type, modify the `extra_input` and re-add the type to `out`.
This is madness. My madness, but madness still.
I don't know how to tackle this to make it proper.
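The flow described above can be sketched as a standalone toy (the function bodies, dict shapes, and return values are simplified stand-ins, not junifer's actual code):

```python
def preprocess(input_, extra_input):
    # Mutates extra_input as a side effect to "add" BOLD_mask.
    extra_input["BOLD_mask"] = {"path": "computed-mask"}
    input_["data"] = "cleaned"
    return "BOLD", input_

def _fit_transform(input_):
    out = input_                   # out IS input_ -- the same dict
    data = out.pop("BOLD")         # split off the type being processed
    dtype, processed = preprocess(data, extra_input=out)  # out doubles as extra_input
    out[dtype] = processed         # re-add the type to out
    return out

result = _fit_transform({"BOLD": {"data": "raw"}})
assert "BOLD_mask" in result and result["BOLD"]["data"] == "cleaned"
```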
Problems we currently have:

1) We are splitting the `input`, which is also the `out` variable, in two for no reason.
2) We are giving the `preprocess` function all the data (`input`), even if it does not need it.
3) We are "adding" new "data types" by modifying the `extra_input` variable.
A more declarative way would be to:

1) Only pass the `input` and the relevant `extra_input` to the `preprocess` function (not all the data).
2) Use the returned values to update the `out`. Maybe by getting a dictionary of key/value pairs?
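The two points could map to a signature like this (purely illustrative names and shapes, not the PR's final API):

```python
from typing import Optional

def preprocess(input_: dict, extra_input: Optional[dict] = None) -> dict:
    """Process one data type's dict, receiving only the relevant extras.

    Returns a dict of key/value pairs with which the caller updates `out`,
    instead of mutating shared state behind the caller's back.
    """
    processed = dict(input_, data="cleaned")
    return {"BOLD": processed, "BOLD_mask": {"path": "computed-mask"}}

out = {"BOLD": {"data": "raw"}}
out.update(preprocess(out["BOLD"]))  # declarative: returned pairs update out
```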
> A more declarative way would be to:
>
> - Only pass the `input` and the relevant `extra_input` to the `preprocess` function (not all the data)
> - Use the returned values to update the `out`. Maybe by getting a dictionary of key/value pairs?

I agree with the steps and it's similar to what I proposed in the earlier comment.

- In the beginning of `_fit_transform`, we could copy the `input` to `out` like we do for DataReader so that we don't lose anything.
- Then from `preprocess` we return the first value as the `input` dict (as we do now).
- The second value from `preprocess` could be a "helper data type(s)" dict or `None`, and we check for it in `_fit_transform`. If it is `None`, we don't do anything; else we update `out`.

Perfect!
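Those three steps could look roughly like this (a hypothetical sketch with simplified names, not the actual implementation in the PR commits):

```python
import copy

def preprocess(input_, extra_input=None):
    """Return the processed input dict plus an optional helper dict."""
    processed = dict(input_, data="cleaned")
    helpers = {"BOLD_mask": {"path": "computed-mask"}}  # or None
    return processed, helpers

def _fit_transform(input_, on="BOLD"):
    out = copy.deepcopy(input_)      # step 1: copy first, so nothing is lost
    processed, helpers = preprocess(out[on])
    out[on] = processed              # step 2: first return value is the input dict
    if helpers is not None:          # step 3: only update when helpers exist
        out.update(helpers)
    return out

result = _fit_transform({"BOLD": {"data": "raw"}, "T1w": {"data": "anat"}})
assert result["BOLD"]["data"] == "cleaned"
assert "BOLD_mask" in result and "T1w" in result
```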
@fraimondo The latest commits should implement this.
Attention: Patch coverage is 96.96970%, with 1 line in your changes missing coverage. Please review.
Project coverage is 88.95%. Comparing base (6ff3fd3) to head (5668e63).
Tests are not passing yet, but the code is OK from my side.
It was a network issue with downloading assets, nothing on our side. Let's wait and see.
This PR refactors the `preprocess` method of preprocessors to not return the "data type" and to only return the preprocessed input data. Having to return the "data type" forced a concrete preprocessor to operate only on a single "data type" (like `BOLDWarper`, introduced in #267) and did not allow the `on` parameter to be exposed to the user (as requested in #301). If a concrete preprocessor adds a new "data type" like `BOLD_mask` to the "junifer data object", as `fMRIPrepConfoundRemover` (introduced in #111) does, it will be handled as usual with no changes.

A preprocessor should not create new "data types" (which was allowed earlier and hence the restriction) but only create and add "helper data types" like `BOLD_mask`, which happens via the `extra_input` of the `preprocess` method.