Closed jwodder closed 1 year ago
Patch coverage: 84.89
% and project coverage change: +0.39
:tada:
Comparison is base (
502bf49
) 81.48% compared to head (c03a5ae
) 81.87%.
:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Do you have feedback about the report comment? Let us know in this issue.
I just merged a small PR which introduced conflicts, please resolve.
@nipy/team-heudiconv I am currently facing the following blockers to applying type annotations:
I believe I may have found a bug, and I'm not sure how to address it. I've determined that the ses
argument to heudiconv.convert.prep_conversion()
is of type str | int | None
(See explanation below), yet here ses
is passed to sanitize_label()
, which only accepts str
s and will error if given an int
. What should be done about this?
Some of the code in heudiconv/convert.py
extracts fields from BIDS JSON (sidecar?) files, yet I cannot determine what types these fields are supposed to be. Specifically, the fields in question are:
CoilString
— Not found on https://bids-specification.readthedocs.ioEchoNumber
— Not found on https://bids-specification.readthedocs.ioEchoTime
— The BIDS spec's "Glossary of schema objects" lists four fields by that nameImageType
— Not found on https://bids-specification.readthedocs.ioJust what are these fields?
Explanation for why ses
appears to be of type str | int | None
: prep_conversion()
is only called at one point, here, where the ses
argument is set to session
, which is last assigned to here and here.
At the first location, session
is assigned from an element of a key returned by get_study_sessions()
, and all of these keys are of type StudySessionInfo
, and so session
must accept all types defined for StudySessionInfo.session
.
At this location in get_study_sessions()
, StudySessionInfo.session
is initialized from either session_
or session
.
session_
is a "key" from the values returned by get_extracted_dicoms()
and thus is of type Optional[int]
.session
is an argument to get_study_sessions()
, which is only called at two locations, both in heudiconv/main.py
.
workflow()
, session
comes from the argument of the same name to workflow()
, which is of type Optional[str]
.process_extra_commands()
, session
comes from the argument of the same name to process_extra_commands()
, which (despite what the docstring says) is of type Optional[str]
, as process_extra_commands()
is called with the same session
as passed to workflow()
and thus must include all of its types.Therefore, the session
argument to get_study_sessions()
must be of type Optional[str]
.
Therefore, the type of StudySessionInfo.session
is str | int | None
.
At the second location, session
is assigned from session_manual
when it's non-None
, and this variable is equal to the original session
argument passed to workflow()
and thus of type Optional[str]
.
Therefore, the session
variable passed as the ses
argument to prep_conversion()
will be either str | int | None
or str
(i.e., just str | int | None
), and so ses
in prep_conversion()
has the same types.
@nipy/team-heudiconv Also, side question: In this code:
is dcm.pub
supposed to be getattr(dcm, pub)
? That would seem to make more sense.
@nipy/team-heudiconv Yet another issue: In the following code:
the header.get_best_affine()
and header.get_data_shape()
methods are not present on all possible return types of nibabel.load()
; as far as I can tell, they are only present when nibabel.load()
returns an instance of one of the following types:
Nifti1Pair
Nifti1Image
AnalyzeImage
MGHImage
Exactly what type is nb_load(nifti_file)
expected to return here?
@nipy/team-heudiconv This should hopefully be the last typing issue I post: The prov_file
argument to heudiconv.dicoms.embed_metadata_from_dicoms()
can apparently only be str
, yet in this call to the function, the provided prov_file
argument can be either a str
or None
. What should be done about this?
@nipy/team-heudiconv Ping.
First of all thank you @jwodder for going on this types quest!
- I believe I may have found a bug, and I'm not sure how to address it. I've determined that the
ses
argument toheudiconv.convert.prep_conversion()
is of typestr | int | None
(See explanation below), yet hereses
is passed tosanitize_label()
, which only acceptsstr
s and will error if given anint
. What should be done about this?
ses
is in general a "label", not an "index" (see entity table for more info on those) so should be a str
. So I guess the point at which you figured it could come out as an int
should format it to str
one way or another.
Some of the code in
heudiconv/convert.py
extracts fields from BIDS JSON (sidecar?) files,
correct ! called "sidecar" metadata files in BIDS
yet I cannot determine what types these fields are supposed to be. Specifically, the fields in question are:
CoilString
— Not found on https://bids-specification.readthedocs.io
some times we might be "ahead of the standard" here and get a better luck searching directly in https://github.com/bids-standard/bids-specification . This time it leads to unfortunately never finished (still open) https://github.com/bids-standard/bids-specification/pull/425 . From https://github.com/bids-standard/bids-specification/pull/425/files#diff-37df75f7d2d341bed5815aad788fe7bff09374c335eb0c25fab0b7569e8c0a46R2 - just a str
EchoNumber
— Not found on https://bids-specification.readthedocs.io
hm, nothing among PRs. Not standardized within BIDS, but extracted by dcm2niix
❯ git grep EchoNumber
BIDS/README.md:| EchoNumber | | Only multi-echo series | D |
GE/README.md:Current GE software (DV26.0_R03_1831.b) running research multi-echo sequences create invalid DICOM images. The required public [EchoTime (0018,0081)](https://dicom.innolitics.com/ciods/mr-image/mr-image/00180081) attribute lists the shortest echo time for the series, rather than the actual echo time for the given DICOM image. The public tag [EchoNumber (0018,0086)](https://dicom.innolitics.com/ciods/mr-image/mr-image/00180086) reports `1` for all echoes. These limitations in GE's DICOM images disrupt dcm2niix's image conversion. Hopefully future product sequences will generate valid DICOM data. In the meantime, [issue 359](https://github.com/rordenlab/dcm2niix/issues/359) provides a kludge for image conversion.
console/nii_dicom_batch.cpp: fprintf(fp, "\t\"EchoNumber\": %d,\n", d.echoNum);
should be an int
. @pvelasco @tsalo - should we submit a PR for BIDS to formalize it?
EchoTime
— The BIDS spec's "Glossary of schema objects" lists four fields by that name
to say the truth I am not yet sure about plurality of definitions there, but it should be a float
. I don't think we should worry about possible list[float]
here
ImageType
— Not found on https://bids-specification.readthedocs.io
yet another one dcm2niix
extracts from DICOMs, but not standardized in BIDS (not sure if this one should, but also might): https://github.com/rordenlab/dcm2niix//blob/HEAD/BIDS/README.md#global-series-information
Make it list[str]
.
@nipy/team-heudiconv Also, side question: In this code:
is
dcm.pub
supposed to begetattr(dcm, pub)
? That would seem to make more sense.
well spotted!!!
Nifti1Pair
Nifti1Image
AnalyzeImage
MGHImage
Exactly what type is
nb_load(nifti_file)
expected to return here?
given our invocation of dcm2niix I believe it is exclusively Nifti1Image
The
prov_file
argument toheudiconv.dicoms.embed_metadata_from_dicoms()
can apparently only bestr
why? I think it could still be None
if e.g. with_prov
was False and not provided to convert
. So probably should just be str | None
?
@yarikoptic I still need a resolution for the issues mentioned in my top comment.
heudiconv.dicoms.group_dicoms_into_seqinfos()
: Ifgrouping == "custom"
andcustom_grouping
is a callable, the return value is the result of applyingcustom_grouping()
to some arguments; otherwise, the return value is something else. Normally, this could be annotated by using overloads andLiteral["custom"]
, but whenevergroup_dicoms_into_seqinfos()
is called in the heudiconv code,grouping
(if present) is always passed as a variable rather than a string literal, and soLiteral
is unusable here.
ATM it seems that no shipped along heuristic defines custom groupping
:
❯ git grep 'def grouping'
docs/heuristics.rst: def grouping(files, dcmfilter, seqinfo):
and it is somewhat documented at https://heudiconv.readthedocs.io/en/latest/heuristics.html?highlight=grouping#grouping-string-or-grouping-files-dcmfilter-seqinfo and was introduced in https://github.com/nipy/heudiconv/pull/359 . The expectation is that if that is the callable - we get the same dict[SeqInfo, list[str]]
(mapping from SeqInfo to the list of DICOM files) so we should cast result into that I guess. And flatten
seems to be ignored. FWIW, I am not aware of any heuristic which actually used that feature.
- The docstring for
heudiconv.dicoms.create_seqinfo()
states that the first argument is of typenibabel.nicom.dicomwrappers.MosaicWrapper
, yet here the supplied argument (obtained fromvalidate_dicom()
) is just anibabel.nicom.dicomwrappers.Wrapper
.
I think the docstring overspecified and it should be just a Wrapper
.
SeqInfo.series_id
is clearly meant to be astr
, yet there are several places inheudiconv/heuristics/example.py
where this field is compared against or assigned to anint
variable.
example.py also has an explicit str()
of its value... I think it is just generally inconsistent there, as since just an example (ie not really used) -- can be and is likely buggy. I am trying to wrap my head around there...
@yarikoptic Specifying that the custom_grouping
argument to group_dicoms_into_seqinfos()
must return dict[SeqInfo, list[str]]
doesn't solve the problem, as group_dicoms_into_seqinfos()
can return either that or dict[Optional[str], dict[SeqInfo, list[str]]]
depending on the value of flatten
— yet the function is called with a custom_grouping
and flatten=True
here, and it's called with a custom_grouping
and flatten=False
here. The only way for mypy to be sure what type it's getting would be if the values of flatten
, grouping
, and custom_grouping
were all statically known.
but it seems that we can't know them statically and moreover we "violate" it in case of flatten=False
whenever Callable
custom_groupping
is provided. So -- may be for now just comment out the @overload
s you defined for the group_dicoms_into_seqinfos
and add a comment that such clear/strict typing is not needed since flatten
is ignored in case of Callable
custom_groupping
which is expected to return the other kind?
FWIW -- just checked that it would not be a solution since then code which calls group_dicoms_into_seqinfos
would not be able to tell one or another...
I've tuned up that example.py
, fixed another minor bug (can go outside of this PR in principle) and also added that casting
- return custom_grouping(files, dcmfilter, SeqInfo)
+ return cast(Dict[SeqInfo, List[str]],
+ custom_grouping(files, dcmfilter, SeqInfo))
and mypy
seems to be happy for me locally, so may be cast'ing is just enough here ?
hm, on CI (where we have py 3.7) it still fails with following
heudiconv/convert.py:55: error: Cannot find implementation or library stub for
module named "type_extensions" [import]
from type_extensions import TypedDict
^
heudiconv/convert.py:55: note: See https://mypy.readthedocs.io/en/stable/running_mypy.html#missing-imports
heudiconv/convert.py:57: error: Unexpected keyword argument "total" for
"__init_subclass__" of "object" [call-arg]
class PopulateIntendedForOpts(TypedDict, total=False):
^
.tox/typing/lib/python3.7/site-packages/mypy/typeshed/stdlib/builtins.pyi:113: note: "__init_subclass__" of "object" defined here
Found 2 errors in 1 file (checked 46 source files)
typing: exit 1 (14.48 seconds) /home/runner/work/heudiconv/heudiconv> mypy heudiconv pid=1840
.pkg: _exit> python /opt/hostedtoolcache/Python/3.7.16/x64/lib/python3.7/site-packages/pyproject_api/_backend.py True setuptools.build_meta
typing: FAIL code 1 (58.[26](https://github.com/nipy/heudiconv/actions/runs/4812417000/jobs/8567639256?pr=656#step:5:27)=setup[43.78]+cmd[14.48] seconds)
evaluation failed :( (58.40 seconds)
may be we just should go to newer python for type checking?
cool, thanks for fixing! So we are all green -- take out of draft and let's invite others for possible review/feedback/training? @pvelasco @tsalo - interested in reviewing some type annotations ;) ?
@jwodder please check two last small commits where I have (I think) constrained typing a little more - let me know if may be you have reservations against that.
ok -- Let's go. Thank you again @jwodder for all this monumental work!
although indeed internal
I want next release to be minor
version boost at least due to the scale of this change, so labeling it is as such
:rocket: PR was released in v0.13.0
:rocket:
Closes #653.
Problems encountered so far with applying type annotations:
heudiconv.dicoms.group_dicoms_into_seqinfos()
: Ifgrouping == "custom"
andcustom_grouping
is a callable, the return value is the result of applyingcustom_grouping()
to some arguments; otherwise, the return value is something else. Normally, this could be annotated by using overloads andLiteral["custom"]
, but whenevergroup_dicoms_into_seqinfos()
is called in the heudiconv code,grouping
(if present) is always passed as a variable rather than a string literal, and soLiteral
is unusable here.heudiconv.dicoms.create_seqinfo()
states that the first argument is of typenibabel.nicom.dicomwrappers.MosaicWrapper
, yet here the supplied argument (obtained fromvalidate_dicom()
) is just anibabel.nicom.dicomwrappers.Wrapper
.SeqInfo.series_id
is clearly meant to be astr
, yet there are several places inheudiconv/heuristics/example.py
where this field is compared against or assigned to anint
variable.