Closed lassoan closed 5 years ago
Interesting. Thank you for looking and commenting on this issue. I cannot make time to work on it right now, but I appreciate your PR and will look at it a bit later. If you feel it is urgent and needs to be addressed today, please let me know!
Thank you, I really appreciate the fast response. We forked the repository for the project where we need this urgently, so it is fine if you need a couple of days to review this.
@lassoan the commit you mention does not introduce a confidence of 0.5; you probably meant 0.9.
The most important problem is that the main mechanism for grouping frames (examineFiles) was degraded. In contrast, examineFilesMultiseries is based on a much weaker hypothesis (it lumps together all series in the selection).
Unfortunately, I think the definition of "the main mechanism" will depend on the specific combination of equipment, data type, etc. examineFilesMultiseries exists because some manufacturers (and some specific sample datasets) put individual time frames of a 4D acquisition into separate series.
The intent of the commit you mention, if I remember correctly, was to downgrade the significance of the TemporalPositionIdentifier sort tag, since it was non-specific for the dataset reported by a user and prevented the results of more specific strategies from being loaded. When users do not toggle Advanced mode and do not know what to expect, they just load the loadable that is on top.
DICOM plugins in Slicer core should never use a confidence value of 1.0 (= absolute certainty), as it prevents more specific plugins in extensions from getting the highest priority.
I didn't know that, I should change it then.
Commit comment of 1f7b395 does not match what is done in the commit. Which one is correct? The comment or the code change?
I don't know which one is correct... It looks like the commit message would address your second concern, since it would just reduce all confidences by 0.1 and would not change the ordering of multivolume loadables by confidence.
That discussion aside, I am not sure I understand how your commit addresses concern 1 (and its description is not consistent with what the code does). The earlier commit downgraded TemporalPositionIdentifier, but just for the multi-series parse strategy. Your current commit does not change that, but downgrades TemporalPositionIdentifier for the main strategy.
Looking at the code again, I do not see how this condition could ever be satisfied:
because TemporalPositionIdentifier is not listed in the prescribedTags here: https://github.com/fedorov/MultiVolumeImporter/blob/1f7b395566cdfa09aecf68a429afff431657d57c/MultiVolumeImporterPlugin.py#L168
I have not looked at this code in a while, and I don't have enough time left to finish this...
Maybe in your commit https://github.com/fedorov/MultiVolumeImporter/commit/1f7b395566cdfa09aecf68a429afff431657d57c line 509 was removed unintentionally.
> Your current commit does not change that, but downgrades TemporalPositionIdentifier for the main strategy.
Removing the setting of the confidence means that the default confidence value (0.5) is used. Therefore, setting the confidence value to 0.9/1.0 is an upgrade, not a downgrade. I've restored the confidence value setting using the same logic that you introduced in examineFilesMultiseries (set it to 0.9/1.0 depending on whether tagName == 'TemporalPositionIdentifier'). I don't mind tuning the confidence value depending on 'TemporalPositionIdentifier'; I've just used the same logic for consistency. If you don't think it's needed, we can simply set it to 1.0.
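The 0.9/1.0 logic described above can be sketched as follows (a minimal illustration, not the actual plugin code; the helper name is made up here, and the values mirror what examineFilesMultiseries was described as doing):

```python
def confidence_for_sort_tag(tag_name):
    """Pick a loadable confidence based on the sort tag (sketch only).

    Mirrors the examineFilesMultiseries logic discussed above: the
    non-specific TemporalPositionIdentifier tag gets a slightly lower
    confidence than more specific sort tags. Note that per the
    discussion, Slicer core plugins should in practice stay below 1.0
    so that more specific extension plugins can outrank them.
    """
    if tag_name == 'TemporalPositionIdentifier':
        return 0.9
    return 1.0
```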
As we support more and more scanners/models/imaging protocols, there is an increased risk that a modification or new feature breaks existing functionality. We have not added any automatic tests for this importer because these 4D files are big and we are often not allowed to share them. What do you think about creating simplified DICOM files from scratch, with minimal content (dummy image data of very small size, and only the minimum set of tags that allows DICOM database import and multi-volume loading), and creating automatic tests using these?
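As a rough illustration of what such synthetic test data could look like: the grouping/sorting logic could be exercised on tiny per-frame tag dictionaries like the ones below (these are hypothetical stand-ins, not real DICOM encoding; an actual generator would write proper DICOM files, e.g. with pydicom):

```python
def make_synthetic_4d_frames(num_timepoints=3, num_slices=2, rows=4, columns=4):
    """Build a list of per-frame tag dictionaries mimicking a tiny 4D series.

    Sketch only: each "frame" carries just the few tags relevant for
    multi-volume grouping, plus a dummy zero-filled pixel buffer.
    """
    frames = []
    instance = 1
    for t in range(1, num_timepoints + 1):
        for s in range(num_slices):
            frames.append({
                'InstanceNumber': instance,
                'TemporalPositionIdentifier': t,
                'Rows': rows,
                'Columns': columns,
                'PixelData': bytes(rows * columns),  # rows*columns zero bytes
            })
            instance += 1
    return frames

def group_by_temporal_position(frames):
    """Group frames into time points by TemporalPositionIdentifier."""
    groups = {}
    for f in frames:
        groups.setdefault(f['TemporalPositionIdentifier'], []).append(f)
    return groups
```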
Oh, I see - I didn't know/forgot about the default 0.5!
Yes, it would definitely be appropriate to add tests and include realistic datasets. For PKModeling we have been working around the image size by cropping small sections of a real DCE multivolume. Perhaps we could explore something similar with the original DICOMs by cropping PixelData and updating Rows/Columns accordingly. This might be easier than composing the datasets from scratch, but I have not tested how that would work in practice. It would also be better if we had as valid as possible DICOM datasets, even if they are "phantoms". Let's get back to this after RSNA?
Yes, either building up a simplified data set or dumbing down an existing one would work (and neither of them is very easy).
> Let's get back to this after RSNA?
Sounds good. We can also chat there, if we happen to have time.
Let's also talk with some of the vendors at RSNA about hosting repositories of testing data. I have talked informally with some of them in the past, and there is some interest, but now may be the time when we can make some concrete progress.
> vendors at RSNA about hosting repositories of testing data
Are you thinking about (1) having a common repository or (2) having each vendor be responsible to setup a public server ?
Ideally, vendors would be using data pushed on that server (or up-to-date mirror) for their own testing, this would help to keep the dataset current.
That said, since each one of these vendors are (most likely) competing, I am curious to see how this will unfold.
Cc: @thewtex
@jcfr I was thinking of creating a repository of curated testing data that could be mirrored to high-availability / high-bandwidth servers that expose a standard protocol (specifically DICOMweb).
Probably that would take the form of a git-lfs canonical collection and a list of DICOMweb servers that would be auto-populated with the contents of the repository. Then any testing or demonstration code could access the data from whatever server is convenient. Client access should be through DICOMweb so we can query metadata, access individual frames, etc.
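As an illustration, a DICOMweb (QIDO-RS) metadata search is just an HTTP GET against a well-known URL layout, so client code stays trivial regardless of which mirror hosts the data. A small sketch of building such a query (the base URL below is a hypothetical placeholder):

```python
from urllib.parse import urlencode

def qido_studies_url(base_url, **filters):
    """Build a QIDO-RS study-level search URL (sketch only).

    QIDO-RS defines GET {base}/studies?{attribute=value&...} for
    searching studies; filters are sorted here only to make the
    resulting URL deterministic.
    """
    url = base_url.rstrip('/') + '/studies'
    if filters:
        url += '?' + urlencode(sorted(filters.items()))
    return url
```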
It's true the vendors are competing, but standards for data representation and access are generally considered "pre-competitive". Anyone willing to pay the storage and bandwidth charges can have equal rights to host the data.
I agree, let's try again with vendors. Let's brainstorm and put together a short list of potential candidates at RSNA. Regarding competition, the only such issue I can think of is a vendor considering a comprehensive dataset illustrating various platform-specific deviations from DICOM to be a strategic asset for providing robust analysis/visualization tools. But I hope there are more vendors that would simply appreciate a public dataset that everyone could contribute to, and which would make all vendors' products more robust.
Commit https://github.com/fedorov/MultiVolumeImporter/commit/1f7b395566cdfa09aecf68a429afff431657d57c has introduced inconsistent confidence values. This pull request addresses the most important regression, but there are some additional issues to discuss:
1. The main mechanism for grouping frames (examineFiles) was degraded. In contrast, examineFilesMultiseries is based on a much weaker hypothesis (it lumps together all series in the selection), yet sets a confidence value of 0.9 or 1.0.
2. DICOM plugins in Slicer core should never use a confidence value of 1.0 (= absolute certainty), as it prevents more specific plugins in extensions from getting the highest priority. Choose very high values (0.9, 0.95, 0.99) instead, but never 1.0.
3. The commit comment of https://github.com/fedorov/MultiVolumeImporter/commit/1f7b395566cdfa09aecf68a429afff431657d57c does not match what is done in the commit. Which one is correct, the comment or the code change?
This commit only addresses issue (1).