commontk / CTK

A set of common support code for medical imaging, surgical navigation, and related purposes.
https://commontk.org
Apache License 2.0
855 stars 490 forks source link

ENH: Virtualize DICOM database #1154

Closed pieper closed 9 months ago

pieper commented 11 months ago

This generalizes the ctkDICOMDatabase code to ease hard-coded restrictions about DICOM files always being on disk. Now the schema includes a URL field of the SQLite database so certain file-specific operations are only performed on filePaths while URL are handled independently. Entries in the Images table of the database may now have either a URL or a fileName or both and new accessors are provided to get URLs at the series and instance level.

Also this contains a fix to the ctkDICOMTagCache so that tags are always stored internally as uppercase hex, so a string like "0010,000d" will always be turned into "0010,000D" for storage in the database. Both forms can be used to query tags. This avoids the situation where some cache entries were duplicated because different code used either upper or lower case to refer to the same tag. Using only upper case should improve storage use and improve performance by avoiding unneeded cache misses.

pieper commented 10 months ago

This should be merged by squashing so the commit message will be fixed.

jcfr commented 9 months ago

Proposed commit message for squash & merge integration:

ENH: Update DICOM Database with URL Support and Tag Cache Optimization

This generalizes the ctkDICOMDatabase code to ease hard-coded restrictions about
DICOM files always being on disk. Now the schema includes a `URL` field of the
SQLite database so certain file-specific operations are only performed on filePaths
while URL are handled independently. Entries in the `Images` table of the database
may now have either a `URL` or a `fileName` or both and new accessors are provided
to get `URL`s at the series and instance level.

Schema version is updated from version `0.7.0` to `0.8.0`.

Also this contains a fix to the `ctkDICOMTagCache` so that tags are always stored
internally as uppercase hex, so a string like "0010,000d" will always be turned
into "0010,000D" for storage in the database. Both forms can be used to query
tags. This avoids the situation where some cache entries were duplicated because
different code used either upper or lower case to refer to the same tag. Using
only upper case should improve storage use and improve performance by avoiding
unneeded cache misses.

Co-authored-by: Andras Lasso <lasso@queensu.ca>
jcfr commented 9 months ago

@Punzo Do you have any suggestion related to naming and/or approach that would make rebasing of https://github.com/commontk/CTK/pull/1142 easier ?

Punzo commented 9 months ago

@Punzo Do you have any suggestion related to naming and/or approach that would make rebasing of #1142 easier ?

I have already had a videocall with Steve and we agreed on this infrastructure. Once it will be merged, then I will rebase my PR. The idea would be that the url variable values would be like:

protocol+infrastructure://information

examples can be: A) how I will store my "url" for DIMSE in CTK: dimse+ctk://connectionName/studyInstanceUID/seriesInstanceUID/SOPInstanceUID b) and for Steve would be: dicomweb+gcp://IP/studyInstanceUID/seriesInstanceUID/SOPInstanceUID etc...

@pieper please confirm if this is accurate.

Punzo commented 9 months ago

A) how I will store my "url" for DIMSE in CTK: dimse+ctk://connectionName/studyInstanceUID/seriesInstanceUID/SOPInstanceUID b) and for Steve would be: dicomweb+gcp://IP/studyInstanceUID/seriesInstanceUID/SOPInstanceUID

maybe we should add this in somewhere as docs, maybe in the description of https://github.com/commontk/CTK/pull/1154/files#diff-47b3a2aabeea5b49d2da30b70b2dfc96ec78db4fcae06d5cfea45bd0bbe07e67R194 ?

pieper commented 9 months ago

Thanks @Punzo for doublechecking.

I agree on the URL scheme description but since we don't rely on it in the CTK code at this point it's more of an application level thing to define. Yesterday @jcfr , @lassoan , and I talked about moving what I have now in the DICOMLogic repo into the a pure-python package under CTK organization (with a final name TBD) and using that as a place to define higher level usage. We plan to hash out details at Project Week 40. I'll start a project page for this.

lassoan commented 9 months ago

maybe we should add this in somewhere as docs, maybe in the description of

We can amend the documentation in your pull request (that will be the first use of this new database field in CTK). We could describe that dimse+ctk protocol is reserved for the CTK visual DICOM browser, while any other protocols are free to use. When other protocols emerge then we can add them to the documentation as well, to reduce the chance of unintended name clashes.

lassoan commented 9 months ago

@pieper I think it is ready for integration, just need to squash the commits and update the commit message as @jcfr suggested. Maybe also add @Punzo as coauthor.