Closed PierreBou91 closed 1 month ago
Thank you for working on this. As I suspect that we might need some time to review and I was about to bring forward a new release of DICOM-rs, I propose that this is worked on after 0.6.0 is out. This should not stop me from publishing this enhancement in a subsequent minor release anyway.
Oh yeah sure, I think you're right ! I'll check in back here when the 0.6.0 is out.
1. Not tested on multiframe dicom: I am very unfamiliar with multiframe dicoms and I don't really know where to find or create proper ones. I was not able to test this commit with multiframe dicoms. The basic functionality should be preserved but even though, I'm not certain that this is the expected behavior (convert the nth frame of every multiframe dicom in the input). This is by far my biggest concern for this commit because there might be regressions and I don't know of an acceptable way to handle this since the use case (multiframe to image) is unclear to me.
DICOM WG4 has a data set with some example files, and the pydicom test files also have some files which are multi-frame. Running cargo test
on the project already downloads some of these via dicom-test-files
(see for example pydicom/color3d_jpeg_baseline.dcm
).
In practice, I can't be sure which behavior is more suitable, but we can either a) pick the first frame only, or b) print all frames into separate files with an extension to the file name, e.g. abc.dcm
to abc_0.png
, abc_1.png
, abc_2.png
,...
2.
Error
enum with onlyError::Other
variant: Initially I copied theError
enum fromdicom-storescu
required in thecheck_file
function return type that I also imported. But sinceError::Other
is the only variant, maybe should it just be removed altogether ?
Generally, we would go in the opposite direction from here: given this Error
type, we can slowly add more specific error variants and replace context calls accordingly.
3.
output
flag was replaced byout-dir
: This has several consequences:
- The possibility to rename the output file for a single conversion is not retained.
- If several files have the same name in the set to be converted they will be overridden except for the last one being converted.
I just went to the simplest implementation that matched the specs in #343. If the mentioned consequences are not acceptable, I can try and find some ways to fix them.
I admit that this was not made clear in the original issue, but I would prefer that the functionality to convert a single image is retained. It is OK if the code ends up with a bit of redundancy. --out-dir
and -o
should be marked as conflicting with each other, so it's either one or the other.
4. The
format
flag currently takes a String: It works as is but since the image crate is exposing animage
::ImageFormat
enum, this can be a good base for clap's checking and suggestions. I refrained from doing it because I didn't know if every image format should be allowed or any other restriction I don't know about.
That should be fine. At least adding support for extra image formats is as simple as updating image
and its feature set.
Thanks for your guidance @Enet4 !
I made good progress on this PR, but I have an issue and I'm not sure what is the best way to handle it:
To select the image format, it's only necessary to set the extension of the PathBuf to the target extension.
So far, I took the original file name and set the extension but I realized there is an issue with filenames that don't include an extension and include a "." which is unfortunately pretty common. In such cases, methods like file_stem or set_extension don't work correctly since they consider the last portion of the filename (after the last ".") to be an extension so it get lost.
For example given a file f with a filename of "1.2.276.0.123":
set_extension("png")
on f would return "1.2.276.0.png" (losing the last part)file_stem()
on f would return "1.2.276.0" (also losing the last part)Below is a code snippet to illustrate:
use std::{ffi::OsStr, path::PathBuf};
fn main() {
let mut path = PathBuf::from("1.2.276.0.123");
path.set_extension("png");
assert_eq!(path, PathBuf::from("1.2.276.0.png"));
let path = PathBuf::from("1.2.276.0.123");
let stem = path.file_stem().unwrap();
assert_eq!(stem, OsStr::new("1.2.276.0"));
}
As for the workaround, here are the propositions I though of:
We could force the user to pass a filename
command giving some template of the filename based on some dicom tags.
example: --filename "{PatientName}-{SOPInstanceUID}"
This would allow to only have to append the target to this pattern and be done with it.
I implemented a similar logic on an other project and it works fine if the errors (element not found) are correctly handled.
The main downside of this is that it introduces complexity for the user.
We could just append the target extension regardless of the original filename. (I think that's how dcmtk behaves) This would work but we could easily end up with filenames that end in "123.dcm.png" which is not so pretty.
We could force filenames to be something predictable like a timestamp. The downside would be that matching original files with their images is harder.
If someone has an other suggestion as well or any opinion on the best one it would be great.
Thanks
I would detect .dcm
and only replace the extension in this case, extending the file name in all other cases.
Hello,
This should be good.
Multiframe images:
-F
flag to ask for every frame to be converted-F
is passed, every image will have a _n
appended before the extensionError handling:
Single file conversion:
-O
option to rename the single file (excluding the extension that should be set with t
)-O
option will be ignored with a warning if multiple files are passed for conversionFile names:
(n)
before the extension if a file already has the name in the output folder.Please tell me if any other change should be made, I'll gladly oblige.
Hey,
Thanks for the testing and feedbacks!
I'm currently abroad but I'll make sure to make the modifications as soon as I'm back and have a little time. (likely during a weekend of September).
Hey hello, @Enet4
I am having trouble resolving conflicts in a clean manner. So I plan on starting from a new PR directly from master currents version of the toimage
crate. So unless you have a better suggestion, I think we can close this PR.
I just have a quick question. I see that on master's version of main, you have extended the errors and also attributed some specific exit codes. Is there a specification on how exit codes should be attributed or should I just decrease as I add more errors (and group the one that share the same source as I think this is what you've done ?
Thanks !
Pierre
Hello @PierreBou91!
So unless you have a better suggestion, I think we can close this PR.
It is fine either way. It should also be possible to keep the pull request by hard-resetting the branch and doing a forced push.
As for the exit code, it was primarily to differentiate between different kinds of errors, but none of it is intended to follow a stable guideline, so feel free to rearrange them as needed.
I hard reset and force-pushed as advised and reimplemented the bulk conversion.
User can now pass a path to a file, multiple files or a folder (possibly recursively with -r
flag). All the file paths will be collected in a Vec<InOuPair>
and the output path of will be directly computed according to the --output
to chose the name, --ext
to chose the extension and, --dir
to chose an output directory. The previous run()
function is essentially unchanged, just moved in the convert_file
function.
I have not reimplemented the frame selection for multiframe dicoms because I'm not sure yet how to do it with the --unwrap
flag. I'll try to research the fragments because I'm not familiar with it and the vast majority of the dicoms I have tested did not successfully converted with --unwrap
(only the multiframe dicom example worked fine).
Do you think we could see the multiframe part in a different PR ?
Please tell me if you see any improvements to be made. I also intent on doing an improved error handling for the inventory
part.
As per #343 I have implemented the possibility to convert multiple dicoms to a folder of images.
For consistency, I have copied a lot of the logic from the
dicom-storescu
but I don't think it's worth extracting it yet.Here are my questions/concerns about this commit and I need some guidance:
1. Not tested on multiframe dicom: I am very unfamiliar with multiframe dicoms and I don't really know where to find or create proper ones. I was not able to test this commit with multiframe dicoms. The basic functionality should be preserved but even though, I'm not certain that this is the expected behavior (convert the nth frame of every multiframe dicom in the input). This is by far my biggest concern for this commit because there might be regressions and I don't know of an acceptable way to handle this since the use case (multiframe to image) is unclear to me.
2.
Error
enum with onlyError::Other
variant: Initially I copied theError
enum fromdicom-storescu
required in thecheck_file
function return type that I also imported. But sinceError::Other
is the only variant, maybe should it just be removed altogether ?3.
output
flag was replaced byout-dir
: This has several consequences:I just went to the simplest implementation that matched the specs in #343. If the mentioned consequences are not acceptable, I can try and find some ways to fix them.
4. The
format
flag currently takes a String: It works as is but since the image crate is exposing animage
::ImageFormat
enum, this can be a good base for clap's checking and suggestions. I refrained from doing it because I didn't know if every image format should be allowed or any other restriction I don't know about.Overall this commit is far from perfect but I still opened this PR since I already find it more convenient than DCMTK's equivalent binary.