pradeeban closed 2 years ago
Hello, I am working on this issue.
ok, thanks for the update.
Hi @pradeeban, some DICOM files contain private tags that the user does not want to extract, such as the histogram tag shown in the picture below. I would like to either extract only the tags whose value length is below a certain threshold, or ask the user to list all the tags they do not want in config.json. Is there a better way to handle this?
This is a good question. So let me answer in detail.
If you check https://github.com/Emory-HITI/Niffler/blob/dev/modules/png-extraction/ImageExtractor.py, you will see the line below:
if len(kv)>300:
So, we are ignoring images with more than 300 attributes. Similarly, you could use the first/easy approach you mentioned (extract only the tags whose value length is less than a certain threshold).
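As a rough illustration of the threshold approach, here is a minimal sketch. It assumes the extracted headers are already available as a plain dict of attribute name to value; the function name `filter_by_value_length`, the `max_length` parameter, and the sample data are hypothetical and not part of Niffler.

```python
def filter_by_value_length(attributes, max_length=300):
    """Keep only the attributes whose string form is at most max_length characters.

    `attributes` is assumed to be a dict of attribute name -> value.
    """
    return {
        name: value
        for name, value in attributes.items()
        if len(str(value)) <= max_length
    }


# Hypothetical sample: two short public attributes and one very long private one.
headers = {
    "PatientID": "12345",
    "Modality": "CT",
    "PrivateHistogram": "0," * 500,
}

kept = filter_by_value_length(headers, max_length=64)
print(sorted(kept))
```

The threshold would most naturally come from config.json, so that users can tune it without touching the code.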
You will likely need to add a property to https://github.com/Emory-HITI/Niffler/blob/dev/modules/png-extraction/config.json:
"PublicHeadersOnly": true,
With this default, only public headers are extracted. When you set it to false, the private headers are extracted as well. This ensures that we are not bombarding users with private tags all the time (most users do not need them, which is why we did not implement this for so long).
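A minimal sketch of how such a flag could be honored, assuming the attributes are keyed by their (group, element) tag numbers. It relies on the DICOM convention that private tags have an odd group number; the helper names `is_private` and `select_headers` and the sample tags are hypothetical.

```python
import json


def is_private(group):
    """DICOM private tags use odd group numbers."""
    return group % 2 == 1


def select_headers(attributes, config):
    """Drop private tags unless PublicHeadersOnly is explicitly set to false.

    `attributes` is assumed to be a dict of (group, element) -> value.
    """
    public_only = config.get("PublicHeadersOnly", True)
    return {
        tag: value
        for tag, value in attributes.items()
        if not (public_only and is_private(tag[0]))
    }


config = json.loads('{"PublicHeadersOnly": true}')

# Hypothetical sample: PatientID, Modality, and one private tag (odd group 0x0029).
attributes = {
    (0x0010, 0x0020): "12345",
    (0x0008, 0x0060): "CT",
    (0x0029, 0x1010): b"\x00\x01",
}

public = select_headers(attributes, config)
everything = select_headers(attributes, {"PublicHeadersOnly": False})
print(len(public), len(everything))
```

Treating a missing property as `true` keeps the current behavior for users with an older config.json.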
Your second option is the ideal scenario. That is how the meta-extraction module handles its extraction. It uses a featureset. See https://github.com/Emory-HITI/Niffler/blob/dev/modules/meta-extraction/conf/featureset.txt
Ideally, we should have both options for the png-extraction. When the featureset is present, get only those listed fields. Otherwise, get everything (as it is now, but without the private tags for now).
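To make the "featureset if present, otherwise everything" behavior concrete, here is a small sketch. It assumes a featureset file with one attribute name per line, which may not match the actual format used by meta-extraction; `load_featureset`, `apply_featureset`, and the sample headers are hypothetical names for illustration.

```python
import os
import tempfile


def load_featureset(path):
    """Return the attribute names listed in the file, or None if it is absent."""
    if not path or not os.path.isfile(path):
        return None
    with open(path) as f:
        return [line.strip() for line in f if line.strip()]


def apply_featureset(attributes, featureset):
    """Keep only the listed attributes; with no featureset, keep everything."""
    if featureset is None:
        return dict(attributes)
    return {name: attributes[name] for name in featureset if name in attributes}


headers = {"PatientID": "12345", "Modality": "CT", "StudyDate": "20210101"}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "featureset.txt")
    with open(path, "w") as f:
        f.write("PatientID\nModality\n")

    # Featureset present: only the listed fields are kept.
    selected = apply_featureset(headers, load_featureset(path))
    # Featureset missing: fall back to all extracted fields.
    fallback = apply_featureset(headers, load_featureset(os.path.join(tmp, "missing.txt")))

print(selected)
print(fallback)
```

In the fallback case the public/private filtering described above would still apply before this step.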
Thank you for the answer. I will modify the code so that both options are available to the user.
Hi @pradeeban, I have solved this issue. Please check it.
@Nitesh639 I have requested changes to your pull request.
@pradeeban I have made the changes. Please check now.
Fixed by @Nitesh639 in Niffler-0.8.5.
Currently, the png-extraction module supports extracting only public DICOM attributes. Extending it to support private tags can significantly help research that depends on them.