DIAGNijmegen / rse-panimg

Conversion of medical images to MHA and TIFF.
Apache License 2.0

Allow non-pathology tiff file conversion #45

Open HarmvZ opened 2 years ago

HarmvZ commented 2 years ago

Enface ophthalmology images are regularly exported from the machines as .TIF files. When these are handled by panimg they will be interpreted as pathology images. It would be nice if we could somehow differentiate between "normal" and "pathology" TIFF files and convert the "normal" TIFF files to MetaIO format so that handling of these files works correctly on grand-challenge and in the workstations.

@miriam-groeneveld do you know if there are some special headers in the pathology TIFF files that we can use to recognize these? I will check the same for the ophthalmology images.

jmsmkn commented 2 years ago

Not to derail this but there is also a problem with uploading pngs and jpegs and they are always forced to mha.

Rather than using a heuristic to determine this, I'd suggest that it becomes an option users can set on the upload session, which gets passed through to panimg. Something like preferred_output=Enum[None, MHA, TIFF], where a default of None represents the strategy's choice (current situation). The TIFF and fallback strategies would then need to be updated to produce MHA or TIFF, and decide on their default. The Upload Session model, forms and API endpoints would need to be updated so that the user can override the default of None.
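For illustration only, the option described above could be sketched roughly like this. This is not panimg's actual API: `PreferredOutput` and `resolve_output` are hypothetical names, and the "None" member of the proposed enum is modelled here as Python's `None` via `Optional`.

```python
from enum import Enum
from typing import Optional


class PreferredOutput(Enum):
    """Hypothetical enum for the proposed preferred_output option."""

    MHA = "mha"
    TIFF = "tiff"


def resolve_output(
    preferred: Optional[PreferredOutput],
    strategy_default: PreferredOutput,
) -> PreferredOutput:
    """Pick the output format for one conversion.

    None means "let the strategy decide" (the current behaviour);
    any other value is a user override set on the upload session.
    """
    if preferred is None:
        return strategy_default
    return preferred


# The TIFF strategy keeps its own default unless the user overrides it.
print(resolve_output(None, PreferredOutput.TIFF).value)
print(resolve_output(PreferredOutput.MHA, PreferredOutput.TIFF).value)
```

Each strategy would pass its own default, so leaving the option unset reproduces the current situation.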

Then, as a stretch, the things that use upload sessions (Archives, Algorithms, Reader Studies) could set a default for when these are used in a form or API endpoint. This stretch would be a lot more work though, and I don't really know if it is worth it.

Just an idea, we used to get some complaints about this and I'm unsure about how big of an issue it is now.

pkcakeout commented 2 years ago

Let's derail it! :D

> Not to derail this but there is also a problem with uploading pngs and jpegs and they are always forced to mha.

Right now, if we did not force this, CIRRUS would not be able to read those in. Also, I think it is still nice that users only ever have to deal with a single file format, right? Mha is perfectly happy encoding pngs, tiffs, or jpegs (even if their file size becomes bigger). Once we open the floodgates for other file formats we need to support all those things - and I am lazy. I do not want to complicate things :D SimpleITK is simple, right?

jmsmkn commented 2 years ago

That's not what I'm proposing! The outputs would still be .mha OR .tiff only.

CIRRUS handles both, but the algorithm developers usually only write their algorithms to handle one. I guess that they should handle both, especially when they're used in pipelines, but they don't and you end up with both the problems that Harm and I describe.

pkcakeout commented 2 years ago

Ooooooh... sorry. Yes. I get it now! I thought that the problem we are discussing was only concerned with manual uploads. I forgot that this is relevant for algorithms, too (even though you wrote that, James).

Tbh, even though it would be nice if algorithms support both formats, it is kind of asking a lot if we demand a 3D CT algorithm to also ingest multi-resolution tiff files, which the pathology stuff is mainly coded for.

Anyway, a heuristic for this could likely work quite well already imo. If the tiff file is smaller than, say, 10000x10000 pixels, that could just become an mha file after all. Or probably even better to limit it on required memory size: dimensions * datatype size. Anyway, I will leave it to you ;-)
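A minimal sketch of that memory-size heuristic, just to make the idea concrete. The function names and the threshold are assumptions for illustration (the limit below is simply 10000x10000 RGB pixels at one byte per sample); nothing here is existing panimg code.

```python
# Hypothetical limit: a 10000x10000 RGB image with 1 byte per sample.
MAX_MHA_BYTES = 10_000 * 10_000 * 3


def uncompressed_size_bytes(
    width: int, height: int, samples_per_pixel: int, bytes_per_sample: int
) -> int:
    """Required memory: dimensions * datatype size."""
    return width * height * samples_per_pixel * bytes_per_sample


def suggest_format(
    width: int,
    height: int,
    samples_per_pixel: int = 3,
    bytes_per_sample: int = 1,
    max_mha_bytes: int = MAX_MHA_BYTES,
) -> str:
    """Return "mha" for images below the limit, "tiff" otherwise."""
    size = uncompressed_size_bytes(
        width, height, samples_per_pixel, bytes_per_sample
    )
    return "mha" if size < max_mha_bytes else "tiff"


# A small en face export versus a large whole-slide image.
print(suggest_format(1_000, 1_000))
print(suggest_format(100_000, 100_000))
```

Limiting on bytes rather than pixel count means a 16-bit or multi-channel image reaches the cut-off at smaller dimensions than an 8-bit grayscale one.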

jmsmkn commented 2 years ago

Yes, it depends™. For MIDOG (archive and algorithms) they're using smaller patches of around 1k x 1k, no pyramid, and still want tiff, which is why I think a heuristic is tricky to get right. It could also lead to confusion if some of your imported data ends up in one format and some in the other. I prefer options, but it is more work and maintenance to hook them up to grand-challenge, so 🤷‍♂️