cantaloupe-project / cantaloupe

High-performance dynamic image server in Java
https://cantaloupe-project.github.io/

Question: SCIFIO as Scientific/multi format image Library - Processor? #450

Open DiegoPino opened 3 years ago

DiegoPino commented 3 years ago

Hi,

This is again (sorry) a question:

Cantaloupe is really awesome and core to our platform and implementations. As we depend more and more on its services, I started to think about the scientific community (we build and develop open-source digital object repositories, and data/lab/microscopy outputs are good candidates for deposit and description) and its many imaging formats and sources (medicine too).

The way Cantaloupe manages Processors is nice, but I feel extending it for each popular format could be a bit too much. So I did some research to see what that community is using (in Java, of course) and found:

and one implementation

that can use the pluggable Bio-Formats library, bringing hundreds of domain-specific formats (e.g. https://github.com/openmicroscopy/bioformats).

Many formats that Cantaloupe already supports (TIFF, JPEG2000, etc.) can also be read and written by these libraries by default.

Note: I have very little Java experience building high-performance systems, so this may be nonsense. But I'm a fast learner (normally for me it's just a matter of getting the right dev environment/IDE set up), and I have some old experience processing "strange" image formats (DICOM, OpenEXR, HDR, etc.) in C/C++. So the question is:

This is mostly a discussion opener and it may lead to nothing. @giancarlobi, adding you here too since we talked about other preservation/large image formats (https://fits.gsfc.nasa.gov/fits_primer.html) a while ago and you work in a scientific research center too.

Thanks again. Hope it is not (too) off-topic.

adolski commented 3 years ago

Hi again @DiegoPino,

I agree about the processor design making it hard to add support for new formats. The major to-do for version 5.1 is to migrate codecs into Image I/O plugins and get rid of most of the processors, leaving only perhaps one or two remaining. (Related: #388, #383)

Image I/O is the "official" Java API for image codecs, and it supports all of the features Cantaloupe needs: multiple resolution levels, tiles, random access (via javax.imageio.stream.ImageInputStream), metadata access, etc. A basic read-only plugin only needs to subclass a couple of classes and implement around a dozen methods. I'm not familiar with SCIFIO but I assume it could be shimmed into an IIO plugin.
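To make the random-access point concrete, here is a minimal sketch of reading only a sub-region of an image through the standard Image I/O API, using javax.imageio.ImageReadParam for the region and subsampling. It is not Cantaloupe code: the class name RegionReadSketch and the method readRegion are made up for illustration, and the in-memory PNG only stands in for whatever format a plugin would provide.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReadParam;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import java.awt.Rectangle;
import java.awt.image.BufferedImage;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.Iterator;

// Hypothetical illustration class; not part of Cantaloupe or SCIFIO.
class RegionReadSketch {

    /** Decodes only the given source region, keeping every n-th pixel. */
    static BufferedImage readRegion(byte[] encoded, Rectangle region, int subsample)
            throws IOException {
        try (ImageInputStream in =
                     ImageIO.createImageInputStream(new ByteArrayInputStream(encoded))) {
            // Ask the registry which installed plugins claim to decode this stream.
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IOException("No Image I/O plugin can read this data");
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(in);
                ImageReadParam param = reader.getDefaultReadParam();
                param.setSourceRegion(region);
                param.setSourceSubsampling(subsample, subsample, 0, 0);
                // Image index 0 is the first image in the stream.
                return reader.read(0, param);
            } finally {
                reader.dispose();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Build a small PNG in memory so the sketch needs no external file.
        BufferedImage source = new BufferedImage(64, 48, BufferedImage.TYPE_INT_RGB);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        ImageIO.write(source, "png", out);

        BufferedImage tile = readRegion(out.toByteArray(), new Rectangle(16, 8, 32, 16), 2);
        System.out.println(tile.getWidth() + "x" + tile.getHeight());
    }
}
```

The calling code never names a format; it only depends on some installed reader accepting the stream, which is the property that would let a shimmed plugin slot in.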

Then, format support would be dictated by the plugins present on the classpath. Of course there would be limitations: format-native metadata may not work (e.g. when using the metadata() delegate method), and other format idiosyncrasies may cause problems. But this could be dealt with in later steps.
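As a rough illustration of "format support dictated by the classpath", the sketch below asks the Image I/O registry which reader plugins are currently installed and collects their format names. The class name PluginDiscoverySketch and the method readableFormats are hypothetical; only the javax.imageio calls are standard JDK API, and the actual list depends on the JDK and on whatever plugin JARs are present.

```java
import javax.imageio.ImageIO;
import javax.imageio.spi.IIORegistry;
import javax.imageio.spi.ImageReaderSpi;
import java.util.Iterator;
import java.util.Locale;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical illustration class; not part of Cantaloupe.
class PluginDiscoverySketch {

    /** Returns the lower-cased format names of every registered reader plugin. */
    static Set<String> readableFormats() {
        // Pick up any plugin JARs added to the classpath since the registry was built.
        ImageIO.scanForPlugins();

        Set<String> names = new TreeSet<>();
        Iterator<ImageReaderSpi> providers = IIORegistry.getDefaultInstance()
                .getServiceProviders(ImageReaderSpi.class, true);
        while (providers.hasNext()) {
            for (String name : providers.next().getFormatNames()) {
                names.add(name.toLowerCase(Locale.ROOT));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // The output varies with the JDK and the installed plugins.
        System.out.println(readableFormats());
    }
}
```

Dropping an additional plugin JAR on the classpath would, under this model, extend the returned set without any change to the calling code.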

DiegoPino commented 3 years ago

As always: thanks a lot @adolski, this is very useful and the time you give to us replying is more than valued.

I see how work on #383, once #388 (exciting!) moves forward in 5.1, could lead to broader SCIFIO extension support (with us contributing, if that is OK) for more than just microscopy formats. I also see the limitations, of course, but your refactoring ideas make a lot of sense. Normally one (I mean us) would not go for the full spectrum of Bio-Formats, which is simply huge (and some of it poorly supported).

Not sure how hard that refactor is, but again, I will watch those issues closely and we (repo team) can help with anything we can (or are asked to).

Side note: Is there any good tutorial/guide/hint you would suggest for setting up a dev environment that fits this project's contribution workflow? No rush. I feel we use it so much but could be better at contributing with a tiny kick in the back. We can also figure that out by ourselves.

Thanks again!