Introducing a new way of handling image channels

With this change, we can now describe more complex channel representations like (R, G, B, Depth, ...).

The challenge is that the number of possible combinations is unbounded, so we cannot have one class/enum for each. We also don't want users to have to import unfamiliar objects and build on top of them.

This leads to some questions:

- How should we represent all the channels of an image? Which structure should we use, knowing that we want it to be as simple and intuitive as possible?
- For a given representation, how can we convert it to RGB, LAB, GRAY, ...?

Solution Proposed

String representation

1 letter per channel
Only specific patterns are accepted for color channels:
- RGB
- BGR
- G
- LAB
O should be used as a placeholder for other channels (Depth, ...)

E.g. RGBO for (Red, Green, Blue, Depth)

These strings are later used to instantiate a class that handles all conversion logic.
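
As a minimal sketch of how such a string could be validated, assuming each string contains exactly one color pattern with any number of 'O' placeholders on either side. The function name `parse_channel_format` and its return value are illustrative, not the actual API of this PR:

```python
import re
from typing import Tuple

# Assumption: one color pattern (RGB, BGR, LAB or G), optionally surrounded by 'O' placeholders.
_CHANNEL_FORMAT = re.compile(r"(O*)(RGB|BGR|LAB|G)(O*)")


def parse_channel_format(channel_format: str, n_channels: int) -> Tuple[str, int]:
    """Return (color_pattern, index_of_first_color_channel), or raise ValueError."""
    match = _CHANNEL_FORMAT.fullmatch(channel_format.upper())
    if match is None:
        raise ValueError(f"Unsupported channel format: {channel_format!r}")
    if len(channel_format) != n_channels:
        raise ValueError(
            f"{channel_format!r} describes {len(channel_format)} channels, "
            f"but the image has {n_channels}"
        )
    leading_placeholders, color_pattern, _trailing_placeholders = match.groups()
    return color_pattern, len(leading_placeholders)
```

With this sketch, `parse_channel_format("ORGBO", 5)` locates the RGB block starting at channel index 1, while a free-order string such as `OOROBOG` is rejected.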

ImageChannels

These objects take the strings from above and handle all the logic of converting an image with a given channel representation to RGB/LAB. This abstracts the logic away from the feature extractors.

Here is how it is used with samples:

import dataclasses

import numpy as np

@dataclasses.dataclass
class ImageSample:
    ...
    image_channels: ImageChannels 

    @property
    def image_as_rgb(self) -> np.ndarray:
        return self.image_channels.convert_image_to_rgb(image=self.image)

    @property
    def image_channels_to_visualize(self) -> np.ndarray:
        return self.image_channels.get_channels_to_visualize(image=self.image)

    @property
    def image_as_lab(self) -> np.ndarray:
        return self.image_channels.convert_image_to_lab(image=self.image)
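
To illustrate what the conversion logic behind `convert_image_to_rgb` might look like, here is a rough sketch. The class name `ImageChannelsSketch` is hypothetical and is not the class from this PR; it assumes channels-last images (H, W, C) and only covers RGB, BGR and G. LAB is left out because it needs a real color-space conversion (for example via OpenCV).

```python
import numpy as np


class ImageChannelsSketch:
    """Hypothetical stand-in for ImageChannels; handles RGB, BGR and G only."""

    def __init__(self, channel_format: str):
        self.channel_format = channel_format.upper()
        # Look for the color pattern; every other letter is assumed to be an 'O' placeholder.
        # "G" is checked last so it does not match the G inside RGB/BGR.
        for pattern in ("RGB", "BGR", "G"):
            start = self.channel_format.find(pattern)
            if start != -1:
                self.pattern = pattern
                self.start = start
                break
        else:
            raise ValueError(f"No supported color pattern in {channel_format!r}")

    def convert_image_to_rgb(self, image: np.ndarray) -> np.ndarray:
        """Extract the color channels of an (H, W, C) image and return them as (H, W, 3) RGB."""
        color = image[..., self.start : self.start + len(self.pattern)]
        if self.pattern == "BGR":
            return color[..., ::-1]  # reverse channel order: BGR -> RGB
        if self.pattern == "G":
            return np.repeat(color, 3, axis=-1)  # replicate gray into 3 channels
        return color
```

The placeholder channels are simply dropped here; whether they should be visualized separately is left to methods such as `get_channels_to_visualize`.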

Example of questions

--------------------------------------------------------------------------------
Please describe your image channels?
--------------------------------------------------------------------------------
Image Shape: (640, 427, 3)

Enter the channel format representing your image:

  > RGB  : Red, Green, Blue
  > BGR  : Blue, Green, Red
  > G    : Grayscale
  > LAB  : Luminance, A and B color channels

ADDITIONAL CHANNELS?
If your image contains channels other than the standard ones listed above (e.g., Depth, Heat), prefix them with 'O'. 
For instance:
  > ORGBO: Can represent (Heat, Red, Green, Blue, Depth).
  > OBGR:  Can represent (Alpha, Blue, Green, Red).
  > GO:    Can represent (Gray, Depth).

IMPORTANT: Make sure that your answer represents all the image channels.

Something I explored but gave up on

At first, I started with a more custom solution where you could enter any channel type in any order, OOROBOG for instance, and DG would work out which channels are Red, Green and Blue and reorganise them. The issue is that it adds a lot of complexity to:

- The code.
- The explanations given to the user.

At the same time, it really felt like overengineering something that would never be useful.

Final Notes

I know the design is not perfect; I tried to make it as simple as possible while keeping it general enough. I also think I should refine the way the questions are asked. Let me know any thoughts that come to mind.

I think a good place to start is asking "Why do we care about the channel layout of images in the first place?"

I may be missing a few use cases, but here is my list of places where we need to know which channel is red, green or blue:

Correct colors and label names in intensity histogram per channel
Average image brightness plot (That requires RGB -> Gray conversion to get a single scalar score per pixel)
Sample visualization
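
For the brightness use case, a small sketch of the RGB -> Gray step could look like this. The Rec. 601 luma weights are my assumption; the list above only says that some RGB -> Gray conversion is needed, and the function names are illustrative.

```python
import numpy as np

# Rec. 601 luma weights for (R, G, B); an assumed choice, other weightings are possible.
LUMA_WEIGHTS = np.array([0.299, 0.587, 0.114])


def rgb_to_gray(image_rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) RGB image to an (H, W) array with one scalar per pixel."""
    return image_rgb[..., :3].astype(np.float64) @ LUMA_WEIGHTS


def average_brightness(image_rgb: np.ndarray) -> float:
    """Mean of the per-pixel gray values, i.e. a single brightness score per image."""
    return float(rgb_to_gray(image_rgb).mean())
```

This only works once we know which channels are red, green and blue, which is exactly the information the channel description has to provide.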

And here are some examples of what input image can be:

RGB image (0..255) and semantic mask in the 4th channel
In SAR processing, signals of different polarizations (VV, VH, HH, HV) are very common. They are usually given in log scale and have no 'direct' RGB representation at all
Near-Infrared + RGB images
Multispectral images (Up to 16 and more channels of different narrow wavelengths including optical range but could be anything from infrared to UV)

Let's assume we have an abstract concept, ImageChannelMapping (the name is arbitrary), that encapsulates the knowledge of what each channel represents. Here are the technical requirements for this concept from my perspective:

Get human-friendly names & preferred colors for each channel
Get an image of the average intensity (H,W,1) of the input image (H,W,C) (Note that this does not necessarily have to be a linear combination of channels; it could be a non-linear function as well)
Get an RGB visualization for a given input image (including pseudocolor if a direct RGB representation is not possible)

from abc import ABC
from typing import List, Tuple
import numpy as np

class ImageChannelMapping(ABC):
  def get_channel_names_and_colors(self) -> List[Tuple[str, Tuple[int,int,int]]]: pass
  def get_mean_intensity_image(self, image:np.ndarray) -> np.ndarray: pass
  def get_rgb_representation(self, image:np.ndarray) -> np.ndarray: pass
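
To make the interface more concrete, here is one possible implementation for plain RGB images. This is only a sketch: the subclass name `RGBChannelMapping`, the chosen colors and the use of a plain channel mean for intensity are my assumptions, and the base class is repeated so the snippet runs on its own.

```python
from abc import ABC, abstractmethod
from typing import List, Tuple

import numpy as np

Color = Tuple[int, int, int]


class ImageChannelMapping(ABC):
    @abstractmethod
    def get_channel_names_and_colors(self) -> List[Tuple[str, Color]]: ...

    @abstractmethod
    def get_mean_intensity_image(self, image: np.ndarray) -> np.ndarray: ...

    @abstractmethod
    def get_rgb_representation(self, image: np.ndarray) -> np.ndarray: ...


class RGBChannelMapping(ImageChannelMapping):
    """Hypothetical mapping for (H, W, 3) images that are already RGB."""

    def get_channel_names_and_colors(self) -> List[Tuple[str, Color]]:
        return [("Red", (255, 0, 0)), ("Green", (0, 255, 0)), ("Blue", (0, 0, 255))]

    def get_mean_intensity_image(self, image: np.ndarray) -> np.ndarray:
        # Plain mean over channels, kept as (H, W, 1); a weighted luma would also fit the contract.
        return image.mean(axis=-1, keepdims=True)

    def get_rgb_representation(self, image: np.ndarray) -> np.ndarray:
        # The image is already RGB, so no conversion is needed.
        return image
```

A SAR or multispectral mapping would implement the same three methods with its own names, colors and pseudocolor logic.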

This is the internal concept that DG can use to operate on images.

Now, how would one obtain this information?

Option 1 (most explicit) - The user passes this explicitly via code. We can simplify their life by providing pre-made templates: ImageChannelMapping.RGB, ImageChannelMapping.Grayscale, etc.

But also if some custom channel scheme is required:

mapping = ImageChannelMapping(
  channel_names=["VV","VH","HH"],                      # This is for plots
  channel_colors=[(255,0,0), (10,255,30), (0,40,255)], # This is for plots
  get_rgb_representation = lambda image: cv2.normalize(image[0] + 0.5 * image[1]),
  get_mean_intensity_image = lambda image: np.mean(image, axis=-1)
)

Option 2 (user input) - Prompt the user based on the number of channels:

- If the image has 3 channels, ask whether it is RGB, BGR, LAB, YCrCb or something else.
- If the image has 1 channel, ask whether it is grayscale or something else.
- If the image has any other number of channels, go straight to "something else".

Something else: ask the user to give meaningful names to the channels (or, if they don't want to, name them Channel_0, Channel_1, etc.). In this case we warn that get_rgb_representation and get_mean_intensity_image will use default implementations (e.g. taking the first 3 channels), and that's it.
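
The "something else" defaults could be sketched roughly as follows. The function names are illustrative, and the behaviour for images with fewer than 3 channels (repeating the first channel) is my assumption, since only "taking the first 3 channels" is specified above.

```python
from typing import List

import numpy as np


def default_channel_names(n_channels: int) -> List[str]:
    """Fallback names: Channel_0, Channel_1, ..."""
    return [f"Channel_{i}" for i in range(n_channels)]


def default_rgb_representation(image: np.ndarray) -> np.ndarray:
    """Fallback visualization for an (H, W, C) image: use the first 3 channels."""
    if image.shape[-1] >= 3:
        return image[..., :3]
    # Assumption: with fewer than 3 channels, repeat the first one as a gray image.
    return np.repeat(image[..., :1], 3, axis=-1)


def default_mean_intensity_image(image: np.ndarray) -> np.ndarray:
    """Fallback intensity: plain mean over channels, kept as (H, W, 1)."""
    return image.mean(axis=-1, keepdims=True)
```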

Now we can see the custom channels (not RGB, BGR, G). Next step will be to add support for custom names.
