Samyssmile / edux

EDUX is a developer friendly Java library for machine learning educational tasks
https://samyssmile.github.io/edux/
Apache License 2.0

Extend the Matrix Class by Color Dimension #152

Open Samyssmile opened 9 months ago

Samyssmile commented 9 months ago

Extend Matrix Class to Support Multiple Channels in Image Data

Background

The current Matrix class in the library represents image data using rows and columns only, so each image in a dataset is a single Matrix instance. As a result, the color information in the images is lost.

Objective

To enhance the representation of image data in our library, it's essential to introduce the concept of "channels" within the Matrix class. This addition will allow us to represent images in a more comprehensive manner, preserving color information.

Specifications

  1. Channels Integration:

    • Modify the Matrix class to include a new property: channels.
    • The channels attribute indicates the depth of color information in an image.
      • For instance, channels=3 would correspond to an RGB (Red, Green, Blue) image, whereas channels=1 would indicate a grayscale image.
  2. Representation of Images:

    • Adapt the image representation so that each image in the dataset is depicted by multiple matrices, corresponding to the number of channels.
    • Specifically, a standard RGB image will be represented by three Matrix instances within the dataset.
  3. Network Learning Adaptation:

    • Ensure that the neural network learns using multiple matrices per image. This change is crucial for capturing the full spectrum of information present in colored images.
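
As a rough illustration of points 1 and 2, the sketch below splits packed RGB pixels into one rows x cols plane per channel. It is not the EDUX Matrix API: the class name ChannelSplitSketch, the method splitChannels, and the use of plain double[channel][row][col] arrays in place of Matrix instances are all assumptions made for this example. The input uses the packed 0xAARRGGBB layout that java.awt.image.BufferedImage.getRGB returns.

```java
/**
 * Hypothetical sketch, not part of EDUX: shows how an RGB image could be
 * turned into three per-channel planes (channels=3).
 */
final class ChannelSplitSketch {

    /**
     * Splits packed 0xAARRGGBB pixels, indexed as [row][col], into three
     * planes indexed as [channel][row][col], with values scaled to [0, 1].
     * Channel order is 0 = red, 1 = green, 2 = blue.
     */
    static double[][][] splitChannels(int[][] argb) {
        int rows = argb.length;
        int cols = rows == 0 ? 0 : argb[0].length;
        double[][][] planes = new double[3][rows][cols];
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                int pixel = argb[r][c];
                planes[0][r][c] = ((pixel >> 16) & 0xFF) / 255.0; // red
                planes[1][r][c] = ((pixel >> 8) & 0xFF) / 255.0;  // green
                planes[2][r][c] = (pixel & 0xFF) / 255.0;         // blue
            }
        }
        return planes;
    }

    public static void main(String[] args) {
        // A 2x2 test image: red, green / blue, white.
        int[][] image = {
            {0xFFFF0000, 0xFF00FF00},
            {0xFF0000FF, 0xFFFFFFFF}
        };
        double[][][] planes = splitChannels(image);
        System.out.println("channels = " + planes.length);
        System.out.println("red value at (0,0) = " + planes[0][0][0]);
    }
}
```

In a real implementation each plane would presumably be wrapped in a Matrix instance (or the Matrix class would carry the channels property itself), and a grayscale image would yield a single plane with channels=1.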

Expected Outcome

Implementing this feature will let the library process and learn from color images, which may lead to more accurate and nuanced model performance.

For local testing, use the fractality 256x256 dataset.

Samyssmile commented 7 months ago

After investigation, it looks like this issue is a blocker for the three layers requested for the CNN.