CAREamics / careamics

A deep-learning library for N2V and friends
https://careamics.github.io/
BSD 3-Clause "New" or "Revised" License

N2V convenience config function channel axis constraints [BUG] #159

Open melisande-c opened 3 weeks ago

melisande-c commented 3 weeks ago

Describe the bug

create_n2v_configuration does not allow axes that include a C dimension of size 1. Data with dimensions (S, 1, Y, X) is entirely possible, and this restriction means users have to take an extra step to remove the C axis from their data.

To Reproduce

Code snippet that reproduces the behaviour:

from careamics import CAREamist
from careamics.config import create_n2v_configuration

config = create_n2v_configuration(
    experiment_name="Demo", 
    data_type="array",
    axes="SCYX",
    patch_size=[8, 8],
    batch_size=1,
    num_epochs=1,
    n_channels=1,
)

This results in the following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/melisande.croft/Documents/Repos/careamics/src/careamics/config/configuration_factory.py", line 484, in create_n2v_configuration
    raise ValueError(
ValueError: Number of channels must be specified when using channels (got 1 channel).

The error is raised from: https://github.com/CAREamics/careamics/blob/0a29ea2da44b3745a33bedb93143f4ca6b98a9da/src/careamics/config/configuration_factory.py#L482-L486

Expected behavior

It should be possible to specify 1 channel. The convenience functions could take n_channels: Optional[int] = None instead.

Additional context

This might happen in all the convenience functions, not just create_n2v_configuration.

jdeschamps commented 3 weeks ago

Yes, that's an interesting point.

The reason for writing the convenience function like this was that a singleton channel dimension is very unlikely in "real" data. By real data, I mean data that users acquired on a microscope. Singleton dimensions are not really a thing in bioimages; I've only seen them when people are converting their data for DL pipelines (which is not needed in CAREamics)... I prefer covering the very likely case of people having channels and not specifying the number, rather than an edge case.

But if we can come up with an elegant way to cover that case and still have a meaningful way to warn users that they should specify the number of channels, then we should indeed do it!

melisande-c commented 3 weeks ago

I think I encountered this when I was writing a test that parametrised the number of channels, and this error made implementing it slightly more annoying. I can see scenarios where preprocessing of the data adds a singleton channel axis, and also cases where a custom file type and read function produce a singleton channel axis.

I think we can implement it by having the default value of n_channels be None. So we can do:

if "C" in axes and n_channels is None:
    raise ValueError(
        "Number of channels must be specified when using channels."
    )
elif n_channels is None:
    # No channel axis and no value given: default to a single channel.
    n_channels = 1
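
For anyone who wants to try the idea outside the library, here is a minimal, self-contained sketch of that check. The helper name resolve_n_channels is hypothetical and not part of CAREamics. The sketch assumes that n_channels defaults to None, that a missing C axis implies a single channel, and (as an extra assumption not discussed above) that giving more than one channel without a C axis should be rejected.

```python
from typing import Optional


def resolve_n_channels(axes: str, n_channels: Optional[int] = None) -> int:
    """Hypothetical helper (not CAREamics API) sketching the proposed check."""
    if "C" in axes:
        if n_channels is None:
            # Channels are present but the user did not say how many.
            raise ValueError(
                "Number of channels must be specified when using channels."
            )
        # An explicit value is kept as-is, including a singleton channel.
        return n_channels

    # No channel axis: only a single channel makes sense (assumption).
    if n_channels not in (None, 1):
        raise ValueError(
            f"Got n_channels={n_channels} but axes {axes!r} has no C axis."
        )
    return 1
```

With this shape, axes="SCYX" together with n_channels=1 is accepted, while axes="SCYX" without n_channels still raises the error that reminds users to specify the number of channels.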