CAREamics / careamics

A deep-learning library for N2V and friends
https://careamics.github.io/
BSD 3-Clause "New" or "Revised" License

export_to_bmz [BUG] #182

Closed drchrisch closed 2 months ago

drchrisch commented 3 months ago

Describe the bug

I tried the simple N2V example "Basic CAREamics usage" from https://careamics.github.io/0.1/guides/careamist_api/. This worked just as expected. Then I wanted to save the generated model using the export_to_bmz function and ended up with lots of error messages.

To Reproduce

Simply run the code from "Basic CAREamics usage" and add:

careamist.export_to_bmz(
    path="n2v_models",
    name="n2v_model_example.bmz",
    input_array=train_data,
    authors=[{
        "name": "nobody",
        "email": "nobody@nobody",
    }]
)

Expected behavior

I was expecting to get a zip (or bmz) file containing the model for further use.

Environment:

Following the instructions from https://careamics.github.io/0.1/installation/:

mamba create -n careamics python=3.10
mamba activate careamics
mamba install pytorch torchvision pytorch-cuda=11.8 -c pytorch -c nvidia

jdeschamps commented 3 months ago

Hi @drchrisch,

Thanks for the feedback!!

path is the parameter that will control the name of the zip file. In your example, it would yield n2v_models.zip.

Error

For completeness here is the error that you probably encountered:

pydantic_core._pydantic_core.ValidationError: 2 validation errors for bioimage.io model specification
name
  Value error, 'n2v_model_example.bmz' is not restricted to 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_- ()' [type=value_error, input_value='n2v_model_example.bmz', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/value_error
authors.0.email
  value is not a valid email address: The part after the @-sign is not valid. It should have a period. [type=value_error, input_value='nobody@nobody', input_type=str]

Reason

Pydantic is the validation library used in both CAREamics and the BMZ core library. What Pydantic tells us here is that two parameters are invalid: name and authors.0.email.

Indeed, in your code snippet, name="n2v_model_example.bmz" contains a forbidden character (the .), and "email": "nobody@nobody" is missing a period after the @-sign.
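To check values before calling export_to_bmz, the two rules quoted in the error message can be approximated with the standard library alone. This is only a rough sketch: the helper names invalid_name_chars and email_domain_has_period are made up for illustration and are not part of CAREamics or the BMZ core library, and the email check covers only the "period after the @-sign" rule, not full email validation.

```python
import string

# Allowed characters, copied from the validation error message above.
ALLOWED_NAME_CHARS = set(string.ascii_letters + string.digits + "_- ()")


def invalid_name_chars(name: str) -> list[str]:
    """Hypothetical helper: return the characters of `name` outside the allowed set."""
    return sorted({c for c in name if c not in ALLOWED_NAME_CHARS})


def email_domain_has_period(email: str) -> bool:
    """Hypothetical helper: check only that the part after the @-sign contains a period."""
    _, sep, domain = email.rpartition("@")
    return bool(sep) and "." in domain


print(invalid_name_chars("n2v_model_example.bmz"))  # ['.']
print(invalid_name_chars("n2v_model_example_bmz"))  # []
print(email_domain_has_period("nobody@nobody"))     # False
print(email_domain_has_period("nobody@nobody.nb"))  # True
```

The real validation is still done by Pydantic at export time; a pre-check like this would only give an earlier, friendlier failure.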

Fix

By fixing both these parameters, the code should run:

import numpy as np
from careamics import CAREamist
from careamics.config import create_n2v_configuration

# create a configuration
config = create_n2v_configuration(
    experiment_name="n2v_2D",
    data_type="array",
    axes="YX",
    patch_size=[64, 64],
    batch_size=1,
    num_epochs=1,  # (1)!
)

# instantiate a careamist
careamist = CAREamist(config)

# train the model
train_data = np.random.randint(0, 255, (256, 256)).astype(np.float32)  # (2)!
careamist.train(train_source=train_data)

# once trained, predict
pred_data = np.random.randint(0, 255, (128, 128)).astype(np.float32)
prediction = careamist.predict(source=pred_data)

careamist.export_to_bmz( 
    path="n2v_models", 
    name="n2v_model_example_bmz", 
    input_array=train_data, 
    authors=[{ "name": "nobody", "email": "nobody@nobody.nb", }]
)

Note that the line marked # (2)! now includes .astype(np.float32); without it, you will run into another error.

Documentation

This example was added to the FAQ and a comment has been added on the name parameter in the export to BMZ page.

We will update the method documentation in the library itself.

drchrisch commented 3 months ago

Hi @jdeschamps,

Thanks for your comment. I modified the input to the export_to_bmz function:

careamist.export_to_bmz(
    path="n2v_models/n2v_2D_example",
    name="example",
    input_array=pred_data,
    authors=[{
        "name": "nobody",
        "email": "nobody@nobody.nobody",
    }]
)

That worked. However, I got a file "n2v_2D_example" with no extension and a directory "n2v_2D_example.unzip". The name variable is only used in the bioimageio.yaml file. Is that the intended behavior? Could the name variable also (internally?) be added to the filename?

jdeschamps commented 3 months ago

The parameters are what they are (path is the path to the file in which to save the model, and name is the BMZ model spec name) for consistency with the BMZ specs.

That the file has no extension is not expected, so there is indeed a bug!

The .unzip directory exists because CAREamics tests the validity of the zip after exporting it, a behaviour we might remove in the future. When testing the validity, the BMZ core library unpacks the model zip.

I made a proposal to fix the bug you encountered and improve the user experience.

With that proposal, if you run:

careamist.export_to_bmz(
    path="n2v_models/n2v_2D_example",
    name="example",
    input_array=pred_data,
    authors=[{
        "name": "nobody",
        "email": "nobody@nobody.nobody",
    }]
)

You would get the following file: n2v_models/n2v_2D_example/example.zip.
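One way to picture the proposed behaviour is that path is treated as a folder and name becomes the archive's file name. The sketch below is an assumption based only on the example result quoted above, not the actual CAREamics implementation; bmz_archive_path is a hypothetical helper.

```python
from pathlib import PurePosixPath


def bmz_archive_path(path: str, name: str) -> PurePosixPath:
    """Hypothetical helper: treat `path` as a folder and name the archive after `name`."""
    return PurePosixPath(path) / f"{name}.zip"


print(bmz_archive_path("n2v_models/n2v_2D_example", "example"))
# n2v_models/n2v_2D_example/example.zip
```

PurePosixPath is used here only so that the printed result uses forward slashes on every platform.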

drchrisch commented 3 months ago

Very good.

Now, having successfully exported the model to "n2v_2D_example.zip", I wanted to load the model. Unfortunately, this throws an error:

path_to_model = "n2v_2D_example.zip"
careamist = CAREamist(path_to_model)

No working directory provided. Using current working directory: ***.


ConstructorError                          Traceback (most recent call last)
Cell In[16], line 2
      1 path_to_model = "n2v_2D_example.zip"
----> 2 careamist = CAREamist(path_to_model)

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\careamics\careamist.py:175, in CAREamist.__init__(self, source, work_dir, experiment_name, callbacks)
    169     self.model = CAREamicsModule(
    170         algorithm_config=self.cfg.algorithm_config,
    171     )
    173 # attempt loading a pre-trained model
    174 else:
--> 175     self.model, self.cfg = load_pretrained(source)
    177 # define the checkpoint saving callback
    178 self._define_callbacks(callbacks)

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\careamics\model_io\model_io_utils.py:40, in load_pretrained(path)
     38     return _load_checkpoint(path)
     39 elif path.suffix == ".zip":
---> 40     return load_from_bmz(path)
     41 else:
     42     raise ValueError(
     43         f"Invalid model format. Expected .ckpt or .zip, got {path.suffix}."
     44     )

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\careamics\model_io\bmz_io.py:224, in load_from_bmz(path)
    221 config_path = unzip_path / config_path
    223 # load configuration
--> 224 config = load_configuration(config_path)
    226 # create careamics lightning module
    227 model = CAREamicsModule(algorithm_config=config.algorithm_config)

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\careamics\config\configuration_model.py:551, in load_configuration(path)
    546 if not Path(path).exists():
    547     raise FileNotFoundError(
    548         f"Configuration file {path} does not exist in " f" {Path.cwd()!s}"
    549     )
--> 551 dictionary = yaml.load(Path(path).open("r"), Loader=yaml.SafeLoader)
    553 return Configuration(**dictionary)

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\yaml\__init__.py:81, in load(stream, Loader)
     79 loader = Loader(stream)
     80 try:
---> 81     return loader.get_single_data()
     82 finally:
     83     loader.dispose()

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\yaml\constructor.py:51, in BaseConstructor.get_single_data(self)
     49 node = self.get_single_node()
     50 if node is not None:
---> 51     return self.construct_document(node)
     52 return None

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\yaml\constructor.py:60, in BaseConstructor.construct_document(self, node)
     58     self.state_generators = []
     59     for generator in state_generators:
---> 60         for dummy in generator:
     61             pass
     62 self.constructed_objects = {}

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\yaml\constructor.py:408, in SafeConstructor.construct_yaml_seq(self, node)
    406 data = []
    407 yield data
--> 408 data.extend(self.construct_sequence(node))

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\yaml\constructor.py:129, in BaseConstructor.construct_sequence(self, node, deep)
    125 if not isinstance(node, SequenceNode):
    126     raise ConstructorError(None, None,
    127             "expected a sequence node, but found %s" % node.id,
    128             node.start_mark)
--> 129 return [self.construct_object(child, deep=deep)
    130         for child in node.value]

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\yaml\constructor.py:129, in <listcomp>(.0)
    125 if not isinstance(node, SequenceNode):
    126     raise ConstructorError(None, None,
    127             "expected a sequence node, but found %s" % node.id,
    128             node.start_mark)
--> 129 return [self.construct_object(child, deep=deep)
    130         for child in node.value]

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\yaml\constructor.py:100, in BaseConstructor.construct_object(self, node, deep)
     98         constructor = self.__class__.construct_mapping
     99 if tag_suffix is None:
--> 100     data = constructor(self, node)
    101 else:
    102     data = constructor(self, tag_suffix, node)

File ~\AppData\Local\miniforge-pypy3\envs\careamics\lib\site-packages\yaml\constructor.py:427, in SafeConstructor.construct_undefined(self, node)
    426 def construct_undefined(self, node):
--> 427     raise ConstructorError(None, None,
    428             "could not determine a constructor for the tag %r" % node.tag,
    429             node.start_mark)

ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:python/object/apply:numpy.core.multiarray.scalar' in "n2v_2D_example.zip.unzip\config.yml", line 31, column 5

I guess you have a solution for this as well?

jdeschamps commented 3 months ago

Thanks so much for persevering in spite of the errors!!

Somehow the means and stds are saved in the configuration as numpy.float32, which the YAML export/import is not happy with. Let me check what the fix would be!

EDIT: The problem is the update done through DataConfig.set_means_and_stds, which bypasses the Pydantic casting from np.float32 to float.
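The mechanism can be illustrated without NumPy or Pydantic. In the sketch below, Float32Like is a hypothetical stand-in for np.float32, StatsConfig is a simplified stand-in for the data configuration, and json plays the role of the YAML serializer. The explicit cast inside set_means_and_stds is one plausible fix under these assumptions, not necessarily the one adopted in CAREamics.

```python
import json
from dataclasses import dataclass, field


class Float32Like:
    """Hypothetical stand-in for np.float32: float-convertible, but not a builtin float."""

    def __init__(self, value: float) -> None:
        self._value = value

    def __float__(self) -> float:
        return float(self._value)


def as_builtin_floats(values) -> list[float]:
    """Cast every element to a builtin float so that serializers accept it."""
    return [float(v) for v in values]


@dataclass
class StatsConfig:
    """Hypothetical, simplified configuration holding normalization statistics."""

    means: list[float] = field(default_factory=list)
    stds: list[float] = field(default_factory=list)

    def set_means_and_stds(self, means, stds) -> None:
        # Cast explicitly: plain attribute assignment performs no type conversion.
        self.means = as_builtin_floats(means)
        self.stds = as_builtin_floats(stds)


cfg = StatsConfig()
cfg.set_means_and_stds([Float32Like(127.5)], [Float32Like(73.9)])

# Serialization succeeds because only builtin floats are stored.
print(json.dumps({"means": cfg.means, "stds": cfg.stds}))
```

Without the cast, the stored values would keep their original scalar type and the serializer would either fail or emit a type-specific tag, as in the ConstructorError above.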