bioimage-io / spec-bioimage-io

Specification for the bioimage.io model description file.
https://bioimage-io.github.io/spec-bioimage-io/
MIT License

BUG: Calling `save_bioimageio_package` twice causes validation error. #619

Closed melisande-c closed 1 month ago

melisande-c commented 1 month ago

Hi, the CAREamics library has been experiencing an issue with the spec library in version 0.5.3 (0.5.2 works fine).

The problem arises when our careamics.model_io.export_to_bmz function, which in turn calls save_bioimageio_package, is called twice. This also causes all of our tests that export to BMZ to fail.

Code snippet

The following code snippet reproduces the behaviour:

from pathlib import Path

import numpy as np

from careamics import CAREamist
from careamics.config import create_n2v_configuration

# seed random number generator
rng = np.random.default_rng(seed=42)

# example train and validation arrays
train_array = rng.random(size=(32, 32), dtype=np.float32)
val_array = rng.random(size=(32, 32), dtype=np.float32)

# create configuration
config = create_n2v_configuration(
    experiment_name="bioimage_io_testing",
    data_type="array",
    axes="YX",
    patch_size=(8, 8),
    batch_size=2,
    num_epochs=1,
)

# create two directories for outputs
work_dirs = [Path.cwd() / f"Outputs_{i}" for i in range(2)]
# loop through careamics train + export pipeline twice
# model weights, input array & output array are saved in the two different directories
# export will fail on the 2nd time
for work_dir in work_dirs:
    work_dir.mkdir(exist_ok=True)

    # instantiate CAREamist
    careamist = CAREamist(source=config, work_dir=work_dir)

    # train CAREamist
    careamist.train(train_source=train_array, val_source=val_array)

    # export to BMZ
    careamist.export_to_bmz(
        path=work_dir / "model.zip",
        name="TopModel",
        input_array=train_array,
        authors=[{"name": "Amod", "affiliation": "El"}],
        general_description="A model that just walked in.",
    )

This results in the error:

Traceback (most recent call last):
  File "/home/melisande.croft/Documents/Experiments/bioimage_test_failures/error_with_careamics_minimal_example.py", line 39, in <module>
    careamist.export_to_bmz(
  File "/home/melisande.croft/Documents/Repos/careamics/src/careamics/careamist.py", line 705, in export_to_bmz
    export_to_bmz(
  File "/home/melisande.croft/Documents/Repos/careamics/src/careamics/model_io/bmz_io.py", line 186, in export_to_bmz
    save_bioimageio_package(model_description, output_path=path)
  File "/localscratch/miniforge3/envs/careamics/lib/python3.10/site-packages/bioimageio/spec/_package.py", line 223, in save_bioimageio_package
    raise ValueError(
ValueError: Exported package '/home/melisande.croft/Documents/Experiments/bioimage_test_failures/Outputs_1/model.zip' is invalid: ValidationSummary:

|        ❌        |                                      bioimageio validation failed                                      |
|       ---       |                                                  ---                                                   |
| source          | /home/melisande.croft/Documents/Experiments/bioimage_test_failures/Outputs_1/model.zip.unzip/model.zip |
| format version  | model 0.5.3                                                                                            |
| bioimageio.spec | 0.5.3.1                                                                                                |

|  ❓  |                       location                       |                                                                                                                         detail                                                                                                                         |
| --- |                         ---                          |                                                                                                                          ---                                                                                                                           |
| ✔️  |                                                      | initialized InvalidDescr to describe model unknown                                                                                                                                                                                                     |
|     |                                                      |                                                                                                                                                                                                                                                        |
| ❌   |                                                      | bioimageio.spec format validation model 0.5.3                                                                                                                                                                                                          |
| 🔍   | context.perform_io_checks                            | True                                                                                                                                                                                                                                                   |
| 🔍   | context.root                                         | /home/melisande.croft/Documents/Experiments/bioimage_test_failures/Outputs_1/model.zip.unzip                                                                                                                                                           |
| 🔍   | context.known_files./tmp/tmpdvbxmckt/inputs.npy      | b194992d3f13a32fea7472f0fcfb4cf8634895ccc18d98b75c7afd7ba9bc3996                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpdvbxmckt/outputs.npy     | 0eb281eb5573edb034ce946c6b2ad8b083868d42e058a825411900928cb60651                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpdvbxmckt/environment.yml | 0d587e51ab7883806a12463ee1b965d3d460464e39221ac99f3e3cbb69cd43cc                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpdvbxmckt/weights.pth     | 43561a643634978e0a19b1ec221dbe98b602b0c8401517f22b7050d079ca857b                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpdvbxmckt/config.yml      | dce531f0365caa05a62b399dea46455cd4889fadacf541b2c44de3d0c4b995c4                                                                                                                                                                                       |
| 🔍   | context.known_files.config.yml                       | dce531f0365caa05a62b399dea46455cd4889fadacf541b2c44de3d0c4b995c4                                                                                                                                                                                       |
| 🔍   | context.known_files.inputs.npy                       | b194992d3f13a32fea7472f0fcfb4cf8634895ccc18d98b75c7afd7ba9bc3996                                                                                                                                                                                       |
| 🔍   | context.known_files.outputs.npy                      | 0eb281eb5573edb034ce946c6b2ad8b083868d42e058a825411900928cb60651                                                                                                                                                                                       |
| 🔍   | context.known_files.environment.yml                  | 0d587e51ab7883806a12463ee1b965d3d460464e39221ac99f3e3cbb69cd43cc                                                                                                                                                                                       |
| 🔍   | context.known_files.weights.pth                      | 43561a643634978e0a19b1ec221dbe98b602b0c8401517f22b7050d079ca857b                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpxz5d0q4j/inputs.npy      | b194992d3f13a32fea7472f0fcfb4cf8634895ccc18d98b75c7afd7ba9bc3996                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpxz5d0q4j/outputs.npy     | cfda823cc73938b05d19fc70e10c59ef0f0ca770eafb7cb223c8afee227c3e4a                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpxz5d0q4j/environment.yml | 0d587e51ab7883806a12463ee1b965d3d460464e39221ac99f3e3cbb69cd43cc                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpxz5d0q4j/weights.pth     | 2d64019525f3e5f9ad46dba3fbdc71e49a572fcf451024a5803e37a2b8888fa8                                                                                                                                                                                       |
| 🔍   | context.known_files./tmp/tmpxz5d0q4j/config.yml      | dce531f0365caa05a62b399dea46455cd4889fadacf541b2c44de3d0c4b995c4                                                                                                                                                                                       |
| 🔍   | context.warning_level                                | error                                                                                                                                                                                                                                                  |
| ❌   | `outputs.0.test_tensor`                              | Value error, Sha256 mismatch for outputs.npy. Expected cfda823cc73938b05d19fc70e10c59ef0f0ca770eafb7cb223c8afee227c3e4a, got 0eb281eb5573edb034ce946c6b2ad8b083868d42e058a825411900928cb60651. Update expected `sha256` or point to the matching file. |
| ❌   | `weights.pytorch_state_dict`                         | Value error, Sha256 mismatch for weights.pth. Expected 2d64019525f3e5f9ad46dba3fbdc71e49a572fcf451024a5803e37a2b8888fa8, got 43561a643634978e0a19b1ec221dbe98b602b0c8401517f22b7050d079ca857b. Update expected `sha256` or point to the matching file. |
| ⚠   | `documentation`                                      | No '# Validation' (sub)section found in README.md.                                                                                                                                                                                                     |
|     |                                                      |                                                                                                                                                                                                                                                        |

As the validation summary shows, there is a SHA256 mismatch. On the second run of the train + export pipeline, the new outputs.npy file has the hash cfda823cc73938b05d19fc70e10c59ef0f0ca770eafb7cb223c8afee227c3e4a, but validation compares it against the hash of the first run's outputs.npy file, which is 0eb281eb5573edb034ce946c6b2ad8b083868d42e058a825411900928cb60651.

The same applies to the SHA256 mismatch for the weights. Strangely, in the careamics tests this error is also raised for the inputs.npy file, but that does not seem to be a problem in this example (here both runs export the same input array, so the two inputs.npy hashes are identical).
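To illustrate the kind of mismatch described above, here is a minimal sketch that uses only the standard library. It assumes a process-wide cache of file hashes keyed by bare file name; the `known_files` dict and the `export_and_validate` helper are hypothetical stand-ins, not the bioimageio.spec implementation.

```python
import hashlib
import tempfile
from pathlib import Path

# Hypothetical stand-in for a cache of file hashes shared by the whole process.
known_files: dict[str, str] = {}


def sha256_of(path: Path) -> str:
    """Return the SHA256 hex digest of a file's content."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def export_and_validate(content: bytes) -> tuple[str, str]:
    """Write outputs.npy to a fresh temp dir and return (expected, got) hashes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "outputs.npy"
        path.write_bytes(content)
        expected = sha256_of(path)
        # setdefault keeps the first hash ever recorded for this file name,
        # so a later export with the same name sees a stale entry.
        got = known_files.setdefault(path.name, expected)
    return expected, got


first = export_and_validate(b"outputs of run 0")
second = export_and_validate(b"outputs of run 1")
print(first[0] == first[1])    # True: nothing cached yet on the first export
print(second[0] == second[1])  # False: the cache still holds the first run's hash
```

The second call fails in the same way as the report: the hash of the new file is the expected one, while the cached entry for the same file name still belongs to the first export.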

Summary of careamics.model_io.export_to_bmz

A quick walkthrough of what happens in the careamics.model_io.export_to_bmz function:

1) A temporary directory is created using Python's built-in tempfile module (in the above example the first directory made is /tmp/tmpdvbxmckt/ and the second is /tmp/tmpxz5d0q4j/).
2) The environment.yml file, the inputs.npy file, the outputs.npy file, the config.yml file, and the PyTorch state dict file (weights.pth) are saved in this temporary directory.
3) The ModelDescr object is created by first creating the InputTensorDescr, OutputTensorDescr, ArchitectureFromLibraryDescr and WeightsDescr objects.
4) The save_bioimageio_package function is called using the just created ModelDescr.
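Steps 1 and 2 can be sketched roughly as follows. This is not the careamics code: the file names come from the walkthrough, but the byte contents are placeholders and `stage_files` is a hypothetical helper. Only outputs.npy and weights.pth are made to depend on the run, mirroring the hashes in the validation summary above.

```python
import hashlib
import tempfile
from pathlib import Path

# File names from the walkthrough; the contents written below are placeholders.
FILE_NAMES = ["environment.yml", "inputs.npy", "outputs.npy", "config.yml", "weights.pth"]
# Files whose content differs between training runs in the reported example.
RUN_DEPENDENT = {"outputs.npy", "weights.pth"}


def stage_files(run_id: int) -> dict[str, str]:
    """Save the files to a new temporary directory and return their SHA256 hashes."""
    hashes: dict[str, str] = {}
    with tempfile.TemporaryDirectory() as tmp:
        for name in FILE_NAMES:
            suffix = f" from run {run_id}" if name in RUN_DEPENDENT else ""
            path = Path(tmp) / name
            path.write_bytes(f"{name}{suffix}".encode())
            hashes[name] = hashlib.sha256(path.read_bytes()).hexdigest()
    return hashes


run0 = stage_files(0)
run1 = stage_files(1)
changed = sorted(name for name in FILE_NAMES if run0[name] != run1[name])
print(changed)  # ['outputs.npy', 'weights.pth']
```

Each run uses a different temporary directory but the same file names, which is why a hash remembered from an earlier run can be confused with the current one.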

FynnBe commented 1 month ago

Thank you @melisande-c for reporting this issue!

The error seems to result from implicitly reusing parts of the default ValidationContext, which is a bug indeed. A workaround could be to provide a ValidationContext explicitly:

from bioimageio.spec import ValidationContext

for work_dir in work_dirs:
    ...

    # export to BMZ
    with ValidationContext():  # avoid bug where parts of the default `ValidationContext` are being reused ('known_files' in particular)
        careamist.export_to_bmz(
            path=work_dir / "model.zip",
            name="TopModel",
            input_array=train_array,
            authors=[{"name": "Amod", "affiliation": "El"}],
            general_description="A model that just walked in.",
        )

I'll fix the bug in a future release.
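Why entering a fresh context helps can be sketched with a generic context-variable pattern. This is an assumption about the general mechanism only: `Ctx`, `_current` and `record` are hypothetical names, and none of this is the actual bioimageio.spec code.

```python
from contextvars import ContextVar
from dataclasses import dataclass, field


@dataclass
class Ctx:
    """Hypothetical validation context; not the bioimageio.spec class."""

    known_files: dict[str, str] = field(default_factory=dict)

    def __enter__(self) -> "Ctx":
        # Make this instance the active context until the with-block ends.
        self._token = _current.set(self)
        return self

    def __exit__(self, *exc) -> None:
        _current.reset(self._token)


# A single default instance is created once and shared by every caller
# that does not enter an explicit context.
_current = ContextVar("current_ctx", default=Ctx())


def record(name: str, sha: str) -> str:
    """Return the hash stored for `name`, storing `sha` if none exists yet."""
    return _current.get().known_files.setdefault(name, sha)


# Without an explicit context, both calls share the default instance's dict.
a = record("weights.pth", "hash-run-0")
b = record("weights.pth", "hash-run-1")  # stale: still "hash-run-0"

# Entering a fresh context starts with an empty known_files mapping.
with Ctx():
    c = record("weights.pth", "hash-run-1")

print(a, b, c)  # hash-run-0 hash-run-0 hash-run-1
```

In this sketch the mutable state of the default instance outlives each call, while a context created per export does not carry anything over.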

FynnBe commented 1 month ago

should be fixed in bioimageio-spec==0.5.3.2, please reopen if not

melisande-c commented 1 month ago

This seems to be working now, thanks!