Hi @veegalinova,
thanks for the clear description of the issue! I don't have time right now to think about how to cleanly solve this, but I just wanted to let you know that we have a workaround: it is possible to set the precision for the test in test_model via the --decimal parameter. Setting this e.g. to 2 (the default is 4) will probably fix the test. You can also add something to the config in the rdf so that this is respected by the CI. (I am not quite sure about the details anymore; @FynnBe should know more.)
(That being said, it would definitely be nice to find a cleaner solution, but in case you want to upload the model now you can use the workaround above.)
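For example (a hypothetical invocation; substitute the path or URL of your own rdf.yaml):

    bioimageio test-model my_model/rdf.yaml --decimal 2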
Are the --decimal parameter and the config entry (CI-wise) documented somewhere on https://bioimage.io/docs/#/? I couldn't find them there, but I also might have missed it.
before the normalized data is given to the NN it should be converted to float32 again, see https://github.com/bioimage-io/core-bioimage-io-python/blob/53dfc45cf23351da61e8b22d100d77fb54c540e6/bioimageio/core/prediction_pipeline/_combined_processing.py#L70 (of course this might still be the issue somehow...)
Setting the test precision may be undocumented; here is what you'd have to include in the rdf for your example:
    config:
      bioimageio:
        test_kwargs:
          keras_hdf5:
            decimal: 2
The --decimal flag is listed when inspecting the help:
$ bioimageio test-model -h
bioimageio.spec 0.4.9
implementing:
collection RDF 0.2.3
general RDF 0.2.3
model RDF 0.4.9
+
bioimageio.spec.partner 0.4.9
implementing:
partner collection RDF 0.2.3
bioimageio.core 0.5.9
Usage: bioimageio test-model [OPTIONS] MODEL_RDF
Test whether the test output(s) of a model can be reproduced.
╭─ Arguments ────────────────────────────────────────────────────────────────
│ * model_rdf TEXT  Path or URL to the model resource description file
│                   (rdf.yaml) or zipped model. [default: None] [required]
╰────────────────────────────────────────────────────────────────────────────
╭─ Options ──────────────────────────────────────────────────────────────────
│ --weight-format [pytorch_state_dict|torchscript|keras_hdf5|tensorflow_js|
│                  tensorflow_saved_model_bundle|onnx]
│                      The weight format to use. [default: None]
│ --devices TEXT       Devices for running the model. [default: None]
│ --decimal INTEGER    The test precision. [default: 4]
│ --help,--version -h  Show this message and exit.
╰────────────────────────────────────────────────────────────────────────────
Hello! I ran into an issue while trying to run the test_model function on an exported model.
I want to discuss whether the following behavior is a conscious choice or more of an oversight.
Sorry in advance for the long explanation 🙂
I am trying to export and test the model from an external library (N2V). I have a sample input image with a large mean and std. I normalize it, run the model and then denormalize the model output. The processing portion of my model rdf looks like this:
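A minimal sketch of what such a processing section can look like (the values and axes below are illustrative assumptions rather than the ones from the original report; it uses the 0.4 model RDF's fixed-mode zero_mean_unit_variance for normalization and scale_linear to undo it):

    preprocessing:
      - name: zero_mean_unit_variance
        kwargs:
          mode: fixed
          axes: yx
          mean: [29000.0]  # illustrative large mean
          std: [3500.0]    # illustrative large std
    ...
    postprocessing:
      - name: scale_linear
        kwargs:
          axes: yx
          gain: [3500.0]    # multiply by the std ...
          offset: [29000.0] # ... and add the mean back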
After running the test_model function on the exported model, the last test fails with the following error:

After some investigation, I believe I was able to localize the issue: there is a slight difference between how the image is handled in zero_mean_unit_variance vs. my code.

When the mean and std values are validated from the model rdf, it is essentially done in this way: mean = np.array(float(mean_string)), which yields a float64 value. After running zero_mean_unit_variance with this value on a float32 input, the input becomes float64.

The problem with this behavior is that in my code, and probably in many other users' code, the input keeps the same type, float32, from the beginning to the end of the inference pipeline (although images with a mean this large are probably rare). If the mean and std values are large, the resulting difference after normalization and denormalization between the output of my pipeline and bioimage core becomes big enough to fail the test: for a mean in the tens of thousands, the float32 resolution is already on the order of 1e-3, well above the default decimal=4 tolerance.
So potential solutions could be one of the following, e.g. respecting the data_type declared in the model rdf in the bioimage core (casting the validated mean and std values to it).

Here is the sample code to reproduce the issue. If you pass the input and output tensors as type np.float32, the test will fail. If you uncomment the conversion of the input and output to np.float64, the test will pass.
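The sample script itself is not reproduced above; as a minimal sketch, the dtype promotion at the heart of the issue can be demonstrated with plain numpy (the mean/std values and the image shape here are made-up assumptions):

    import numpy as np

    # mean/std as they end up after rdf validation: parsed into float64
    mean = np.array([29000.0])  # illustrative value, dtype float64
    std = np.array([3500.0])    # illustrative value, dtype float64

    # a float32 image with a large mean, as in the issue
    image = (np.random.rand(64, 64) * 2 * std[0] + mean[0]).astype(np.float32)

    # core-style round trip: the float32 image is promoted to float64
    norm = (image - mean) / std
    print(norm.dtype)  # float64
    denorm_core = (norm * std + mean).astype(np.float32)

    # user-style round trip: everything stays float32 throughout
    m32, s32 = mean.astype(np.float32), std.astype(np.float32)
    norm32 = (image - m32) / s32
    print(norm32.dtype)  # float32
    denorm_user = norm32 * s32 + m32

    # with a mean this large, the two round trips typically disagree on the
    # order of 1e-3, which exceeds the default decimal=4 test tolerance
    print(np.max(np.abs(denorm_core - denorm_user)))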
Versions:
bioimageio_spec_version: 0.4.9
bioimageio_core_version: 0.5.9