bioimage-io / spec-bioimage-io

Specification for the bioimage.io model description file.
https://bioimage-io.github.io/spec-bioimage-io/
MIT License
18 stars 17 forks source link

[proposal]: Remove version specification in ImJoy plugin. Original title: Get string index error presumably from a bad `oid` in `spec/rdf/v0_2/schema.py` #533

Closed jmetz closed 1 year ago

jmetz commented 1 year ago

While working on a new uploader, getting stuck at the validation step on a model I downloaded from the zoo a few days ago:

Error:

Plugin BIO-RDF-Validator_88xme1698061811072: 
Validation results:
 {'bioimageio_spec_version': '0.4.8', 
'error': 'string index out of range', 
'name': 'bioimageio.spec static validation of RDF ', 
'nested_errors': {}, 
'source_name': 
'{name: NucleiSegmentationBoundaryModel, ...}', 'status': 'failed', 'traceback': [ ... see below ... 
File "/lib/python3.10/site-packages/bioimageio/spec/[commands.py](http://commands.py/)", line 93, in validate\n raw_rd = load_raw_resource_description(rdf_source, update_to_format="latest" if update_format else None)
File "/lib/python3.10/site-packages/bioimageio/spec/io_.py", line 221, in load_raw_resource_description\n raw_rd = schema.load(data)
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 722, in load\n return self._do_load(
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 861, in _do_load\n result = self._deserialize(
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 664, in _deserialize\n value = self._call_and_store(
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 492, in _call_and_store\n value = getter_func(data)
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 657, in getter\n return field_obj.deserialize(
File "/lib/python3.10/site-packages/marshmallow/[fields.py](http://fields.py/)", line 366, in deserialize\n output = self._deserialize(value, attr, data, **kwargs)
File "/lib/python3.10/site-packages/marshmallow/[fields.py](http://fields.py/)", line 784, in _deserialize\n result.append(self.inner.deserialize(each, **kwargs))
File "/lib/python3.10/site-packages/marshmallow/[fields.py](http://fields.py/)", line 366, in deserialize\n output = self._deserialize(value, attr, data, **kwargs)
File "/lib/python3.10/site-packages/bioimageio/spec/shared/[fields.py](http://fields.py/)", line 214, in _deserialize\n return super()._deserialize(value, attr, data, partial, **kwargs)
File "/lib/python3.10/site-packages/marshmallow/[fields.py](http://fields.py/)", line 669, in _deserialize\n return self._load(value, data, partial=partial)
File "/lib/python3.10/site-packages/marshmallow/[fields.py](http://fields.py/)", line 652, in _load\n valid_data = self.schema.load(value, unknown=self.unknown, partial=partial)
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 722, in load\n return self._do_load(
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 861, in _do_load\n result = self._deserialize(
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 664, in _deserialize\n value = self._call_and_store(
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 492, in _call_and_store\n value = getter_func(data)
File "/lib/python3.10/site-packages/marshmallow/[schema.py](http://schema.py/)", line 657, in getter\n return field_obj.deserialize(
File "/lib/python3.10/site-packages/marshmallow/[fields.py](http://fields.py/)", line 367, in deserialize\n self._validate(output)
File "/lib/python3.10/site-packages/marshmallow/[fields.py](http://fields.py/)", line 267, in _validate\n self._validate_all(value)
File "/lib/python3.10/site-packages/marshmallow/[validate.py](http://validate.py/)", line 78, in call\n r = validator(value)
File "/lib/python3.10/site-packages/bioimageio/spec/rdf/v0_2/[schema.py](http://schema.py/)", line 36, in <lambda>\n lambda oid: all(oid[idx] == "-" for idx in [4, 9, 14]),
File "/lib/python3.10/site-packages/bioimageio/spec/rdf/v0_2/[schema.py](http://schema.py/)", line 36, in <genexpr>\n lambda oid: all(oid[idx] == "-" for idx in [4, 9, 14]),\n']

Presumably it would be better to catch that the oid is malformed and report that (assuming that's the cause)?

jmetz commented 1 year ago

Update: running locally seems to be fine on the same model:

python bioimageio/spec/ validate rdf.yaml

./spec-bioimage-io/bioimageio/spec/shared/_resolve_source.py:435: CacheWarning: found cached /tmp/$USER/bioimageio_cache/https/raw.githubusercontent.com/bioimage-io/bioimage.io/main/site.config.json. Skipping download of https://raw.githubusercontent.com/bioimage-io/bioimage.io/main/site.config.json.
  warnings.warn(f"found cached {local_path}. Skipping download of {uri}.", category=CacheWarning)
./spec-bioimage-io/bioimageio/spec/shared/_resolve_source.py:435: CacheWarning: found cached /tmp/$USER/bioimageio_cache/https/bioimage-io.github.io/collection-bioimage-io/collection.json. Skipping download of https://bioimage-io.github.io/collection-bioimage-io/collection.json.
  warnings.warn(f"found cached {local_path}. Skipping download of {uri}.", category=CacheWarning)
bioimageio.spec 0.4.9post4
implementing:
    collection RDF 0.2.3
    general RDF 0.2.3
    model RDF 0.4.9
+
bioimageio.spec.partner 0.4.9post4
implementing:
    partner collection RDF 0.2.3
No validation errors for bioimageio.spec static validation of model RDF 0.4.9
Validation Warnings for bioimageio.spec static validation of model RDF 0.4.9:
{'tags': "Missing tags corresponding to bioimage.io categories: [{'framework': "
         "['tensorflow', 'pytorch', 'tensorflow.js']}, {'software': "
         "['ilastik', 'imagej', 'fiji', 'imjoy', 'deepimagej', 'n2v']}]"}

Does this mean the link of the Imjoy plugin I'm using is old or something?

jmetz commented 1 year ago

@FynnBe - lemme know if I just called an old plugin or something.

Was using:

const validator = await api.getPlugin(
  "https://raw.githubusercontent.com/bioimage-io/spec-bioimage-io/main/scripts/bio-rdf-validator.imjoy.html"
);

in my JS code... doesn't seem like it could be using an old version, unless scripts/bio-rdf-validator.imjoy.html needs to be updated?

FynnBe commented 1 year ago

We'll release the new bioimageio.spec soon, so if you could develop using https://github.com/bioimage-io/spec-bioimage-io/tree/dev for now, that would be great!

jmetz commented 1 year ago

:facepalm: of course - sorry about that!

Switched and get the same results as above:

I can see that the validator (https://github.com/bioimage-io/spec-bioimage-io/blob/dev/scripts/bio-rdf-validator.imjoy.html) is clearly only going to be using a slightly older published version of bioimageio.spec, so it's still running the older code and therefore the string index error occurs.

Really then, this issue can be renamed to something about the imjoy plugin script still pulling an older version of the package (v0.4.8) instead of the current (v0.4.9post4).

jmetz commented 1 year ago

Correction: the command python bioimageio/spec/ validate rdf.yaml no longer runs, as you're presumably removing that as outlined in #516 - so ignore that about being able to run the dev-branch locally from this module :sweat_smile:

jmetz commented 1 year ago

@FynnBe If if update the requirement of bioimageio.spec used in scripts/bio-rdf-validator.imjoy.html - how do you feel about dropping the version altogether... :eyes:

I know version specification is generally good... but with a web-app I'm not so sure. My feeling is it should always be the latest version, so just switching from

https://github.com/bioimage-io/spec-bioimage-io/blob/54821967eff84965d9027ab54544ea12e8abc80f/scripts/bio-rdf-validator.imjoy.html#L17C21-L17C36

(edit: opps, thought that would render the lines...)

  "requirements": ["bioimageio.spec==0.4.8"],

To

   "requirements": ["bioimageio.spec"],

would make most sense to me at the moment. Alternatively I can pop in the latest version of the spec... and someone has to remember to update all the references every time the spec is updated :shrug:

jmetz commented 1 year ago

@oeway I guess you should probably weigh in on this:

@FynnBe If if update the requirement of bioimageio.spec used in scripts/bio-rdf-validator.imjoy.html - how do you feel about dropping the version altogether... šŸ‘€

I know version specification is generally good... but with a web-app I'm not so sure. My feeling is it should always be the latest version, so just switching from

https://github.com/bioimage-io/spec-bioimage-io/blob/54821967eff84965d9027ab54544ea12e8abc80f/scripts/bio-rdf-validator.imjoy.html#L17C21-L17C36

(edit: opps, thought that would render the lines...)

  "requirements": ["bioimageio.spec==0.4.8"],

To

   "requirements": ["bioimageio.spec"],

would make most sense to me at the moment. Alternatively I can pop in the latest version of the spec... and someone has to remember to update all the references every time the spec is updated šŸ¤·

FynnBe commented 1 year ago

@FynnBe If if update the requirement of bioimageio.spec used in scripts/bio-rdf-validator.imjoy.html - how do you feel about dropping the version altogether... šŸ‘€

I'll let @oeway decide that... ideally we would have tests for it, but as we currently don't it might just start silently failing on a version bump, which is not desirable. How about we add a test that just checks that the version pinned is the latest?

jmetz commented 1 year ago

Yeah I see what mean, and having a test that it's using the latest version would also work. That said, the end result of these two approaches is the same - the latest version will be used.. with one it happens automatically, and with the other it will requires manual updates, so my vote at least would be to go with the simple option (remove version specification so that the latest is always used).

What do you think @oeway ?

oeway commented 1 year ago

Hey, we actually started with no version specified, then we experienced issues, as @FynnBe mentioned some silent broken of the upload functionality. So I would prefer to stick with the fixed version approach. @jmetz If we want to do latest later on, maybe you can help contributing a CI script in the spec repo to make sure it will pass the pyodide environment test. Let's work on that at some point! Make sense?

jmetz commented 1 year ago

@oeway - yup, makes sense, and closing for now.

Adding what would be my approach "for the record", but don't feel strongly about it either - both work :shrug: .

So I would probably be to go with use-latest approach anyway (as it's a web app), and also devote effort to making sure that errors are never silent type thing. Then if (hopefully only occasionally) something does break with an update, a version can always be specified temporarily while a fix is worked on. This step would be no different with the always-specify-version option, as we would also have to do this in that scenario. Happy to discuss further if anyone's interested :)