Open marcverhagen opened 2 years ago
I don't think this has anything to do with parameters. All app parameter holding is occurring in clams-python
but the validation error is coming from Mmif
object, which is technically a part of mmif-python
package.
Is base_mmif
object a proper MMIF string? Looks like there was an error during the app's processing and the result of the errored process is not a well-formed MMIF.
Yeah, it does look like it is not the parameter handling. But this does get triggered by a parameter though. If I start a container and do curl http://0.0.0.0:5000?pretty
all is fine. But if I do curl http://0.0.0.0:5000?prettyx
I get
{"message": "Internal Server Error"}
Will spend some quality time tracking that down.
I meant "Yeah, it does look like it is not the parameter handling that is directly responsible for the error."
Oh I see. It's surely an issue with parameter handling then. I'll take a look at the SDK code and see what we can do with invalid/undefined parameters.
[EDIT] It turned out that the SDK does not handle the parameters at all (current code), so it has to be an app in the pipeline that raise some exception when given with an undefined parameter.
By the way, this is new behavior, I know that in the past I have been able to send in any parameters I want, even if the app did not define it. But I will test and see which applications actually have this issue.
Further investigation revealed that the SDK is actually "rejecting" undefined parameters (see https://github.com/clamsproject/clams-python/issues/101). But the error message from the rejection is something different from what's reported in this issue.
I tried two example configurations
py start_pipeline.py config/tokenizer-spacy.yml
py start_pipeline.py config/tokenizer-spacy-params.yml
with an example mmif file with parameters above.
py run_pipeline.py examples/mmif/tokenizer-spacy-2.json out.mmif --params tokenizer-eol=True,spacy-linking=True
But in both cases, I didn't get the above python exception (ill-formed MMIF error).
Yeah, I got the same. But then I noted that the spaCy app is still stuck on clams-python==0.5.0.
More interestingly, I started two versions of app-nlp-example:
$ docker build -t clams-nlp-example:0.0.7 .
$ docker build -t clams-nlp-example:0.0.7-dev -f Dockerfile.dev .
$ docker run --name example1 --rm -d -p 5000:5000 -v $PWD/input/data:/data clams-nlp-example:0.0.7
$ docker run --name example2 --rm -d -p 5005:5000 -v $PWD/input/data:/data clams-nlp-example:0.0.7-dev
Behold the difference between the Gunicorn and Flask versions:
$ curl http://0.0.0.0:5000?prettyx
{"message": "Internal Server Error"}
$ curl http://0.0.0.0:5005?prettyx
...
...
File "/usr/local/lib/python3.6/site-packages/clams/restify/__init__.py", line 180, in cast
raise KeyError(f"An unknown configuration is passed as a parameter: {k} = {v}")
KeyError: 'An unknown configuration is passed as a parameter: prettyx = '
-->
Okay, so the short (and not-so-helpful) error message issue might be coming from flask built-in server, while gunicorn can handle and successfully re-throw the python exception (KeyError
), I guess? The check on invalid parameter was surely added in 0.5.1, and I can revert the change to release 0.5.2 (with updated python in docker images) quite soon, if that's what we want.
To recap, the clams-python
SDK once added code that rejected undefined runtime parameters, treating them as invalid. (https://github.com/clamsproject/clams-python/commit/f18de21d3ce2dd896aab34843bc04f3fabcfb4d6 , included in 0.5.1
release)
Recent PR (https://github.com/clamsproject/clams-python/pull/106) reverted the rejection code and instead implemented issuing of runtime warnings when undefined parameters are passed. This PR will be included in the next release (not included in 0.5.2
).
Closing the issue as addressed.
I shouldn't have closed this issue because the runner codebase isn't yet updated to use the updated clams-python
SDK code. Re-opening this.
The documentation has an example for adde parameters:
However, the new clams-python does not allow undefined parameters:
It is a bit puzzling to me why we get this message. But in general, the pipeline code should check for invalid parameters using the app metadata.