microbiomedata / nmdc-runtime

Runtime system for NMDC data management and orchestration
https://microbiomedata.github.io/nmdc-runtime/
Other
5 stars 3 forks source link

Bump `nmdc-schema` from `10.3.0` to `10.5.5` #560

Closed eecavanna closed 3 months ago

eecavanna commented 3 months ago

In this branch, I bumped nmdc-schema from 10.3.0 to 10.5.4 in requirements/main.in, then I ran $ make update-deps to synchronize the other requirements files.

eecavanna commented 3 months ago

The GHA workflow failed. Looks like one of the containers didn't start up "in time":

docker compose --file docker-compose.test.yml run test
time="2024-06-18T20:11:26Z" level=warning msg="/home/runner/work/nmdc-runtime/nmdc-runtime/docker-compose.test.yml: `version` is obsolete"
 Container dagster-postgresql  Running
 Container mongo  Running
 Container dagster-daemon  Running
 Container fastapi  Created
 Container dagster-dagit  Running
 Container fastapi  Starting
 Container fastapi  Started
wait-for-it.sh: waiting 300 seconds for fastapi:8000
wait-for-it.sh: timeout occurred after waiting 300 seconds for fastapi:8000
wait-for-it.sh: strict mode, refusing to execute subprocess
make: *** [Makefile:39: test-run] Error 124
Error: Process completed with exit code 2.
eecavanna commented 3 months ago

I've seen this symptom before, in other Runtime PRs. I'll re-run the GHA workflow now (with no changes to the code).

eecavanna commented 3 months ago

I reproduced the failure locally and got more information. According to the output of make up-test followed by make test in my local development environment, the Runtime is not booting successfully. Instead, it is outputting this error:

ModuleNotFoundError: No module named 'nmdc_schema.nmdc_schema_accepting_legacy_ids'

The offending import statement is in nmdc_runtime/util.py (line 19).

Locally, I tried changing that line as follows:

- from nmdc_schema.nmdc_schema_accepting_legacy_ids import Database as NMDCDatabase
+ from nmdc_schema.nmdc import Database as NMDCDatabase

That eliminated the original error message (which I'll refer to as the "first-level" error), but then revealed this one (which I'll refer to as the "second-level" error):

fastjsonschema.exceptions.JsonSchemaDefinitionException: Unresolvable ref: WorkflowExecutionActivity
eecavanna commented 3 months ago

Hi @turbomam,

I checked the JSON Schema file (nmdc_schema/nmdc.schema.json) in version 10.5.4 of the nmdc-schema repository and saw that it, indeed, does not contain a definition of the WorkflowExecutionActivity class, although it does contain several refs to it (for example, this one).

@aclum mentioned that she expects that definition to be present in that version of the schema.

I looked into the history of that file in an attempt to determine when the definition was last present in the file. I found that—in terms of Git tags—it was present in v10.4.0 and absent in v10.5.0.

I wanted to bring this to your attention. I am curious what you think about it. I want to reconcile @aclum's expectation that the class definition be present, with my observation that it is absent.

pkalita-lbl commented 3 months ago

I believe the root cause is this commit which added abstract: true to the WorkflowExecutionActivity class.

As it stands today LinkML's JSON Schema generator omits abstract classes from its output during its squashing of LinkML's class hierarchy-based representation into JSON Schema's flat representation. Normally that doesn't cause a problem, but apparently it does in this case, for some reason.

I wonder if the quickest way forward right now is to remove abstract: true from the WorkflowExecutionActivity class?

turbomam commented 3 months ago

I would prefer to use the abstract: true feature as a way of educating schema users and validating data, rather than ensuring which classes are included in the JSON schema file.

I would also prefer that more of our software used the YAML, Python or SchemaView() representations of the schema rather than other derived formats, which may not retain all of the power and expressiveness of the schema

Having said that, I'm not in a position to deliver a good short-term solution for this problem, so I won't object to removing the

WorkflowExecutionActivity:
    abstract: true

as long as we take other steps to ensure that instances of this class aren't allowed in the data. Or if all else fails, an issue about this is opened in the appropriate repo (linkml/linkml?) and the class gets some documentation about the problem, including a see_also link to the GH issue.

pkalita-lbl commented 3 months ago

an issue about this is opened in the appropriate repo (linkml/linkml?)

Yes I definitely agree with that. Although as I started to try and narrow down a simple example to illustrate the issue I'm coming to realize that there may be more to it than I initially thought. I'll update again here once I'm more confident that's the right path forward.

eecavanna commented 3 months ago

Thanks, @pkalita-lbl and @turbomam!

The commit @pkalita-lbl referenced in his comment was introduced as part of the same repository release (i.e. v10.5.0) in which the definition of the WorkflowExecutionActivity starting being absent from the JSON Schema.

One thing I want to point out is that, while the definition of the class was absent, the JSON Schema file still contained references to that (absent) class, which I think is what the error message from the Runtime was referring to.

eecavanna commented 3 months ago

As it stands today LinkML's JSON Schema generator omits abstract classes from its output during its squashing of LinkML's class hierarchy-based representation into JSON Schema's flat representation.

I am wondering whether it omits the class definition, but preserves references (if any) to that class definition. If that is the case, there may be a bug in LinkML's JSON Schema generator (because I don't think references like that would be resolvable).

pkalita-lbl commented 3 months ago

there may be a bug in LinkML's JSON Schema generator

Yes I believe that's true, but I think it needs some further characterization before even thinking about what a potential fix will be. In the meantime I don't want that work to hold this up.

pkalita-lbl commented 3 months ago

I added a new commit to upgrade to the new nmdc-schema version 10.5.5 which should resolve the JSON Schema issues mentioned above.