arcalot / arcaflow-engine

Arcaflow is a highly-portable workflow engine enabling modular and validated pipelines through containerized plugins.
https://arcalot.io/arcaflow/
Apache License 2.0
6 stars 9 forks source link

fix output schema bug #163

Closed mfleader closed 3 months ago

mfleader commented 3 months ago

Changes introduced with this PR

Reproducible Example

Writing an outputSchema in a workflow, as seen below, results in the subsequent unexpected error.

workflow.yaml

version: v0.2.0
input:
  root: RootObject
  objects:
    RootObject:
      id: RootObject
      properties:
        name:
          type:
            type_id: string
steps:
  example:
    plugin: 
      src: quay.io/arcalot/arcaflow-plugin-template-python:0.2.1
      deployment_type: image
    input:
      name: !expr $.input.name
outputs:
  success:
    message: !expr $.steps.example.outputs.success.message
outputSchema:
  success:
    schema:
      root: RootObjectOut
      objects: 
        RootObjectOut: 
          id: RootObjectOut
          properties:
            message:
              type:
                type_id: string
Unexpected error: invalid workflow (
  Validation failed for 'outputSchema': 
    Field cannot be set (
      reflect.Value.Convert: value of type map[string]*schema.StepOutputSchema cannot be converted 
      to type map[string]interface {}))

By contributing to this repository, I agree to the contribution guidelines.

jaredoconnell commented 3 months ago

Obviously there is a bug in the existing code. My question is whether this is the correct approach. An any schema defers input validation to after the schema validation steps. I'm going to investigate this.

jaredoconnell commented 3 months ago

The original error appears to be happening on the line

f.Set(v.Convert(f.Type()))

inside of the anonymous function in unserializeToStruct in ObjectSchema. To hit the problem case, create a breakpoint within the loop in that function to stop when key == "outputSchema"

For some reason, it's expecting a map with the value of interface{}, but instead is getting *schema.StepOutputSchema. It is getting the expected value from the field of the default value.

jaredoconnell commented 3 months ago

I experimented with changing the struct Workflow field OutputSchema's data type to map[string]*schema.StepOutputSchema. When I did that, the data validation passed. But the error is now in Prepare in executor.go:

outputSchemaData, err := schema.DescribeStepOutput().Unserialize(workflow.OutputSchema[outputID])
Unexpected error: unable to decode workflow output schema success (Validation failed: Must be a map, *schema.StepOutputSchema given)

So this is worth investigating further.

webbnh commented 3 months ago

So this is worth investigating further.

Matt and I did that, and I think we have everything worked out (we'll know when he updates the PR...).

He took your suggestion and changed the OutputSchema type in the Workflow struct to match the schema, and then we noted that the failure that you pointed out is due to a double-unserialization. With that fixed, things look like they are working again.

mfleader commented 3 months ago

So this is worth investigating further.

Matt and I did that, and I think we have everything worked out (we'll know when he updates the PR...).

He took your suggestion and changed the OutputSchema type in the Workflow struct to match the schema, and then we noted that the failure that you pointed out is due to a double-unserialization. With that fixed, things look like they are working again.

Oh I figured it out on my own