StackStorm / st2

StackStorm (aka "IFTTT for Ops") is event-driven automation for auto-remediation, incident responses, troubleshooting, deployments, and more for DevOps and SREs. Includes rules engine, workflow, 160 integration packs with 6000+ actions (see https://exchange.stackstorm.org) and ChatOps. Installer at https://docs.stackstorm.com/install/index.html
https://stackstorm.com/
Apache License 2.0
6.07k stars 749 forks source link

SET Type is not JSON serializable #5265

Closed guzzijones closed 3 years ago

guzzijones commented 3 years ago

SUMMARY

.toSet() yaql is no longer serialized to json by the workflow engine. Provide a quick summary of your bug report. While running a workflow with large input json I stumbled upon this serialization error for the workflow engine.

st2workflowengine.log:2021-05-12 17:49:46,697 140687301693512 ERROR workflows [-] [609c121c04f287f5ebf421a5] {'type': 'error', 'message': 'TypeError: Type is not JSON serializable 

STACKSTORM VERSION

st2 3.5dev (7af7122), on Python 3.6.8

OS, environment, install method

Centos 7

Steps to reproduce the problem

Create a workflow with a yaql statement that outputs a set .toSet()

ie

version: 1.0
tasks:
  valid:
    action: core.noop
    next:
      - when: <%succeeded()%>
        publish:
          - valid: <% list(alist).toSet() %>
vars:
  - alist: [1,2,3,4]

Expected Results

Workflow does not fail

Actual Results

st2workflowengine.log:2021-05-12 17:49:46,697 140687301693512 ERROR workflows [-] [609c121c04f287f5ebf421a5] {'type': 'error', 'message': 'TypeError: Type is not JSON serializable  set', 'task_id': 'valid_emails', 'route': 1}

workflow fails

guzzijones commented 3 years ago

It turns out orjson will not dump sets like the previous code did. https://github.com/StackStorm/st2/blob/940219703f420df940938bfa342ba58c3a3ad85e/st2common/st2common/fields.py#L464

Kami commented 3 years ago

Good catch - we should also start adding some tests for this functionality (aka serializing sets - it's weird we had no existing orquesta or similar test which would have caught that).

guzzijones commented 3 years ago

I am not sure where to fix this? It doesn't look like orjson can have a custom serializer for dict inputs.

guzzijones commented 3 years ago

maybe this 'default' parameter: https://github.com/ijl/orjson#default I will try it

guzzijones commented 3 years ago

yup that works. PR incoming

guzzijones commented 3 years ago

I am going to add a unit test for this.

Kami commented 3 years ago

Since this is a performance critical change, we also need a good set of micro benchmarks for it (can extend existing ones to cover new situations and various sizes with mixed types + sets).

IIRC, using custom serializer / default adds (some) overhead, but microbenchmarks results should answer that objectively and then we can decide how to proceed (since set is not used frequently and likely only in orquesta workflow it may actually make sense to fall back on orjson.dumps with custom serialization / defaults in case standard orjson.dumps throws or similar).

guzzijones commented 3 years ago

ok I added a PR https://github.com/StackStorm/st2/pull/5267