Executions cannot proceed due to errors in st2actionrunner and st2scheduler

yypptest commented 2 years ago

SUMMARY

We picked up StackStorm v3.6.0 and deployed it on Kubernetes. The pods start normally, but executions always get stuck in the requested status. The st2scheduler pod shows the log messages below, and st2actionrunner shows similar exceptions.

2021-12-06 06:42:44,326 DEBUG [-] Using cached coordinator instance: <tooz.drivers.etcd.EtcdDriver object at 0x7f68890b6b70>
2021-12-06 06:42:44,327 ERROR [-] Traceback (most recent call last):

2021-12-06 06:42:44,327 ERROR [-]
2021-12-06 06:42:44,328 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/eventlet/hubs/poll.py", line 111, in wait
    listener.cb(fileno)

2021-12-06 06:42:44,328 ERROR [-]
2021-12-06 06:42:44,328 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/eventlet/greenthread.py", line 221, in main
    result = function(*args, **kwargs)

2021-12-06 06:42:44,328 ERROR [-]
2021-12-06 06:42:44,328 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/metrics/base.py", line 216, in wrapper
    return func(*args, **kw)

2021-12-06 06:42:44,328 ERROR [-]
2021-12-06 06:42:44,328 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2actions/scheduler/handler.py", line 314, in _handle_execution
    self._schedule(liveaction_db, execution_queue_item_db)

2021-12-06 06:42:44,328 ERROR [-]
2021-12-06 06:42:44,328 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2actions/scheduler/handler.py", line 422, in _schedule
    self._update_to_scheduled(liveaction_db, execution_queue_item_db)

2021-12-06 06:42:44,328 ERROR [-]
2021-12-06 06:42:44,328 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2actions/scheduler/handler.py", line 481, in _update_to_scheduled
    publish=False,

2021-12-06 06:42:44,328 ERROR [-]
2021-12-06 06:42:44,328 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/services/action.py", line 236, in update_status
    liveaction, set_result_size=set_result_size

2021-12-06 06:42:44,328 ERROR [-]
2021-12-06 06:42:44,328 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/st2common/services/executions.py", line 199, in update_execution
    with coordination.get_coordinator().get_lock(liveaction_db.id):

2021-12-06 06:42:44,328 ERROR [-]
2021-12-06 06:42:44,329 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/tooz/drivers/etcd.py", line 255, in get_lock
    return EtcdLock(self.lock_encoder.check_and_encode(name), name,
2021-12-06 06:42:44,329 ERROR [-]
2021-12-06 06:42:44,329 ERROR [-]   File "/opt/stackstorm/st2/lib/python3.6/site-packages/tooz/utils.py", line 40, in check_and_encode
    " or binary type and not %s" % type(name))

2021-12-06 06:42:44,329 ERROR [-]
2021-12-06 06:42:44,329 ERROR [-] TypeError: Provided lock name is expected to be a string or binary type and not <class 'bson.objectid.ObjectId'>

STACKSTORM VERSION

st2: 3.6.0

OS, environment, install method

Deployed with the Helm chart to OCP 4.8
Coordinator: etcd

Steps to reproduce the problem

Deploy StackStorm 3.6.0 with the Helm chart to OCP 4.8
All pods start normally
Run basic actions; all of them get stuck in the requested status
Check the logs and find the error TypeError: Provided lock name is expected to be a string or binary type and not <class 'bson.objectid.ObjectId'>

Expected Results

Actions should proceed normally, even with the etcd coordinator.

Actual Results

An error is reported when processing actions, and executions never leave the requested status.


Thanks!

yypptest commented 2 years ago

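The exception seems to come from code that is new in v3.6.0, in the file /opt/stackstorm/st2/lib/python3.6/site-packages/st2common/services/executions.py. It is raised by the call coordination.get_coordinator().get_lock(liveaction_db.id): liveaction_db.id is a bson ObjectId, while the etcd driver in tooz only accepts a string or binary lock name.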
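
To illustrate the type mismatch without needing a running StackStorm or etcd, here is a minimal sketch. FakeObjectId and check_lock_name are hypothetical stand-ins written for this comment (they are not the real bson or tooz code), and the id value is made up. The check only mimics the behaviour visible in the traceback: anything that is not str or bytes is rejected. Casting the id to str before requesting the lock looks like a plausible workaround, but that is an assumption and not a confirmed fix.

```python
class FakeObjectId:
    """Hypothetical stand-in for bson.objectid.ObjectId (not the real class)."""

    def __init__(self, hex_value):
        self._hex = hex_value

    def __str__(self):
        return self._hex


def check_lock_name(name):
    """Simplified stand-in for the lock-name type check seen in the traceback."""
    if not isinstance(name, (str, bytes)):
        raise TypeError(
            "Provided lock name is expected to be a string"
            " or binary type and not %s" % type(name)
        )
    # Encode str names to bytes; pass bytes through unchanged.
    return name.encode("ascii") if isinstance(name, str) else name


# Made-up id value, for illustration only.
liveaction_id = FakeObjectId("61adb0e4a1b2c3d4e5f60718")

# Passing the raw id object reproduces the same kind of TypeError.
try:
    check_lock_name(liveaction_id)
    raw_id_accepted = True
    error_message = ""
except TypeError as exc:
    raw_id_accepted = False
    error_message = str(exc)

# Casting to str first passes the check.
encoded_name = check_lock_name(str(liveaction_id))
```

In terms of the failing line, the equivalent change would be get_lock(str(liveaction_db.id)) instead of get_lock(liveaction_db.id); whether that is the right place to do the conversion is for the maintainers to decide.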