Open rakesh-163 opened 2 months ago
Something in sqlalchemy is calling os.environ, but they choose to swallow that stack trace and wrap it with "One or more mappers failed to initialize", so you can't see where. I would recommend not using sqlalchemy ORM objects inside a workflow, but instead having simpler dataclass objects that you translate to sqlalchemy equivalents in activities as needed.
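A minimal sketch of that split, assuming a story with just an id and a title. The names StoryData, StoryRow, to_data and to_row are hypothetical, and StoryRow is only a plain stand-in for the real SQLModel/SQLAlchemy class so that the example runs without a database:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoryData:
    """Plain value object; this is what the workflow sends and receives."""

    id: int
    title: str


class StoryRow:
    """Hypothetical stand-in for the ORM-mapped Story model.

    In a real project this would be the SQLModel/SQLAlchemy class,
    imported and instantiated only inside activity code.
    """

    def __init__(self, id: int, title: str) -> None:
        self.id = id
        self.title = title


def to_data(row: StoryRow) -> StoryData:
    # Called at the end of an activity, before returning to the workflow.
    return StoryData(id=row.id, title=row.title)


def to_row(data: StoryData) -> StoryRow:
    # Called at the start of an activity that needs to write to the database.
    return StoryRow(id=data.id, title=data.title)
```

With this layout the activity's declared return type would be StoryData, so decoding the activity result inside the workflow never has to construct a mapped class.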
Hey Chad, Thanks for the reply. Appreciate that you looked inside the sqlalchemy codebase. Could you point me to the line of code that is doing the os.environ call? I could not find it when I grepped for it.
> Could you point me to the line of code that is doing the os.environ call? I could not find it when I grepped for it.
It may be nested in something else and not directly called. To debug, first you'd need to patch sqlalchemy to not swallow the true stack trace of why a mapper fails to initialize. Probably around https://github.com/sqlalchemy/sqlalchemy/blob/rel_2_0_35/lib/sqlalchemy/orm/mapper.py#L4232-L4252. You need the true stack trace of that exception. Regardless, I would not recommend using sqlalchemy models in workflows because they are likely non-deterministic.
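As a less invasive alternative to patching, one could try walking the exception chain wherever the wrapped error is caught. This is only a sketch: iter_exception_chain and format_root_cause are hypothetical helpers, and the `_configure_failed` attribute is an assumption about where sqlalchemy may stash the original exception on the wrapper, so verify it against the linked source before relying on it.

```python
import traceback
from typing import Iterator, Optional


def _next_link(exc: BaseException) -> Optional[BaseException]:
    # Assumption: the wrapper may carry the original exception on a private
    # attribute named _configure_failed; fall back to the standard chain.
    stashed = getattr(exc, "_configure_failed", None)
    if isinstance(stashed, BaseException):
        return stashed
    return exc.__cause__ or exc.__context__


def iter_exception_chain(exc: BaseException) -> Iterator[BaseException]:
    """Yield exc, then each underlying exception, outermost first."""
    seen = set()
    current: Optional[BaseException] = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))  # guard against reference cycles
        yield current
        current = _next_link(current)


def format_root_cause(exc: BaseException) -> str:
    """Return the formatted traceback of the innermost exception."""
    root = list(iter_exception_chain(exc))[-1]
    return "".join(
        traceback.format_exception(type(root), root, root.__traceback__)
    )
```

Logging format_root_cause(err) in an except block around the failing model construction should show the frames of the original failure, if the original exception object is still reachable from the wrapper.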
What are you really trying to do?
I have an activity that calls a function that performs database operations. The first operation in that activity is a read. The SQLModel that I am trying to read is called a "Story".
Describe the bug
The activity runs fine until it starts to fail, which in turn fails the workflow that calls it. The error says the workflow is accessing os.environ.get (see stack trace below), but the function the activity calls contains no such call. So either the failure should not occur, or the error is reporting the wrong reason. In either case, it is a bug.
Also, I have scoured the SQLAlchemy library for any sign of an os.environ.get call and did not find one.
I have also marked all external library imports as pass-through at this point...
Any help would be appreciated!
Minimal Reproduction
It is hard to reproduce because most of the time the activity just succeeds. This particular activity usually starts to fail after an hour or so.
Environment/Versions
Additional context
Here's the full stack trace that I see when the failure occurs:
{"message":"Failed decoding arguments","stackTrace":" File \"/usr/local/lib/python3.12/site-packages/temporalio/worker/_workflow_instance.py\", line 326, in activate\n self._apply(job)\n\n File \"/usr/local/lib/python3.12/site-packages/temporalio/worker/_workflow_instance.py\", line 422, in _apply\n self._apply_resolve_activity(job.resolve_activity)\n\n File \"/usr/local/lib/python3.12/site-packages/temporalio/worker/_workflow_instance.py\", line 654, in _apply_resolve_activity\n ret_vals = self._convert_payloads(\n ^^^^^^^^^^^^^^^^^^^^^^^\n\n File \"/usr/local/lib/python3.12/site-packages/temporalio/worker/_workflow_instance.py\", line 1563, in _convert_payloads\n raise RuntimeError(\"Failed decoding arguments\") from err\n","cause":{"message":"One or more mappers failed to initialize - can't proceed with initialization of other mappers. Triggering mapper: 'Mapper[Story(story)]'. Original exception was: Cannot access os.environ.get from inside a workflow. If this is code from a module not used in a workflow or known to only be used deterministically from a workflow, mark the import as pass through.","stackTrace":" File \"/usr/local/lib/python3.12/site-packages/temporalio/worker/_workflow_instance.py\", line 1555, in _convert_payloads\n return self._payload_converter.from_payloads(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n File \"/usr/local/lib/python3.12/site-packages/temporalio/converter.py\", line 307, in from_payloads\n values.append(converter.from_payload(payload, type_hint))\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n File \"/usr/local/lib/python3.12/site-packages/temporalio/converter.py\", line 583, in from_payload\n obj = value_to_type(type_hint, obj, self._custom_type_converters)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n File \"/usr/local/lib/python3.12/site-packages/temporalio/converter.py\", line 1533, in value_to_type\n return getattr(hint, \"parse_obj\")(value)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n File \"/usr/local/lib/python3.12/site-packages/typing_extensions.py\", line 2853, in wrapper\n return arg(*args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^\n\n File \"/usr/local/lib/python3.12/site-packages/sqlmodel/main.py\", line 951, in parse_obj\n return cls.model_validate(obj, update=update)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n File \"/usr/local/lib/python3.12/site-packages/sqlmodel/main.py\", line 848, in model_validate\n return sqlmodel_validate(\n ^^^^^^^^^^^^^^^^^^\n\n File \"/usr/local/lib/python3.12/site-packages/sqlmodel/_compat.py\", line 311, in sqlmodel_validate\n new_obj = cls()\n ^^^^^\n\n File \"<string>\", line 4, in __init__\n\n File \"/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/state.py\", line 566, in _initialize_instance\n manager.dispatch.init(self, args, kwargs)\n\n File \"/usr/local/lib/python3.12/site-packages/sqlalchemy/event/attr.py\", line 497, in __call__\n fn(*args, **kw)\n\n File \"/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/mapper.py\", line 4396, in _event_on_init\n instrumenting_mapper._check_configure()\n\n File \"/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/mapper.py\", line 2388, in _check_configure\n _configure_registries({self.registry}, cascade=True)\n\n File \"/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/mapper.py\", line 4204, in _configure_registries\n _do_configure_registries(registries, cascade)\n\n File \"/usr/local/lib/python3.12/site-packages/sqlalchemy/orm/mapper.py\", line 4241, in _do_configure_registries\n raise e\n","applicationFailureInfo":{"type":"InvalidRequestError"}},"applicationFailureInfo":{"type":"RuntimeError"}}
Also, I am using this sandboxed runner on the workers to help deal with the datetime issue that the Pydantic models pose. I am not sure if this interacts with the converter bits in the stack trace above.
def new_sandbox_runner() -> SandboxedWorkflowRunner:
    # TODO(cretz): Use with_child_unrestricted when https://github.com/temporalio/sdk-python/issues/254