Closed palchicz closed 2 years ago
Thank you for opening your first issue here! đź›
@palchicz this is because of the underlying dataclass-json library. We are actually working on supporting #1362 Pickle types. But, pickle is not native to other languages and wont show up in the UI. Whats the usecase?
Thanks for the quick response! We have a collection of ~10 workflow inputs that are shared among several workflows. These inputs are a collection of closely related parameters, and they always show up together across different workflows, so it makes sense to group them together. Consolidating them into a single dataclass made sense, however, one of these shared inputs is an enum and thus we run into the problem stated above.
For the time being, we can still use the dataclass to group 9/10 of the parameters together so we don't have to repeatedly and explicitly lay them out for each workflow. It's just awkward to have the one odd parameter that we do need to add to the workflow definition explicitly just because it's an enum.
Here are some code example to help clarify what I said above:
class EnumParam(Enum):
OPTION_1 = "option 1"
OPTION_2 = "option 2"
# Naive approach
@workflow
def workflow_1(param_1: int, param_2: int, ...., param_n: EnumParam) -> int:
...
@workflow
def workflow_2(param_1: int, param_2: int, ...., param_n: EnumParam) -> str:
...
# Envisioned Approach
@dataclass_json
@dataclass
class WorkflowParameters(object):
param_1: int = 0
param_2: int = 0
...
param_n: EnumParam = EnumParam.OPTION_1
@workflow
def workflow_1(workflow_params: WorkflowParameters) -> int:
...
@workflow
def workflow_2(workflow_params: WorkflowParameters) -> str:
...
# Workaround
@dataclass_json
@dataclass
class WorkflowParameters(object):
param_1: int = 0
param_2: int = 0
...
@workflow
def workflow_1(workflow_params: WorkflowParameters, param_n: EnumParam) -> int:
...
@workflow
def workflow_2(workflow_params: WorkflowParameters, param_n: EnumParam) -> str:
...
@palchicz appreciate giving more clarification. your request absolutely makes sense. One follow up question, do you use the Enum Form in the UI - as shown below?
This UI feature will not work with dataclasses with nested enums, that is because dataclass_json does not support enum schema extraction AFAICT. But JSON actually supports that. Let us try that.
In anycase enums should be supported by dataclasses, and I think we just need to enable one more library for that? marshmallow_enum - https://github.com/fuhrysteve/marshmallow-jsonschema/blob/master/marshmallow_jsonschema/base.py#L22-L27.
Can you try installing the library and help us debug, and please jump into our slack channel - here and we can work on it with you.
One follow up question, do you use the Enum Form in the UI - as shown below?
That would be ideal, but not a deal breaker. At the very least there isn't any urgency, so maybe the dataclass_json people will add that feature before too long :smiley: :crossed_fingers:
I can certainly jump in and help. I should have some time next week. I'll join the slack channel like you suggested.
One follow up question, do you use the Enum Form in the UI - as shown below?
That would be ideal, but not a deal breaker. At the very least there isn't any urgency, so maybe the dataclass_json people will add that feature before too long :smiley: :crossed_fingers:
I can certainly jump in and help. I should have some time next week. I'll join the slack channel like you suggested.
Awesome, see you soon. Also, if the form is not required- enums should be supported. I think the enum marshmallow library might work
In anycase enums should be supported by dataclasses, and I think we just need to enable one more library for that? marshmallow_enum - https://github.com/fuhrysteve/marshmallow-jsonschema/blob/master/marshmallow_jsonschema/base.py#L22-L27.
Can you try installing the library and help us debug
I took a stab at debugging this problem and noticed that marshmallow_enum
is already installed. The issue is this line here in the mashmallow_jsonschema
library which raises an exception when trying to convert the marshmallow enum field into a json schema enum. If you trace the code, it wouldn't get to this point unless marshmallow_enum
was installed.
The exact reason it raises an exception is because the marshmallow Enum field has the property field.load_by == LoadDumpOptions.value
(which gets set here). The matshmallow_json libraby is probably a little too restrictive in that it just errors out whenever it encounters this case. It should probably allow strings even when the field was loaded by value.
Anyways, that's as far as I got. No solution to propose, but that at least seems to be the issue.
We are working on adding more type support into Dataclass, we will take a look at this as part of the work - cc @pingsutw
Motivation: Why do you think this is important? Since dataclasses are a supported type and Enums are a supported type, one would expect that dataclasses that reference a enum should also be supported. Instead, the protobuf fails to form properly and I get the following warning from the command
pyflyte --pkgs <PACKAGE_NAME> serialize --image <IMAGE_NAME> workflows
Goal: What should the final outcome look like, ideally? I should be able to define a dataclass and enum like the following:
Describe alternatives you've considered I've considered implementing a new custom type.
[Optional] Propose: Link/Inline OR Additional context This might have its root cause in this dependency https://github.com/fuhrysteve/marshmallow-jsonschema/blob/master/marshmallow_jsonschema/base.py#L228