Closed arcaputo3 closed 2 months ago
Hi @arcaputo3
Thanks for submitting this. I've had a look, and it's a little tricky to cater to this specific use case. You are correct that the main issue is the `Union[TextBlock, ToolUseBlock]` definition. Spark doesn't have the concept of union types, so it's hard to translate this in any general way. In some parts of the code we look for the first element defined in the union, but that doesn't feel like an appropriate solution here.

The override approach would work, but unfortunately the way I've implemented it makes it a little difficult to use, so I'll need to spend some time working on a better solution. I'm thinking you could do something like this:
```python
from typing import Union

from pydantic import Field
from pyspark.sql.types import MapType, StringType
from sparkdantic import SparkModel


class A(SparkModel):
    a: str


class B(SparkModel):
    b: str


class UnionOverride(SparkModel):
    mapping: Union[A, B] = Field(spark_type=MapType(StringType(), StringType()))
```
Then you'd at least have an 'escape hatch' that lets you represent more generic types as a `MapType`. What are your thoughts?
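On the Python side, representing a union value as a string-to-string map could look something like the sketch below. The dataclasses and helper are hypothetical stand-ins (not Anthropic or sparkdantic code); they just illustrate what a `MapType(StringType(), StringType())` column would hold:

```python
from dataclasses import asdict, dataclass
from typing import Dict, Union


@dataclass
class TextBlock:
    type: str
    text: str


@dataclass
class ToolUseBlock:
    type: str
    name: str


def to_string_map(block: Union[TextBlock, ToolUseBlock]) -> Dict[str, str]:
    # Flatten whichever union member we got into a str -> str map,
    # mirroring the MapType(StringType(), StringType()) column in Spark.
    return {key: str(value) for key, value in asdict(block).items()}


print(to_string_map(TextBlock(type="text", text="hello")))
# {'type': 'text', 'text': 'hello'}
```

The trade-off is that all member fields are stringified, so per-field Spark types are lost in exchange for a schema that can hold any union member.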
Hi @mitchelllisle, this should work for now. Thank you for the reply!
I'm attempting to generate a Spark schema for Anthropic `Message` types. For certain types, we attempt to check if it is a subclass of `Enum`, but this fails when the type itself is not a class. I will continue to investigate, but I believe this is due to `Message` having a `content` field of type `content: List[ContentBlock]`, where:

```python
ContentBlock = Annotated[Union[TextBlock, ToolUseBlock], PropertyInfo(discriminator="type")]
```
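To illustrate the failure mode, here is a minimal, self-contained sketch (the stand-in classes and metadata string are hypothetical, not the real Anthropic types): calling `issubclass` on a typing construct raises `TypeError`, while `get_origin`/`get_args` inspect it safely.

```python
from enum import Enum
from typing import Annotated, Union, get_args, get_origin


# Hypothetical stand-ins for Anthropic's block types.
class TextBlock: ...
class ToolUseBlock: ...


ContentBlock = Annotated[Union[TextBlock, ToolUseBlock], "discriminator metadata"]

# The annotation is a typing construct, not a class, so issubclass blows up:
try:
    issubclass(ContentBlock, Enum)
except TypeError:
    print("issubclass raised TypeError")

# Safe introspection goes through get_origin / get_args instead:
inner = get_args(ContentBlock)[0]   # Union[TextBlock, ToolUseBlock]
print(get_origin(inner) is Union)   # True
print(get_args(inner))              # the two member classes
```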
Traceback:
https://github.com/mitchelllisle/sparkdantic/blob/a3f88bd5cda82f66116ea0e82490fbab21014123/src/sparkdantic/model.py#L385
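A common fix for this class of failure is to guard the `issubclass` call so it only runs on actual classes. A minimal sketch of that pattern (a hypothetical helper, not the code at the linked line):

```python
from enum import Enum
from typing import Union


def is_enum_type(tp: object) -> bool:
    # issubclass raises TypeError when tp is a typing construct such as
    # Union[...]; checking isinstance(tp, type) first avoids that.
    return isinstance(tp, type) and issubclass(tp, Enum)


class Color(Enum):
    RED = 1


print(is_enum_type(Color))            # True
print(is_enum_type(str))              # False
print(is_enum_type(Union[int, str]))  # False (would raise without the guard)
```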