Closed cbini closed 6 months ago
Hi @cbini
That seems a useful enhancement indeed.
Would this work for everybody, always? I agree that Pydantic's notion of "title" maps nicely onto JSON Schema "title".
Without knowing JSON Schema though, I might interpret Pydantic's "title" concept as a human-friendly name in addition to the system-friendly class name. I guess Pydantic allows any string as title? Say including spaces? That would not necessarily lead to a valid Avro record schema...
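To make that concern concrete, here is a minimal sketch of the naming rule from the Avro specification: a name starts with a letter or underscore and continues with letters, digits, or underscores. The helper name `is_valid_avro_name` is made up for illustration and is not part of this library.

```python
import re

# Avro spec: names start with [A-Za-z_] and continue with [A-Za-z0-9_].
_AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_avro_name(name: str) -> bool:
    """Hypothetical helper: True if `name` is a legal unqualified Avro name."""
    return _AVRO_NAME.fullmatch(name) is not None


# A class-style name passes; a human-friendly title with a space does not.
print(is_valid_avro_name("LegacyRecord"))
print(is_valid_avro_name("Legacy Record"))
```

So a Pydantic title containing spaces would have to be rejected or rewritten before it could serve as a record name.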
Hi @cbini, do you still require this?
This would be a nice-to-have in a future version. The workaround I've implemented is good enough for now, though; I haven't had much bandwidth to revisit it.
Understood. Let’s keep this open.
I’m thinking this should be controlled through an additional option flag similar to what we use for Pydantic field aliases.
This PR ^ adds the option USE_CLASS_ALIAS, similar to USE_FIELD_ALIAS. On top of that, it implements Avro name validation across the board.
I'm migrating over to using Pydantic for defining my models and this library for generating Avro schemas. It's awesome, but one challenge I'm running into is overriding record names to maintain backwards compatibility.
For example, if I have one data asset partitioned across multiple Avro files in my data lake, BigQuery cannot successfully piece them together unless each file has consistent record names.
The workaround I've found is to subclass my Pydantic model with the legacy naming convention, but I feel like it would be much better if this library considered Pydantic's model_config. That way, I could keep everything in one, nicely named class, like so:
Pydantic's BaseModel.model_json_schema() method behaves this way, but the way pas.generate() is set up, it just uses the literal class name.
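The snippet from the original post did not survive here, so what follows is only a rough sketch of the requested behaviour, not this library's implementation. A plain class with a `model_config` dict stands in for a Pydantic model so the example runs without dependencies. `resolve_record_name`, the class names, and the title value are all hypothetical.

```python
def resolve_record_name(cls: type) -> str:
    """Hypothetical helper: prefer a configured title over the class name."""
    # Assumption: the model exposes a dict-like `model_config`, as Pydantic models do.
    config = getattr(cls, "model_config", None) or {}
    return config.get("title") or cls.__name__


class CustomerOrder:
    # Stand-in for a Pydantic model; the title carries the legacy record name.
    model_config = {"title": "legacy_customer_order"}


class Untitled:
    # No title configured, so the literal class name is used.
    pass


print(resolve_record_name(CustomerOrder))
print(resolve_record_name(Untitled))
```

Falling back to the class name keeps the current behaviour for models that set no title.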