simw / pydantic-to-pyarrow

A library to convert a pydantic model to a pyarrow schema
MIT License

Proper handling of field aliases #18

Open CallumMcMahon opened 5 days ago

CallumMcMahon commented 5 days ago

First of all, thanks for making such a useful tool!

For context, I started with a JSON Schema file and then used datamodel_code_generator to create a pydantic model:

datamodel-codegen --input schema.json --output model.py --output-model-type pydantic_v2.BaseModel

The schema itself has a field named yield. Because yield is a reserved keyword in Python, datamodel_code_generator correctly converts it to yield_: <type> = Field(alias="yield").
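To illustrate why the rename is needed at all, here is a minimal sketch using only the standard library. The helper `safe_field_name` is hypothetical and is not datamodel_code_generator's actual implementation; it only shows the general idea of suffixing reserved words with an underscore.

```python
import keyword


def safe_field_name(name: str) -> str:
    """Hypothetical helper: append an underscore when the name is a Python keyword."""
    return f"{name}_" if keyword.iskeyword(name) else name


# "yield" cannot be used as an attribute name, so it gets a trailing underscore.
renamed = safe_field_name("yield")
# An ordinary name such as "price" is left untouched.
unchanged = safe_field_name("price")
```

The original name then has to be carried somewhere else, which is what the alias on the pydantic field is for.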

However, when I run pydantic-to-pyarrow to convert the resulting model into a pyarrow schema, the field in the final schema is called yield_. I would expect it to be called yield, matching the alias.
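As a rough sketch of where the expected name could come from: pydantic keeps the alias on each field's metadata, so it can be looked up per field. The model `Bond` and the helper `output_names` below are hypothetical, made up for illustration; this is not how pydantic-to-pyarrow itself is implemented.

```python
from pydantic import BaseModel, Field


class Bond(BaseModel):
    # Hypothetical model mirroring the generated field from the issue.
    yield_: float = Field(alias="yield")


def output_names(model: type[BaseModel], by_alias: bool) -> list[str]:
    """Hypothetical helper: list field names, optionally preferring aliases."""
    names = []
    for name, info in model.model_fields.items():
        # Prefer the serialization alias, then the plain alias, then the attribute name.
        alias = info.serialization_alias or info.alias
        names.append(alias if by_alias and alias else name)
    return names


attribute_names = output_names(Bond, by_alias=False)
alias_names = output_names(Bond, by_alias=True)
```

With `by_alias=False` the helper returns the Python attribute name (`yield_`); with `by_alias=True` it returns the alias (`yield`), which is the name the issue expects in the pyarrow schema.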

Thanks

simw commented 17 hours ago

Thank you for raising the issue!

Yes, until now this library completely ignored alias settings. According to the pydantic docs, Model.model_dump() takes a by_alias parameter. By default, model_dump ignores the alias, but when by_alias is set to True it uses the (serialization) alias instead. (https://docs.pydantic.dev/latest/concepts/alias/)
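For reference, the pydantic behaviour described here can be seen in a small sketch. The `Bond` model and the value 4.25 are made up for illustration; only `Field(alias=...)`, `model_validate`, and `model_dump(by_alias=...)` are pydantic v2 API.

```python
from pydantic import BaseModel, Field


class Bond(BaseModel):
    # Hypothetical model: the attribute is yield_, the alias is "yield".
    yield_: float = Field(alias="yield")


# Validation accepts the alias as the input key.
bond = Bond.model_validate({"yield": 4.25})

# Default dump: keys are the Python attribute names.
default_dump = bond.model_dump()

# With by_alias=True: keys are the (serialization) aliases.
alias_dump = bond.model_dump(by_alias=True)
```

The default dump is keyed by `yield_`, while the `by_alias=True` dump is keyed by `yield`.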

I've updated the code in PR #19 (let me know if you see any issues with it). I'm also going to check quickly whether the code just works on Python 3.13, and then I'll make a release with the new functionality.

Thanks again!

simw commented 16 hours ago

https://github.com/simw/pydantic-to-pyarrow/releases/tag/v0.1.4