ludwig-ai / ludwig

Low-code framework for building custom LLMs, neural networks, and other AI models
http://ludwig.ai
Apache License 2.0

Suggestion: transition from marshmallow to pydantic 2 #4037

Open ksbrar opened 1 week ago

ksbrar commented 1 week ago

Hey @mhabedank,

Not to add to your workload - I just noticed you closing lots of tickets and PRs recently.

Just one suggestion (which you are probably already aware of): I would strongly consider transitioning the schemas at some point from marshmallow to Pydantic 2, which should have all of the flexibility Ludwig needs. In particular, I would remove the dependency on marshmallow-jsonschema first, because it has gone unmaintained for over 2 years and has caused a bit of dependency hell, holding back further iteration. It's a tough one because it's inherent to the dataclass-centric design of the schemas (one dataclass = one JSON schema), but it's necessary IMO (though if you moved to Pydantic you would get back the dataclass-centric design anyway).
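
To make the "one dataclass = one JSON schema" idea concrete, here is a rough standard-library sketch of that mapping. It is not Ludwig's actual schema code: `TrainerConfig`, `dataclass_to_json_schema`, and the `_JSON_TYPES` table are hypothetical names made up for illustration, and only flat fields of basic types are handled. Pydantic 2 provides this kind of schema generation natively through `BaseModel.model_json_schema()`, which is why the extra marshmallow-jsonschema layer would no longer be needed.

```python
import dataclasses
from dataclasses import dataclass
from typing import get_type_hints

# Hypothetical mapping from Python types to JSON Schema type names.
_JSON_TYPES = {int: "integer", float: "number", str: "string", bool: "boolean"}


def dataclass_to_json_schema(cls) -> dict:
    """Derive a flat JSON schema from a dataclass (one dataclass = one schema)."""
    hints = get_type_hints(cls)
    properties = {}
    required = []
    for f in dataclasses.fields(cls):
        prop = {"type": _JSON_TYPES[hints[f.name]]}
        if f.default is not dataclasses.MISSING:
            # Fields with a plain default are optional and carry that default.
            prop["default"] = f.default
        elif f.default_factory is dataclasses.MISSING:
            # No default and no default factory: the field is required.
            required.append(f.name)
        properties[f.name] = prop
    return {
        "title": cls.__name__,
        "type": "object",
        "properties": properties,
        "required": required,
    }


# Hypothetical config class, not taken from Ludwig's code base.
@dataclass
class TrainerConfig:
    optimizer: str
    learning_rate: float = 0.001
    epochs: int = 100


schema = dataclass_to_json_schema(TrainerConfig)
```

Here `schema["required"]` contains only `optimizer`, since the other two fields have defaults. With Pydantic 2 the same class would subclass `BaseModel` and the schema would come from `TrainerConfig.model_json_schema()`, so the class definition stays the single source of truth.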

mhabedank commented 1 week ago

Hi @ksbrar, you mention my favorite word of the last 2 weeks: dependency. :-) We are currently working intensively on straightening out various dependencies, as they make Ludwig unusable on many systems right now. Just to name a few: torchtext, ray, numpy, etc. I'll take a look at the marshmallow/Pydantic topic. I already had the idea that the data classes should be defined language-agnostically to make third-party tools easier to implement. One question for you: would you have time to take care of this?

ksbrar commented 6 days ago

@mhabedank Unfortunately I don't have time at the moment, but I may as we get closer to the holidays.