mozilla-ai / lm-buddy

Your buddy in the (L)LM space.
Apache License 2.0

New mechanics for handling path types #91

Closed sfriedowitz closed 7 months ago

sfriedowitz commented 7 months ago

What's changing

This PR overhauls how paths are specified in the library. Instead of using config data classes, I implement a validation type called AssetPath, which checks that a string starts with one of the following prefixes:

This approach is analogous to how many other libraries specify paths, and I believe it is a more direct way of doing so. It also lets me rename the config field from load_from to path, which makes much more semantic sense in my head.
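For readers who want a concrete picture of the prefix check described above, here is a minimal standalone sketch. It is not the PR's actual AssetPath implementation: the function name `validate_asset_path` and the constant `ALLOWED_PREFIXES` are hypothetical, and the prefixes listed are assumptions borrowed from the discussion further down in this thread, so the real list may differ.

```python
# Hypothetical prefix list: assumed for illustration, not taken from the PR diff.
ALLOWED_PREFIXES = ("file://", "hf://", "s3://")


def validate_asset_path(value: str) -> str:
    """Return `value` unchanged if it starts with an allowed prefix.

    Raises TypeError for non-string input and ValueError for a string
    that does not begin with one of ALLOWED_PREFIXES.
    """
    if not isinstance(value, str):
        raise TypeError(f"Asset path must be a string, got {type(value).__name__}")
    # str.startswith accepts a tuple and matches if any entry is a prefix.
    if not value.startswith(ALLOWED_PREFIXES):
        allowed = ", ".join(ALLOWED_PREFIXES)
        raise ValueError(f"Asset path {value!r} must start with one of: {allowed}")
    return value
```

In a config class, a check like this could run as a field validator on `path`, so that a malformed path is rejected when the config is loaded rather than when a job starts.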

Follow-ups

You'll also notice that I changed the jobs' result types a bit so that they include the W&B artifacts directly. In a follow-up, I plan to centralize the artifact logging within the LMBuddy class, which should simplify much of the conditional logic currently found in the job entrypoints.

sfriedowitz commented 7 months ago

How much of this will we be able to port to the platform if we need to?

I think it will help with the platform a lot! For instance, being able to specify hf:// vs. file:// vs. s3:// will likely be how we distinguish raw HF models from internal models that we have trained.

It's good groundwork for some of those goals IMO.
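To illustrate the idea in this reply, a prefix could be used to decide where an asset comes from. The sketch below is only an assumption about how such routing might look: `split_asset_path`, `describe_source`, and the `SOURCES` mapping are hypothetical names, and nothing here reflects the library's or the platform's actual code.

```python
from typing import Tuple

# Hypothetical mapping from scheme to asset source, assumed for illustration.
SOURCES = {
    "hf": "Hugging Face Hub",
    "file": "local filesystem",
    "s3": "S3 bucket",
}


def split_asset_path(path: str) -> Tuple[str, str]:
    """Split 'scheme://rest' into (scheme, rest); raise ValueError if malformed."""
    scheme, sep, rest = path.partition("://")
    if not sep or not scheme or not rest:
        raise ValueError(f"Expected '<scheme>://<location>', got {path!r}")
    return scheme, rest


def describe_source(path: str) -> str:
    """Return a human-readable source name for a prefixed asset path."""
    scheme, _ = split_asset_path(path)
    if scheme not in SOURCES:
        raise ValueError(f"Unsupported scheme {scheme!r} in {path!r}")
    return SOURCES[scheme]
```

With this kind of routing, a raw HF model and an internally trained model stored elsewhere differ only in the prefix of the configured path.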