Closed lmmx closed 6 months ago
Looks very nice, thank you @lmmx! I am pretty sure there will be collisions with #57, so give us a bit of time to find the optimal way to merge it 🤗
In the meantime, if you have any other recommendations about this or other repos I maintain (like SimSIMD), please let us know!
Ah I just got the benchmarks to run and looks like I have made a mistake, I'm getting a warning from running the bench script (whereas no warning from the PyPI
packaged version).
Will put back to draft until I find. I'll take a look thanks Ash!
Update it was the LayerScale
class in gen_model.py
, I presumed it could be substituted for the other definition but seemingly not! ab47953 (the checkpoint at unum-cloud/uform-gen
expects a layer named weight
rather than gamma
). I put it in the same dataclass form 79872fd so at least the 2 classes are similar (maybe worth renaming them to distinguish explicitly). Returned this PR to ready for review.
Hi @lmmx! Thank you for the contribution!
We've looked into it with a team and can't merge it in its entirety right now, as it will introduce structural differences between public UForm and our private training repositories.
That said, the CI upgrades look interesting! Would you be open to reverting the structural changes and keeping just the CI?
Thanks again!
Fair enough, sure I will cancel this PR and make a fresh one (#62) :inbox_tray:
I wanted to review the code here to see how it worked, and I did some refactoring so I could read it more easily.
__all__
variables) and splits apart classes which do not need to share a module, for ease of reading individual components.ruff
and followed its guidance to avoid using*
imports so imports are also explicitruff
,toml-sort
andpyupgrade
(set to 3.7+). The 2 lines at the start will run this on CI (via https://pre-commit.ci) for the repo - this avoids requiring contributors to set this tool up