jncraton / languagemodels

Explore large language models in 512MB of RAM
https://jncraton.github.io/languagemodels/
MIT License

Feature request: Training? #18

Closed magnusviri closed 1 year ago

magnusviri commented 1 year ago

I know this is a big feature request, but you state that the target audience is learners and teachers. I've been trying to teach people AI stuff, and honestly I'm still a learner myself. People are pretty clueless about AI, so it doesn't take much AI knowledge to be the "expert" in the room.

This project would be really awesome if it could train a model or fine-tune an existing one.

Another thing that would make it awesome is if there was a way to show how the models it comes with were trained.

jncraton commented 1 year ago

I like this idea, but I'm not sure how to pull it off within this package. The compute requirements for training, or even fine-tuning, are orders of magnitude higher than for inference.

One option might be to include training for a basic n-gram language model just for learning purposes.
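
To make the n-gram idea concrete, here is a minimal sketch of what such a trainer could look like. It assumes a word-level model that splits text on whitespace and uses plain counting with no smoothing. The function names (`train_ngram`, `next_token_probs`, `generate`) are hypothetical and are not part of the languagemodels package.

```python
import random
from collections import Counter, defaultdict


def train_ngram(text, n=2):
    """'Train' by counting which token follows each (n-1)-token context."""
    tokens = text.split()  # assumption: naive whitespace tokenization
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts


def next_token_probs(counts, context):
    """Normalize the counts for one context into probabilities."""
    followers = counts.get(tuple(context), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}  # unseen context: this sketch has no smoothing or backoff
    return {token: count / total for token, count in followers.items()}


def generate(counts, prompt, n=2, max_tokens=10, seed=0):
    """Sample tokens one at a time, conditioning on the last n-1 tokens."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(max_tokens):
        context = out[max(0, len(out) - (n - 1)):]
        probs = next_token_probs(counts, context)
        if not probs:
            break  # stop when the context was never seen in training
        tokens, weights = zip(*probs.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return " ".join(out)


if __name__ == "__main__":
    model = train_ngram("the cat sat on the mat the cat ran", n=2)
    print(next_token_probs(model, ["the"]))
    print(generate(model, "the", n=2, max_tokens=5))
```

Training here is a single counting pass over the text, so it runs in well under a second on a small corpus and needs no GPU, which is what makes it practical for classroom use even though the generated text is crude.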

Other packages such as nanoGPT are probably better suited for folks wanting to experiment with training smaller LLMs.