karpathy / nanoGPT

The simplest, fastest repository for training/finetuning medium-sized GPTs.

Progressive training? #547

Open immartian opened 2 months ago

immartian commented 2 months ago

Hi @karpathy, thanks for sharing this interesting series. It is not only educational but also inspiring.

While I'm enjoying the tutorial alongside my own copycat of your code, I wonder if you can advise whether it's possible to train a model incrementally, say adding Leo Tolstoy on top of Shakespeare, rather than retraining on the combined dataset each time?

I'm asking because I wonder whether there are any new methods, different from GPTs, that can learn progressively like the human brain. If so, we could probably realize some level of baby-like intelligence without jumping straight to an LLM.

immartian commented 2 months ago

I also found some similar arguments https://www.linkedin.com/feed/update/urn:li:activity:7233335447110696960?commentUrn=urn%3Ali%3Acomment%3A%28activity%3A7233335447110696960%2C7233471273022959616%29&dashCommentUrn=urn%3Ali%3Afsd_comment%3A%287233471273022959616%2Curn%3Ali%3Aactivity%3A7233335447110696960%29 and feel it's an imminent challenge.

WeileiZeng commented 2 months ago

Sure, you can do that. You can change the data source at any epoch, and the model will learn from the new data.
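
For concreteness, in nanoGPT changing the data source mostly means preparing a second corpus with the same tokenizer, so the vocabulary stays compatible with the checkpoint you already have. A minimal sketch modeled on the repo's `data/shakespeare/prepare.py` (the `data/tolstoy/` folder and its `input.txt` are hypothetical names for the new corpus):

```python
# data/tolstoy/prepare.py -- hypothetical, modeled on data/shakespeare/prepare.py
# Encodes a new corpus with the same GPT-2 BPE tokenizer, so a model trained on
# the original data can keep training on it without any vocabulary changes.
import os
import numpy as np
import tiktoken

# the new corpus is assumed to sit next to this script as input.txt
input_file_path = os.path.join(os.path.dirname(__file__), 'input.txt')
with open(input_file_path, 'r', encoding='utf-8') as f:
    data = f.read()

# simple 90/10 train/val split
n = len(data)
train_data = data[:int(n * 0.9)]
val_data = data[int(n * 0.9):]

# encode with the GPT-2 BPE tokenizer via tiktoken
enc = tiktoken.get_encoding("gpt2")
train_ids = enc.encode_ordinary(train_data)
val_ids = enc.encode_ordinary(val_data)

# export to the uint16 .bin files that train.py memory-maps
np.array(train_ids, dtype=np.uint16).tofile(os.path.join(os.path.dirname(__file__), 'train.bin'))
np.array(val_ids, dtype=np.uint16).tofile(os.path.join(os.path.dirname(__file__), 'val.bin'))
```

One caveat on the tokenizer: if the first run used the char-level `shakespeare_char` setup, the new text would have to be encoded with that run's character vocabulary (its `meta.pkl`) instead of GPT-2 BPE.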

immartian commented 2 months ago

I guess what I'm asking is: once a model has been pretrained, can you add incremental state on top of it, i.e. keep training it on new data without starting over?

chris-aeviator commented 5 days ago

@immartian that's exactly what @WeileiZeng said: an epoch is the incremental next state you are referring to, and you can take a pre-trained model and train it for more epochs, even on different data than before. You can find some useful info on what to train in which order in the Llama 3 paper (94 pages).
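
To make the "train a pre-trained model on more data" step concrete in nanoGPT: `train.py` supports `init_from = 'resume'`, which reloads the model and optimizer state from `out_dir/ckpt.pt` and continues training on whatever `dataset` points to. Here is a sketch in the style of the repo's `config/finetune_shakespeare.py`; the file name, the `out-shakespeare` directory, and the hyperparameter values are illustrative, not recommendations:

```python
# config/continue_tolstoy.py -- hypothetical config for continued training
init_from = 'resume'         # load model + optimizer state from out_dir/ckpt.pt
out_dir = 'out-shakespeare'  # directory holding the checkpoint from the first run
dataset = 'tolstoy'          # switch to the newly prepared data/tolstoy/*.bin

# iter_num resumes from the checkpoint, so max_iters must exceed the iteration
# the first run stopped at for any additional steps to happen
max_iters = 10000

learning_rate = 3e-5         # usually lowered when continuing on new data
decay_lr = False
always_save_checkpoint = False
```

Run it as `python train.py config/continue_tolstoy.py`. One practical caveat: training only on the new corpus tends to erode what the model learned from the old one, so in practice some of the original data is often mixed back in.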