sleepingcat4 / TinyStories

Code to train a GPT-2 model on the TinyStories dataset, following the TinyStories paper
MIT License
37 stars 6 forks

Is there any chance you can help implement the source code of the TinyStories paper? #1

Open edisondeng opened 1 year ago

edisondeng commented 1 year ago

The content of the TinyStories paper is promising, but the lack of code is holding me back. Is there any chance you could implement some of the code, at least as a starting point?

Thanks in advance.

sleepingcat4 commented 1 year ago

The TinyStories paper is not a particularly code-intensive paper. If you read it, the main focus was on restricting GPT models to very low parameter counts, with much of the importance placed on the dataset. This repository shows a very basic way to configure a Hugging Face model to a given parameter count and use it on the TinyStories dataset.
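As a rough sketch of that idea (not the repository's actual code), the snippet below estimates how many parameters a GPT-2 style model has for a given configuration and then builds such a model with `GPT2Config` and `GPT2LMHeadModel` from the `transformers` library. The helper names `gpt2_param_count` and `build_small_gpt2` and the default sizes (`n_embd=64`, `n_layer=8`, `n_head=4`, `n_positions=512`) are illustrative assumptions, not values taken from the paper or this repository. The count assumes the default GPT-2 layout: an MLP four times the embedding width and an output head tied to the token embeddings.

```python
def gpt2_param_count(vocab_size, n_positions, n_embd, n_layer):
    """Estimate trainable parameters of a GPT-2 style model.

    Assumes the default GPT-2 layout: MLP width 4 * n_embd, biases on every
    linear layer, and an LM head that shares weights with the token embeddings.
    """
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Per block: attention (4*d^2 + 4*d), MLP (8*d^2 + 5*d), two LayerNorms (4*d).
    per_block = 12 * n_embd ** 2 + 13 * n_embd
    # Final LayerNorm (weight and bias).
    final_ln = 2 * n_embd
    return embeddings + n_layer * per_block + final_ln


def build_small_gpt2(n_embd=64, n_layer=8, n_head=4, n_positions=512, vocab_size=50257):
    """Build a small, randomly initialised GPT-2 (sizes are hypothetical)."""
    # Imported lazily so the estimator above works without transformers installed.
    from transformers import GPT2Config, GPT2LMHeadModel

    config = GPT2Config(
        vocab_size=vocab_size,
        n_positions=n_positions,
        n_embd=n_embd,      # must be divisible by n_head
        n_layer=n_layer,
        n_head=n_head,
    )
    return GPT2LMHeadModel(config)


if __name__ == "__main__":
    # Standard GPT-2 small, as a sanity check on the formula.
    print(gpt2_param_count(50257, 1024, 768, 12))  # -> 124439808
    # The hypothetical tiny configuration used as defaults above.
    print(gpt2_param_count(50257, 512, 64, 8))     # -> 3649216
```

With sizes this small, most of the parameters sit in the token embedding table, so shrinking the vocabulary is another lever for hitting a target budget.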

There are much more sophisticated ways to build and run a large language model, either with third-party libraries or by writing one from scratch. Which method to choose is up to you. There are certainly cool projects to be made, but that was out of scope for this repository and my time.

Which models to use and how to call them is entirely up to you, since that is as much a creative endeavor as it is research. Hope that answers your question.