Open edisondeng opened 1 year ago
The TinyStories paper is not really a code-intensive paper. If you read it, the main focus is on restricting GPT models to very low parameter counts, with most of the emphasis on the dataset. This repository shows a very basic way to define Hugging Face models with a chosen parameter count and use them on the TinyStories dataset.
There are many more complicated ways to call a large language model, both with third-party libraries and by writing one from scratch. Which method to choose is up to you. There are certainly cool projects to be built, but that was out of scope for this repository and for my time.
Which models to use and how to call them is entirely up to you, since that is a creative endeavor as much as a research one. Hope that answers your question.
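As a rough starting point, here is a minimal sketch of sizing a small model before building it. It assumes a GPT-2-style architecture from the `transformers` library as a stand-in; that is not necessarily the architecture used in the paper or in this repository. The helper names `count_gpt2_params` and `build_tiny_model`, and the example sizes, are hypothetical and chosen only for illustration.

```python
def count_gpt2_params(vocab_size, n_positions, n_embd, n_layer):
    """Parameter count of a GPT-2-style decoder with tied input/output embeddings."""
    # Token embeddings plus learned position embeddings.
    embeddings = vocab_size * n_embd + n_positions * n_embd
    # Per block: attention (4*d^2 + 4*d), MLP (8*d^2 + 5*d), two LayerNorms (4*d).
    per_block = 12 * n_embd**2 + 13 * n_embd
    # Final LayerNorm (weight and bias).
    final_norm = 2 * n_embd
    return embeddings + n_layer * per_block + final_norm


def build_tiny_model(vocab_size, n_positions, n_embd, n_layer, n_head):
    """Build an untrained GPT-2-style model (assumes transformers and torch are installed)."""
    # Imported here so the size estimate above works without these packages.
    from transformers import GPT2Config, GPT2LMHeadModel

    # n_embd must be divisible by n_head.
    config = GPT2Config(
        vocab_size=vocab_size,
        n_positions=n_positions,
        n_embd=n_embd,
        n_layer=n_layer,
        n_head=n_head,
    )
    return GPT2LMHeadModel(config)


if __name__ == "__main__":
    # Example sizes only; pick whatever fits your parameter budget.
    estimate = count_gpt2_params(vocab_size=50257, n_positions=512, n_embd=64, n_layer=8)
    print(f"estimated parameters: {estimate:,}")
```

With the default GPT-2 small sizes (vocabulary 50257, 1024 positions, width 768, 12 layers) the estimate comes out to 124,439,808, which matches the commonly cited ~124M figure, so it should be a reasonable guide when shrinking the width and depth. Note that at small widths the token embedding table dominates the count, so a smaller vocabulary may matter more than fewer layers.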
The content of the TinyStories paper is promising, but the lack of code is holding me back. Is there any chance you could help implement some of the code, at least as a starting point?
Thanks in advance.