Closed pprp closed 3 months ago
Hi @pprp
Thanks for your interest in our work. This repo contains tutorial-style code to illustrate how TTT works. No pre-trained checkpoints are loaded, so text generation results will be random. Making language models follow user input typically requires pre-training on massive data plus instruction fine-tuning; that will be part of our future work.
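To see why an untrained model produces gibberish, consider this toy sketch (not the TTT codebase; the vocabulary and function names are illustrative): with randomly initialized parameters, every next-token distribution is effectively arbitrary, so generation amounts to sampling tokens at random.

```python
import random

# Toy illustration: a "language model" with untrained parameters behaves
# like a random token sampler, so its continuations are gibberish.
VOCAB = list("abcdefghijklmnopqrstuvwxyz !,.")

def generate_untrained(prompt: str, n_tokens: int = 40, seed: int = 0) -> str:
    rng = random.Random(seed)
    # With no training signal, the next-token distribution carries no
    # information about the prompt; we model it as uniform over the vocab.
    continuation = "".join(rng.choice(VOCAB) for _ in range(n_tokens))
    return prompt + continuation

print(generate_untrained("Greeting from TTT!"))
```

Loading a pre-trained checkpoint is what replaces this uniform sampling with distributions learned from data, which is why the repo's randomly initialized demo cannot produce a coherent continuation.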
When running the provided example code for the TTT (Learning to Learn at Test Time) model, the generated output is neither coherent nor meaningful. For the prompt "Greeting from TTT!" the expected output would be a relevant, sensible continuation of the text, but the actual output is gibberish: a mix of characters and words that do not form a logical sequence.
We got: