Closed noamgat closed 6 months ago
This PR brings the testing flow closer to real-world usage, including a Llama2 tokenizer with challenging real-world cases. It also lets the unit tests serve as a better benchmark for profiling and improving performance.