ml-explore / mlx-examples

Examples in the MLX framework
MIT License

Clear cache during prompt processing #1027

Closed awni closed 1 month ago

awni commented 1 month ago

Closes #1025; see that for the discussion and the improvement.

awni commented 1 month ago

Yeah, I didn't see any difference. (And of course it's faster for very long prompts, since there isn't nearly as much memory pressure.)
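
For readers without the diff at hand, here is a minimal sketch of the idea in the title: process a long prompt in fixed-size chunks and release cached buffers after each one, so the allocator's cache does not grow across the whole prefill. This is not the code from this PR. `prefill_in_chunks`, `process_chunk`, `clear_cache`, and `step_size` are hypothetical names chosen for illustration, and the sketch is framework-free; in MLX the `clear_cache` hook would presumably be the framework's own cache-clearing call (for example `mx.metal.clear_cache()`), which is an assumption here.

```python
from typing import Callable, Sequence


def prefill_in_chunks(
    prompt: Sequence[int],
    process_chunk: Callable[[Sequence[int]], None],
    clear_cache: Callable[[], None],
    step_size: int = 512,
) -> Sequence[int]:
    """Run all but the final chunk of `prompt` through `process_chunk`.

    `clear_cache` is called after every processed chunk. The unprocessed
    tail (at most `step_size` tokens) is returned so the caller can use it
    to start generation.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")

    remaining = prompt
    # Stop once the remainder fits in a single step.
    while len(remaining) > step_size:
        process_chunk(remaining[:step_size])  # hypothetical model forward pass
        clear_cache()                         # hypothetical hook: free cached buffers
        remaining = remaining[step_size:]
    return remaining
```

The trade-off is the one described above: clearing after each chunk keeps memory pressure down on very long prompts, at the cost of one extra call per chunk.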