Closed awni closed 1 week ago
Compare the following matmul on the CPU and the GPU. The GPU result contains 0/inf, while the CPU result looks more or less correct.
import mlx.core as mx

s1 = (4, 6479, 2048)
s2 = (256000, 2048)
a = mx.random.uniform(shape=s1)
b = mx.random.uniform(shape=s2)
mx.eval(a, b)
c = a @ b.T
This is causing NaNs in LoRA training with Gemma: https://github.com/ml-explore/mlx-examples/issues/620
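For a sense of scale, here is a rough sketch (standard library only, not part of the original report) that works out how many elements and bytes the operands and the result of `a @ b.T` occupy for the shapes above. The helper `matmul_out_shape` and the 4-bytes-per-element figure are assumptions for illustration: the latter presumes the arrays are float32. Whether the size of the output has anything to do with the GPU result is not established in this issue; this only quantifies the repro.

```python
from math import prod

# Shapes taken from the repro above.
s1 = (4, 6479, 2048)
s2 = (256000, 2048)

# Assumption: float32 arrays, i.e. 4 bytes per element.
BYTES_PER_ELEMENT = 4
INT32_MAX = 2**31 - 1


def matmul_out_shape(a_shape, b_shape):
    """Hypothetical helper: shape of a @ b.T for a batched a and a 2-D b."""
    assert a_shape[-1] == b_shape[-1], "inner dimensions must match"
    return (*a_shape[:-1], b_shape[0])


out_shape = matmul_out_shape(s1, s2)

for name, shape in (("a", s1), ("b", s2), ("c", out_shape)):
    n = prod(shape)
    gib = n * BYTES_PER_ELEMENT / 2**30
    print(f"{name}: shape={shape} elements={n:,} "
          f"size={gib:.2f} GiB exceeds_int32={n > INT32_MAX}")
```

The result `c` has shape (4, 6479, 256000), which is 6,634,496,000 elements. That is more than a signed or unsigned 32-bit index can address, so it may be worth checking whether the problem still reproduces with a smaller second dimension in `s1` or a smaller first dimension in `s2`.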