Mistral test failed on M3 Max #58

Open thegodone opened 10 months ago

thegodone commented 10 months ago
FAIL: test_generate (__main__.TestMistral)
Traceback (most recent call last):
  File "/Users/ethangodin/Github/mlx-examples/mistral/", line 76, in test_generate
    self.assertEqual(tokens, expected)
AssertionError: Lists differ: [302, 272, 16762, 9588, 12807, 2867, 2135, 28723, 851[116 chars] 297] != [302, 272, 11843, 11837, 1587, 28723, 851, 349, 865, [108 chars], 13]

First differing element 2:

-  16762,
-  9588,
+  11843,
+  11837,
-  12807,
?   ^ -

+  1587,
?   ^

-  2867,
-  2135,
+  3381,
+  456,
+  654,
-  1014,
-  16762,
-  9588,
-  12807,
-  2867,
-  2135,
-  325,
-  28749,
-  8340,
-  28731,
-  403,
-  1587,
-  297]
+  1353,
+  11843,
+  28725,
+  368,
+  682,
+  347,
+  2240,
+  767,
+  298,
+  511,
+  28723,
+  13]

Ran 2 tests in 20.792s

FAILED (failures=1)

python configuration:

thegodone commented 10 months ago

But the model still generates reasonable text:

python --prompt "It is a truth universally acknowledged,"  --temp 0
[INFO] Loading model from disk.
[INFO] Starting generation...
It is a truth universally acknowledged, that a single man in possession of a good salary, must be in want of a wife.

Or at least, that’s what the 19th century novelist Jane Austin thought.

But what about the single woman?

In the 19th century, the single woman was often seen as a burden on society. She was often seen as a burden on her family, and was often seen as a burden on the state.

The single woman was

awni commented 10 months ago

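Interesting. This could just be due to small numerical differences: the test compares exact token IDs, so a tiny change in the logits can flip one greedy pick and change every token after it. We'll have to do more extensive testing on an M3, since most of our testing so far was on an M1 or M2.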
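
To illustrate how a small numerical difference could produce the failure above, here is a minimal sketch in plain Python. The logit values are made up for illustration (they are not measured from Mistral or MLX), and `argmax`, `greedy_tokens_match`, and `logits_close` are hypothetical helper names, not part of the test suite. With `--temp 0` generation is greedy, so when two candidate tokens are nearly tied, a perturbation in the last few digits can change which one is picked. Every later token is conditioned on that pick, which would be consistent with the lists diverging from element 2 onward.

```python
import math


def argmax(values):
    """Index of the largest value (the first one wins on exact ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def greedy_tokens_match(logits_a, logits_b):
    """Exact check: do both logit vectors select the same greedy token?"""
    return argmax(logits_a) == argmax(logits_b)


def logits_close(logits_a, logits_b, rel_tol=1e-3, abs_tol=1e-3):
    """Tolerance check: are the two logit vectors numerically close?"""
    return len(logits_a) == len(logits_b) and all(
        math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)
        for a, b in zip(logits_a, logits_b)
    )


# Made-up logits for a 4-token vocabulary; tokens 1 and 2 are nearly tied.
logits_ref = [0.10, 5.0002, 5.0001, -1.30]
# The same logits with a 1e-4 perturbation, standing in for another chip.
logits_other = [0.10, 5.0001, 5.0002, -1.30]

print("greedy pick (ref):  ", argmax(logits_ref))
print("greedy pick (other):", argmax(logits_other))
print("same token?         ", greedy_tokens_match(logits_ref, logits_other))
print("logits close?       ", logits_close(logits_ref, logits_other))
```

In this toy case the logits agree within tolerance but the greedy tokens differ. If that is what is happening on the M3, comparing logits (or only a short token prefix) with a tolerance might be a more robust test than exact equality on a long generated sequence, though that is a suggestion rather than something confirmed here.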