ml-explore / mlx-examples

Examples in the MLX framework
MIT License

mistral test failed on M3 max #58

Open thegodone opened 11 months ago

thegodone commented 11 months ago
======================================================================
FAIL: test_generate (__main__.TestMistral)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/Users/ethangodin/Github/mlx-examples/mistral/test.py", line 76, in test_generate
    self.assertEqual(tokens, expected)
AssertionError: Lists differ: [302, 272, 16762, 9588, 12807, 2867, 2135, 28723, 851[116 chars] 297] != [302, 272, 11843, 11837, 1587, 28723, 851, 349, 865, [108 chars], 13]

First differing element 2:
16762
11843

  [302,
   272,
-  16762,
-  9588,
+  11843,
+  11837,
-  12807,
?   ^ -

+  1587,
?   ^

-  2867,
-  2135,
   28723,
   851,
   349,
   865,
   264,
   1369,
   28723,
   13,
   13,
+  3381,
+  456,
+  654,
-  1014,
-  16762,
-  9588,
-  12807,
-  2867,
-  2135,
-  325,
-  28749,
-  8340,
-  28731,
-  403,
   264,
-  1587,
-  297]
+  1353,
+  11843,
+  28725,
+  368,
+  682,
+  347,
+  2240,
+  767,
+  298,
+  511,
+  28723,
+  13]

----------------------------------------------------------------------
Ran 2 tests in 20.792s

FAILED (failures=1)

python configuration:

Package                       Version
----------------------------- ---------
appnope                       0.1.2
argon2-cffi                   21.1.0
async-generator               1.10
attrs                         21.2.0
backcall                      0.2.0
backports.functools-lru-cache 1.6.4
bleach                        4.1.0
brotlipy                      0.7.0
certifi                       2021.10.8
cffi                          1.15.0
charset-normalizer            2.0.9
colorama                      0.4.4
conda                         4.11.0
conda-package-handling        1.7.3
cryptography                  36.0.0
debugpy                       1.5.1
decorator                     5.1.0
defusedxml                    0.7.1
entrypoints                   0.3
filelock                      3.13.1
fsspec                        2023.12.1
idna                          3.1
importlib-metadata            4.9.0
importlib-resources           5.4.0
ipykernel                     6.6.0
ipython                       7.30.1
ipython-genutils              0.2.0
ipywidgets                    7.6.5
jedi                          0.18.1
Jinja2                        3.0.3
jsonschema                    4.3.1
jupyter                       1.0.0
jupyter-client                7.1.0
jupyter-console               6.4.0
jupyter-core                  4.9.1
jupyterlab-pygments           0.1.2
jupyterlab-widgets            1.0.2
MarkupSafe                    2.0.1
matplotlib-inline             0.1.3
mistune                       0.8.4
mlx                           0.0.4
mpmath                        1.3.0
nbclient                      0.5.9
nbconvert                     6.3.0
nbformat                      5.1.3
nest-asyncio                  1.5.4
networkx                      3.2.1
notebook                      6.4.6
numpy                         1.26.2
packaging                     21.3
pandocfilters                 1.5.0
parso                         0.8.3
pexpect                       4.8.0
pickleshare                   0.7.5
pip                           21.3.1
prometheus-client             0.12.0
prompt-toolkit                3.0.24
ptyprocess                    0.7.0
pycosat                       0.6.3
pycparser                     2.21
Pygments                      2.10.0
pyOpenSSL                     21.0.0
pyparsing                     3.0.6
pyrsistent                    0.18.0
PySocks                       1.7.1
python-dateutil               2.8.2
pyzmq                         22.3.0
requests                      2.26.0
ruamel-yaml-conda             0.15.80
Send2Trash                    1.8.0
sentencepiece                 0.1.99
setuptools                    59.4.0
six                           1.16.0
sympy                         1.12
terminado                     0.12.1
testpath                      0.5.0
torch                         2.1.1
tornado                       6.1
tqdm                          4.62.3
traitlets                     5.1.1
typing_extensions             4.9.0
urllib3                       1.26.7
wcwidth                       0.2.5
webencodings                  0.5.1
wheel                         0.37.0
widgetsnbextension            3.5.2
zipp                          3.6.0

thegodone commented 11 months ago

But the model still generates sensible text:

python mistral.py --prompt "It is a truth universally acknowledged,"  --temp 0
[INFO] Loading model from disk.
[INFO] Starting generation...
It is a truth universally acknowledged, that a single man in possession of a good salary, must be in want of a wife.

Or at least, that’s what the 19th century novelist Jane Austin thought.

But what about the single woman?

In the 19th century, the single woman was often seen as a burden on society. She was often seen as a burden on her family, and was often seen as a burden on the state.

The single woman was

awni commented 11 months ago

Interesting... This could just be due to small numerical differences. We'll have to do more extensive testing on an M3, as most of our testing so far was on an M2 or M1.
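
To illustrate how small numerical differences could produce this kind of failure, here is a minimal sketch. It assumes the test decodes greedily, taking the argmax of the logits at each step. The logit values and both helper names (`argmax`, `first_divergence`) are hypothetical and not taken from `mistral.py` or `test.py`. The two token lists are only the visible prefixes from the assertion message above.

```python
def argmax(values):
    """Return the index of the largest value (first index wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def first_divergence(a, b):
    """Return the first index where two token lists differ, or None if equal."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # No mismatch in the shared prefix: lists differ only if lengths differ.
    return None if len(a) == len(b) else min(len(a), len(b))


# Hypothetical logits for one decoding step where two candidates are nearly tied.
logits_a = [0.10, 2.5000, 2.4999, -1.0]
# The same step with a rounding-sized change to one entry.
logits_b = [0.10, 2.5000, 2.5001, -1.0]

pick_a = argmax(logits_a)  # candidate 1 wins
pick_b = argmax(logits_b)  # candidate 2 wins after the tiny perturbation

# Visible prefixes of the generated and expected token lists from the report.
generated = [302, 272, 16762, 9588, 12807, 2867, 2135, 28723, 851]
expected = [302, 272, 11843, 11837, 1587, 28723, 851, 349, 865]
diverge_at = first_divergence(generated, expected)

print(pick_a, pick_b, diverge_at)
```

Once a single step picks a different token, every later step is conditioned on a different prefix, so the rest of the sequence can legitimately differ even though both outputs are fluent. An exact token-by-token comparison is therefore sensitive to hardware-level rounding in a way that the sample generation above is not.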