jax-ml / ml_dtypes

A stand-alone implementation of several NumPy dtype extensions used in machine learning.
Apache License 2.0

Deserialization of float8_e5m2 #148

Open wonjeon opened 4 months ago

wonjeon commented 4 months ago

I tried the following code snippet, and it doesn't seem to work. Is this an already known issue?


>>> import numpy as np
>>> from ml_dtypes import float8_e5m2
>>> a = float8_e5m2(1.5)
>>> np.save("a.npy", a)
>>> b = np.load("a.npy")
Traceback (most recent call last):
  File "/home/wonjeo01/.local/lib/python3.10/site-packages/numpy/lib/format.py", line 640, in _read_array_header
    dtype = descr_to_dtype(d['descr'])
  File "/home/wonjeo01/.local/lib/python3.10/site-packages/numpy/lib/format.py", line 309, in descr_to_dtype
    return numpy.dtype(descr)
TypeError: data type '<f1' not understood

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/wonjeo01/.local/lib/python3.10/site-packages/numpy/lib/npyio.py", line 432, in load
    return format.read_array(fid, allow_pickle=allow_pickle,
  File "/home/wonjeo01/.local/lib/python3.10/site-packages/numpy/lib/format.py", line 765, in read_array
    shape, fortran_order, dtype = _read_array_header(
  File "/home/wonjeo01/.local/lib/python3.10/site-packages/numpy/lib/format.py", line 643, in _read_array_header
    raise ValueError(msg.format(d['descr'])) from e
ValueError: descr is not a valid dtype descriptor: '<f1'

jakevdp commented 4 months ago

Thanks - yeah this is a known issue (similar to what's reported in https://github.com/google/jax/discussions/8494).

Unfortunately, NumPy's serialization only recognizes NumPy's built-in dtypes, and NumPy currently offers no way to extend that. The best workaround for the time being is to save the raw bits as uint8 and view them as float8_e5m2 again after loading, something like this:

>>> np.save('a.npy', a.view('uint8'))
>>> np.load('a.npy').view(float8_e5m2)
array(1.5, dtype='float8_e5m2')
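
The same workaround can be wrapped in a pair of small helpers. This is only a sketch: `save_bits` and `load_bits` are hypothetical names, not part of NumPy or ml_dtypes, and the code assumes the dtype's item size is 1, 2, 4, or 8 bytes so that an unsigned integer type of the same width exists.

```python
import numpy as np


def save_bits(file, arr):
    """Save `arr` as raw bit patterns (hypothetical helper, not a library API).

    The data is reinterpreted as an unsigned integer dtype of the same item
    size, which np.save can always describe in the .npy header.
    Assumes the item size is 1, 2, 4, or 8 bytes.
    """
    arr = np.asarray(arr)
    bits_dtype = np.dtype(f"u{arr.dtype.itemsize}")  # e.g. uint8 for a 1-byte float
    np.save(file, arr.view(bits_dtype))


def load_bits(file, dtype):
    """Load an array written by save_bits and reinterpret it as `dtype`.

    The caller must supply the original dtype, because the file only
    records the unsigned integer type.
    """
    return np.load(file).view(dtype)
```

With ml_dtypes imported, the example from this issue would become `save_bits("a.npy", a)` followed by `load_bits("a.npy", float8_e5m2)`. Note that the file itself does not remember that the data was float8_e5m2, so the dtype has to be tracked separately.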

wonjeon commented 4 months ago

@jakevdp Thanks for your response and the information on the workaround. Confirmed that it works.