Closed adamreichold closed 1 year ago
Worked great! This was a very painless way to parse fixed-size ASCII and Unicode strings into numpy arrays. Thank you for doing all the work that I had aimed to do, it is really appreciated.
One small nit: it took me a while to understand that the PyASCIIArray
type was a newtype around [Py_UCS1; N]
which is the same as [u8; N]
. For some reason I thought it was a specialization of the PyArray
type, but it seems fairly different to me (feel free to disagree).
I couldn't think of any better names, but maybe one of:
NPYString
and NPYUnicode
(with a comment marking those as Py_UCS1 / S
or Py_UCS4 / U
dtypes)NPYFixedString
and NPYFixedUnicode
to make it more explicit that these are fixed-length dtypes?I thought about PyASCIIArray1
, but think that's a bad idea because Array
in this crate should refer to numpy arrays, but here we're talking about dtypes. For example, here's a numpy array:
In [5]: x = np.array(['a', 'b', 'c', 'd'], dtype='S1').reshape((2, 2))
In [6]: x
Out[6]:
array([[b'a', b'b'],
[b'c', b'd']], dtype='|S1')
and I think the precise way to refer to this is a (2, 2) numpy array of dtype S1, so the way to construct that should be explicit in rust-numpy.
Anyway, thanks for all the help!
One small nit: it took me a while to understand that the PyASCIIArray type was a newtype around [Py_UCS1; N] which is the same as [u8; N]
That is literally the first sentence of its docs. 😉 But I agree, the names are somewhat misleading as these not related to PyArray
at all. I changed them to PyFixedString
and PyFixedUnicode
to stay within our existing naming scheme but emphasize that these are strings with a fixed capacity.
@kngwyu I think this is ready to merge as I have just branched off the release-0.19
maintenance branch. Could you have another look? Thank you!
This supports Unicode arrays whose element length is know at compile time.
Closes #141