p2p-ld / numpydantic

Type annotations for specifying, validating, and serializing arrays with arbitrary backends in Pydantic (and beyond)
https://numpydantic.readthedocs.io/
MIT License
66 stars, 1 fork

`str` dtype checking is bad #4

Closed: sneakers-the-rat closed this 3 months ago

sneakers-the-rat commented 3 months ago

strings are the fun ones huh.

numpy represents strings as fixed-width dtypes like `<U{n}`, i.e. each one has a specific length, so you can't just check `isinstance(..., str)` or `isinstance(..., np.str_)` (and i'm not really sure `np.str_` is a dtype anyway).
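A minimal sketch of the fixed-width behaviour, using plain numpy only. `is_numpy_str_dtype` is a hypothetical helper made up for illustration, not part of numpydantic; it shows one possible length-independent check rather than the fix this issue ended up with.

```python
import numpy as np


def is_numpy_str_dtype(dtype) -> bool:
    """Hypothetical helper: True for any numpy unicode dtype, whatever its width."""
    return np.issubdtype(np.dtype(dtype), np.str_)


short = np.array(["a", "bc"])
long = np.array(["a", "bcdefgh"])

# The width of the longest element is baked into the dtype,
# so two string arrays generally do not share a dtype.
print(short.dtype, long.dtype)
print(short.dtype == long.dtype)

# Comparing against the width-less str dtype does not match either.
print(short.dtype == np.dtype(str))

# Checking the dtype's scalar type hierarchy ignores the width.
print(is_numpy_str_dtype(short.dtype), is_numpy_str_dtype(long.dtype))
```

Checking `dtype.kind == "U"` would be an equivalent way to ignore the width.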

Additionally, zarr represents strings differently: as object arrays encoded with numcodecs.
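To keep this runnable without zarr installed, the sketch below only imitates the object case with a plain numpy array of dtype `object`; it assumes that is what a validator would see for such an array. `is_object_str_array` is a hypothetical helper, not an existing numpydantic or zarr function.

```python
import numpy as np


def is_object_str_array(arr: np.ndarray) -> bool:
    """Hypothetical helper: True if an object-dtype array holds only str items."""
    if arr.dtype.kind != "O":
        return False
    # The dtype says nothing about the contents, so inspect each element.
    return all(isinstance(item, str) for item in arr.flat)


obj_strings = np.array(["a", "bcd"], dtype=object)
mixed = np.array(["a", 1], dtype=object)

print(obj_strings.dtype)                 # object, with no string width at all
print(is_object_str_array(obj_strings))
print(is_object_str_array(mixed))
```

The element-wise scan is linear in the array size, so a real validator might want to sample or make it optional.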

Need to improve dtype validation and also make the validation hooks a bit finer-grained.