blaze / datashape

Language defining a data description protocol
BSD 2-Clause "Simplified" License

Remove (or hide) object type? #120

Open mrocklin opened 9 years ago

mrocklin commented 9 years ago

Object type isn't seeing any love and, at least in Blaze, I find it useful to interpret numpy dtype 'O' as string. There are trade-offs here. Thought I'd ask for discussion.

@mwiebe thoughts?

mwiebe commented 9 years ago

You could adopt the h5py custom 'O' dtype for this: np.dtype('O', metadata={'vlen': str}). This is what dynd converts strings to when going to numpy.

Dealing with numpy's dtype metadata is a bit funny, though: it doesn't participate in equality, so you have to check it manually.
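
To illustrate the equality caveat, here is a minimal sketch. It builds the tagged dtype with plain numpy rather than h5py, and the helper name `is_vlen_str` is hypothetical, not part of numpy, h5py, dynd, or datashape.

```python
import numpy as np

# Object dtype tagged the way described above (built directly, no h5py needed).
tagged = np.dtype('O', metadata={'vlen': str})
plain = np.dtype('O')

# Metadata is ignored by ==, so the two dtypes compare equal.
same = (tagged == plain)


def is_vlen_str(dt):
    """Hypothetical helper: check the 'vlen' metadata tag by hand."""
    dt = np.dtype(dt)
    meta = dt.metadata  # None when the dtype carries no metadata
    return dt.kind == 'O' and meta is not None and meta.get('vlen') is str
```

Because `tagged == plain` holds, any code that wants to treat the tagged dtype as a string type has to go through a check like `is_vlen_str(arr.dtype)` instead of comparing dtypes.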

mrocklin commented 9 years ago

I'm more concerned about the case where someone gives me a numpy array and I'm asked to discover its datashape. My concern here is somewhat abated by adding a heuristic to the discover(np.ndarray) function in #121.
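
One possible shape for such a heuristic is sketched below. This is an assumption for illustration only, not the code from #121: the function name `guess_object_measure` is hypothetical, and it returns plain labels rather than datashape types.

```python
import numpy as np


def guess_object_measure(arr):
    """Hypothetical heuristic: treat an object array as strings
    only if every element is a str; otherwise keep it as object."""
    arr = np.asarray(arr)
    if arr.dtype.kind != 'O':
        raise TypeError("expected an object-dtype array")
    # An empty array gives no evidence either way, so fall back to 'object'.
    if arr.size and all(isinstance(x, str) for x in arr.flat):
        return 'string'
    return 'object'
```

Scanning every element is linear in the array size; a real implementation might inspect only a sample, at the cost of occasionally misclassifying mixed arrays.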