asfimport opened this issue 5 years ago (status: Open)
Simeon H.K. Fitch: This GitHub issue describes the desired end state:
https://github.com/apache/arrow/issues/4802
This feature is important for PySpark users who want to construct tensors to feed to ML libraries such as Keras via pandas_udfs.
Antoine Pitrou / @pitrou: Nested Python lists are now inferred correctly, but we still lack inference for nested ndarrays with "object" dtype.
Joris Van den Bossche / @jorisvandenbossche:
While `pa.array(..)` does not yet support converting a 2-d ndarray to a fixed-size list array, you can do the conversion manually:
In [39]: arr = np.arange(30).reshape(10, 3)
In [40]: arr
Out[40]:
array([[ 0, 1, 2],
[ 3, 4, 5],
[ 6, 7, 8],
...
In [41]: pa.array(arr)
...
ArrowInvalid: only handle 1-dimensional arrays
In [42]: pa.FixedSizeListArray.from_arrays(arr.ravel(order="C"), arr.shape[1])
Out[42]:
<pyarrow.lib.FixedSizeListArray object at 0x7f4025471040>
[
[
0,
1,
2
],
[
3,
4,
5
],
[
6,
7,
8
],
...
Joris Van den Bossche / @jorisvandenbossche:
> Nested Python lists are now inferred correctly, but we still lack inference for nested ndarrays with "object" dtype.

@pitrou that actually seems to work?
In [56]: arr = np.arange(30).reshape(10, 3)
In [57]: arr = np.array(pd.Series(list(arr), dtype=object))
In [58]: arr
Out[58]:
array([array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8]),
array([ 9, 10, 11]), array([12, 13, 14]), array([15, 16, 17]),
array([18, 19, 20]), array([21, 22, 23]), array([24, 25, 26]),
array([27, 28, 29])], dtype=object)
In [59]: pa.array(arr)
Out[59]:
<pyarrow.lib.ListArray object at 0x7f406db85b20>
[
[
0,
1,
2
],
[
3,
4,
5
],
[
6,
7,
8
],
...
Antoine Pitrou / @pitrou: Yes, it's quite possible that it would work now.
Can confirm this is still broken: the given "working" example is a 1-d numpy array of dtype object, not a true 2-d array of dtype int.
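The distinction this comment draws can be checked directly: the "working" input is one-dimensional with object dtype, while the still-unsupported input is a genuine 2-d integer array (a sketch; the variable names are mine):

```python
import numpy as np

# A true 2-d integer array: this is the case the issue tracks.
true2d = np.arange(30).reshape(10, 3)

# A 1-d object array of row ndarrays: this is the case that already works.
obj1d = np.empty(len(true2d), dtype=object)
obj1d[:] = list(true2d)

print(true2d.ndim)  # 2
print(obj1d.ndim)   # 1
print(obj1d.dtype)  # object
```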
Follow-up work to ARROW-4350
Reporter: Wes McKinney / @wesm
Related issues:
Note: This issue was originally created as ARROW-5645. Please see the migration documentation for further details.