Open asfimport opened 5 years ago
Antoine Pitrou / @pitrou: Since it's ambiguous, I'm not sure it's a good idea to support it. The working inference case for list arrays is a list of lists:
>>> pa.array([[1,2,3],[4,5]])
<pyarrow.lib.ListArray object at 0x7f114319eb38>
[
  [
    1,
    2,
    3
  ],
  [
    4,
    5
  ]
]
Joris Van den Bossche / @jorisvandenbossche: Yes, I understand the "ambiguous" reason, but on the other hand, StructArray is not really an option as default since for that the struct names need to be known.
Doing it automatically would make it possible to save such dataframes to Parquet out of the box (see ARROW-4814), but of course you can always specify the schema manually.
In general, it would be nice to have an error message that points people towards specifying a list or struct type if they have tuples as data. But I assume this is not that easy, as the error message looks like a generic one where the value and type are filled in.
Arrays of tuples can be converted to either ListArray or StructArray if you specify the type explicitly:
But not when no type is specified:
Do we want to do automatic type inference for tuples as well? (defaulting to the ListArray case, just as arrays of python lists are supported) Or was there a specific reason to not support this by default?
Reporter: Joris Van den Bossche / @jorisvandenbossche
Note: This issue was originally created as ARROW-5287. Please see the migration documentation for further details.