apache / arrow

Apache Arrow is the universal columnar format and multi-language toolbox for fast data interchange and in-memory analytics
https://arrow.apache.org/
Apache License 2.0

[C++] Add max length option to PrettyPrintOptions #30848

Open asfimport opened 2 years ago

asfimport commented 2 years ago

Some pretty-printed output, especially for chunked or nested arrays, can be very long even with reasonable window settings. We should have a way to set a target maximum length for the output.

A half-measure was taken with ARROW-15329, which truncates the pretty-printed output, but that doesn't handle string columns well when the string values contain delimiters.
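To make the request concrete, here is a minimal sketch of a length budget applied while elements are being rendered, rather than to the finished text. It uses only the C++ standard library; `PrintWithBudget` and its `max_chars` parameter are hypothetical names for illustration and are not part of Arrow's `PrettyPrintOptions`. Because each value is either emitted whole or omitted, a delimiter inside a string value cannot confuse the truncation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not an Arrow API: renders values as ["a", "b", ...]
// and stops adding elements before the result would exceed max_chars.
std::string PrintWithBudget(const std::vector<std::string>& values,
                            std::size_t max_chars) {
  const std::string ellipsis = "...";
  // The smallest possible truncated output is "[...]".
  assert(max_chars >= ellipsis.size() + 2);

  std::string out = "[";
  for (std::size_t i = 0; i < values.size(); ++i) {
    std::string piece = (i == 0) ? "" : ", ";
    piece += '"' + values[i] + '"';

    // Space that must remain after this element: just "]" for the last
    // element, otherwise ", ...]" in case the next element does not fit.
    const bool is_last = (i + 1 == values.size());
    const std::size_t reserve = is_last ? 1 : 2 + ellipsis.size() + 1;

    if (out.size() + piece.size() + reserve > max_chars) {
      // Whole elements only: never cut a value in the middle.
      out += (i == 0) ? ellipsis : ", " + ellipsis;
      out += "]";
      return out;
    }
    out += piece;
  }
  out += "]";
  return out;
}
```

With the values `a, b`, `c`, and `d`, a budget of 20 characters prints all three, while a budget of 14 prints `["a, b", ...]`: the embedded comma in the first value has no effect on where the output is cut.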

Reporter: Will Jones / @wjones127

Note: This issue was originally created as ARROW-15363. Please see the migration documentation for further details.

asfimport commented 2 years ago

Joris Van den Bossche / @jorisvandenbossche: See also my comment at https://github.com/apache/arrow/pull/12091#issuecomment-1016192978

Another option to consider is to have a max length on the "scalar" level instead of the array level (in practice, this might only be relevant/necessary for variable-sized types, i.e. binary and string; I think all other non-nested types have a fixed or maximum length).
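A scalar-level limit could look roughly like the sketch below, which caps a single rendered string or binary value before it is placed in the array output. `TruncateScalar` and `max_len` are hypothetical names chosen for illustration, not existing Arrow functions or options, and the limit is counted in bytes, so a real implementation would probably need to avoid splitting a multi-byte UTF-8 sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not an Arrow API: caps one rendered value at
// max_len bytes, marking the cut with a trailing "...".
std::string TruncateScalar(const std::string& value, std::size_t max_len) {
  const std::string ellipsis = "...";
  std::string out;
  if (value.size() <= max_len) {
    out = value;  // short enough, keep as-is
  } else if (max_len <= ellipsis.size()) {
    out = ellipsis.substr(0, max_len);  // no room for any content
  } else {
    out = value.substr(0, max_len - ellipsis.size()) + ellipsis;
  }
  assert(out.size() <= max_len);  // postcondition: limit is respected
  return out;
}
```

For example, `TruncateScalar("hello, world", 8)` yields `hello...`, while a value already within the limit is returned unchanged. Fixed-width types would not need this step, which matches the observation that only string and binary values are likely to be affected.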

asfimport commented 2 years ago

Will Jones / @wjones127: I like that idea; I hadn't been testing with large strings / binary, but that makes sense.