Open anntzer opened 5 years ago
This is one case that the Wadler-Leijen layout algorithm does not handle well. Here's an excerpt from the docs of a prettyprinter library in Haskell that uses the same layout algorithm:
If this were implemented, it would have to be special-cased to values for which we know the printed width easily without doing the actual rendering, such as the numbers here (len(repr(1.23)) is pretty cheap). That would allow the pretty printer defined in the extra to manually calculate the spaces needed to align the columns.
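As a rough illustration of the width calculation described above, here is a minimal sketch that pads each cell to the widest repr in its column. The function name align_matrix is hypothetical and nothing here uses prettyprinter's layout primitives; it only shows that the column widths can be derived from len(repr(value)) without rendering anything.

```python
def align_matrix(rows):
    """Format a rectangular matrix of numbers/bools with right-aligned columns.

    Hypothetical helper for illustration; not part of prettyprinter or numpy.
    """
    # repr() of a number or bool is cheap, so the printed width is known up front.
    cells = [[repr(value) for value in row] for row in rows]
    # Each column is as wide as its widest cell.
    widths = [max(len(cell) for cell in column) for column in zip(*cells)]
    lines = []
    for row in cells:
        padded = ", ".join(cell.rjust(width) for cell, width in zip(row, widths))
        lines.append("[" + padded + "]")
    # One leading space on continuation lines keeps rows under the first "[".
    return "[" + ",\n ".join(lines) + "]"


print(align_matrix([[1.23, 10, 3], [4, 5.5, 600]]))
```

The sketch deliberately ignores any maximum line length: every row stays on one line, however wide it gets.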
Taking the repr of the value from numpy could work, but it would also leave the repr'd part uncolored, because this library achieves colored output by annotating layout primitives with the kind of syntax element each one represents (and then applying a theme to decide the final output color). pretty_call and other functions provided by the library do that automatically, but a repr string does not know which parts of the string should be displayed in number colors and which in punctuation colors (commas, etc.). This is why I'd like to avoid that workaround in the numpy extra in this lib.
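To make the coloring argument concrete, here is a toy model, not prettyprinter's actual API: the names THEME, RESET, and render are made up for illustration. Each piece of text carries the kind of syntax element it represents, and a theme maps kinds to colors. A plain repr string arrives as one unannotated piece, so the theme has nothing to act on.

```python
# Toy model only: THEME, RESET and render are hypothetical names,
# not part of prettyprinter.
THEME = {"number": "\033[36m", "punctuation": "\033[90m"}
RESET = "\033[0m"


def render(tokens, theme):
    """Render (kind, text) pairs; a kind the theme does not know stays uncolored."""
    out = []
    for kind, text in tokens:
        color = theme.get(kind)
        out.append(f"{color}{text}{RESET}" if color else text)
    return "".join(out)


# Annotated pieces: the theme can color numbers and punctuation differently.
annotated = [
    ("punctuation", "["),
    ("number", "1.5"),
    ("punctuation", ", "),
    ("number", "2.0"),
    ("punctuation", "]"),
]
# A repr string is a single opaque piece with no annotation at all.
opaque = [(None, "[1.5, 2.0]")]

print(render(annotated, THEME))
print(render(opaque, THEME))
```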
But I do think this should be possible to implement for this specific context (matrices of bools/numbers). Rendering matrices will have to ignore the main thing the layout algorithm in this library optimizes for, namely staying within a maximum line length, but for matrices alignment is the more useful goal, and it would also produce prettier output. The implementation does need a bit of knowledge about the internal layout primitives to do properly, though. I'll keep this feature in mind next time I spend time hacking on this lib :)
In the meantime, it probably makes sense for you to override the matrix prettyprinter in your project to use the workaround you're using now. I believe you can fix the indentation issue by making sure that the first line of the string you return includes the same amount of leading indentation whitespace as the remaining lines.
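One possible reading of that suggestion, as a sketch: after stripping "array(" from the repr, the continuation lines still carry the indentation that used to sit under the prefix, while the first line has none. Padding the first line by the prefix length gives every line the same margin, which textwrap.dedent can then remove. The helper name strip_array_wrapper is hypothetical, and the input below is a literal string shaped like numpy's repr of a small 2-D array so the example runs without numpy.

```python
import textwrap


def strip_array_wrapper(array_repr, prefix="array(", suffix=")"):
    """Drop the wrapper from an array repr and normalize its indentation.

    Hypothetical helper for illustration; not part of prettyprinter or numpy.
    """
    if not (array_repr.startswith(prefix) and array_repr.endswith(suffix)):
        raise ValueError("expected a repr of the form 'array(...)'")
    body = array_repr[len(prefix):-len(suffix)]
    # Give the first line the same leading whitespace the other lines
    # already have, so all lines share one common margin.
    padded = " " * len(prefix) + body
    # Remove that common margin; relative indentation is preserved.
    return textwrap.dedent(padded)


sample = "array([[1, 2],\n       [3, 4]])"
print(strip_array_wrapper(sample))
```

The result starts at column zero with its internal alignment intact, so whatever indentation the surrounding structure needs can be applied uniformly to every line. Reprs with a trailing dtype=... are not treated specially here.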
Thanks for the detailed writeup, I got a reasonably working version in #50. Personally, I'm not too bothered by the lack of coloring for the numbers: for a large "field" of identically-typed numbers, coloring is less relevant (as opposed to complex, non-uniform constructs, for which coloring is very useful). What I appreciate the most from prettyprinter is that it manages to keep the repr properly indented even when the array is part of a larger structure (which requires additional indent levels).
Description
In #47 I added prettyprinting for numpy arrays essentially by converting them to nested lists, but this is unsatisfactory for multidimensional arrays. Indeed, numpy's array repr inserts additional spaces in order to align the elements, greatly improving legibility.
Compare with prettyprinter's current output:
(note the alignment of the second and third columns; the difference in float precision comes from the np.set_printoptions(precision=...) setting which is also lost after conversion to nested lists).
I think a better approach would be to use numpy's array repr, strip out the leading "array(" and trailing ")", and insert the rest of the list "as is", with all spaces, into prettyprinter's machinery (this would, of course, allow one to benefit from prettyprinter's logic when printing arrays nested in other values, etc.). The best I could come up with so far relies on an intermediate wrapper class:
but the indentation is wrong:
Do you have a better approach to suggest, or any hints on properly inserting a literal string repr into prettyprinter?
Thanks in advance.