Open wjandrea opened 3 months ago
cc @attack68
Method chaining and returning self
has been the standard design for Styler for quite a bit more than 5 years. It doesn't just affect this method but many others. Also, adding the method call to a todo
object stored on the Styler for lazy evaluation is by far the most efficient means of performing this function.
I appreciate your perspective but do not believe that everyone shares it, particularly since your reported error here is based on the re-run of an edited Jupyter cell, and the method remains in memory. This error would not appear if you ran this in a script, for example, or cleared and re-ran all the Jupyter cells. Therefore this isn't a bug, but is in fact working as expected and as currently documented.
Copying a Styler may be very inefficient depending on what other parameters have been configured. Also the code for copying a styler is quite complex, so returning a copy is not a preferable solution from a code maintenance perspective, or from a user perspective if they use many functions in chain.
But the main reason this isn't a suitable change is that it would fundamentally change and break the design structure for all of the users of pandas who have already implemented programs based on its current design.
Many thanks for your feedback and post to the board.
@attack68
this isn't a bug, but is in fact working as expected and as currently documented.
Where is it documented that Styler methods modify self
? I looked at Style in the API reference and Table Visualization in the user guide, especially § Limitations — and .map_index() of course.
Secondarily, some clarifications:
This error would not appear if you ran this in a script, for example, or cleared and re-ran all the Jupyter cells.
Yes, I realize that, but lots of people use Pandas interactively and re-running everything isn't always reasonable.
adding the method call to a
todo
object stored on the Styler for lazy evaluation is by far the most efficient means of performing this function.
Absolutely, I never said anything against that.
the method remains in memory.
It remains in memory because the Styler is holding a reference to it, which I'm saying it shouldn't, ideally.
It remains in memory because the Styler is holding a reference to it, which I'm saying it shouldn't, ideally.
For arguments sake, suppose I agree with you. This is first issue of its kind in probably 7 years. It came about becuase your code had a manually imposed bug which persisted in memory of a specific IDE. If your proposal was implemented, even with appropriate breaking changes release notices, I guarantee there would be far more issues and users posting about their Styler workflows breaking 'unexpectedly'. Even if this is theoretically sensible, it is not practical for existing users.
@attack68 I'm not looking for a debate, I just want to know where it's documented.
I noticed this by messing up the callback signature, but when I fixed the problem, the error kept happening.
Example setup:
I meant to write this (makes the index values red):
But accidentally wrote this (returning a list):
This raises
ValueError: too many values to unpack (expected 2)
After fixing the callback and re-running the Jupyter cell, the issue kept appearing, which didn't make any sense to me.
I found the problem by looking at the source code:
https://github.com/pandas-dev/pandas/blob/7acd629fea2a32d1ace93ceab2b62d5f5f9b2d47/pandas/io/formats/style.py#L2001-L2015
This modifies the Styler object's
_todo
attribute and returns the Styler, which is confusing and breaks the command-query separation that Python code normally follows (e.g.list.sort()
doesn't return the sorted list; usesorted(list)
instead).I believe you could fix this by making a copy of the Styler object, modifying its
_todo
, and returning the copy..map_index()
applymap_index
instead ofmap_index
)