Open ZacWarham opened 3 months ago
This has to do with the format code used in the display logic: https://github.com/PrairieLearn/PrairieLearn/blob/master/apps/prairielearn/elements/pl-dataframe/pl-dataframe.py#L132
Is there a reason to avoid using this? The format code is designed to make the numbers easier to read (avoids displaying excessive zeros).
"Easier to read" is very subjective. It can be harder for people (like myself) to have to convert these numbers in their head for comparisons and equations, particularly when writing on paper.
So, in theory it seems like it should already be possible to control this, or in the worst case you could fork the element for your course, but I ran into an issue trying it out.
If you omit the `digits` attribute on `pl-dataframe` (causing `num_digits = None` on the Python side), then it looks like that code path is skipped and large values are no longer displayed in scientific notation. However, when I tried it out, very small values (e.g. several zeroes after the decimal point followed by some digits) were simply truncated to 0, and for some reason 6 decimal places were still displayed as a fixed precision. Looking around briefly, I couldn't tell where exactly the truncation and the 6-digit default are being set. Pandas' `to_dict` does seem to be preserving float64.
For illustration, I messed with the input data for the example course `element/dataframe` question:
The red value is `0.000000001184`, which is being truncated. The green value was entered as `84740000000000000000.00` and preserved without scientific notation or truncation. The six decimal places (orange) are also being enforced somewhere though.
I should add that setting `digits="4"` does result in showing 1.184e-09 and 8.474e+19, so it looks like the truncation is not happening during the parsing stage but somewhere later in the formatting.
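For anyone wanting to reproduce those outputs outside PrairieLearn, here is a minimal sketch using only Python's built-in format specs. It assumes (this is not confirmed from the element source) that the `digits` path boils down to a general `"g"` format and that the no-`digits` path ends up as a fixed `"f"` format with 6 decimal places.

```python
# The two values from the comment above.
small = 0.000000001184
large = 84740000000000000000.00

# Assumed equivalent of digits="4": general format, 4 significant digits.
general_4 = [format(v, ".4g") for v in (small, large)]

# Assumed equivalent of omitting digits: fixed format, 6 decimal places.
fixed_6 = [format(v, ".6f") for v in (small, large)]

print(general_4)  # ['1.184e-09', '8.474e+19']
print(fixed_6)    # ['0.000000', '84740000000000000000.000000']
```

This matches what was observed: the `"g"` code switches to scientific notation for both values, while 6 fixed decimal places keep the large value intact but round the small one to zero.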
Okay, it looks like the precision of 6 is just the default built into the Pandas styler. ~(However, changing `pd.options.display.precision` doesn't seem to influence it as described. It would still have to be set as a format on the instantiated style here.)~ [Edit: @tdy pointed out that the correct global setting would be `pd.options.styler.format.precision`; I misinterpreted the doc somehow.] This would only affect the unstyled columns though (the float columns when `digits` is omitted).
[Since the precision was not the original point raised...] It seems like you'd have to just fork the element in order to change the "g" general formatter to "f" for fixed precision.
Or, maybe you'd like to PR a new feature for this.
@echuber2 Is the consensus here just that adding an option to use fixed precision formatting would resolve this issue? That's a pretty easy change to make, happy to open a PR that does this.
@eliotwrobson If I'm recalling the details correctly then it seems that feature would help, and also, we'd need to make sure the digits attribute is still being respected.
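To make the proposal concrete, here is a small sketch of what such an option could look like. `format_number` and its `fixed` flag are hypothetical names for illustration only; they are not part of `pl-dataframe`, and the real change might be shaped differently.

```python
def format_number(value: float, digits: int, fixed: bool = False) -> str:
    """Hypothetical helper: format a float with either "g" or "f".

    digits is still respected in both modes, but it means significant
    digits for "g" and decimal places for "f".
    """
    code = "f" if fixed else "g"
    return f"{value:.{digits}{code}}"


print(format_number(1234567.891, 4))              # 1.235e+06
print(format_number(1234567.891, 4, fixed=True))  # 1234567.8910
```

One thing to keep in mind with this design: because `digits` counts decimal places in fixed mode, a value like `0.000000001184` would still display as `0.0000` with `digits=4` and only appears in full once `digits` is at least 12.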
Is there a way to stop dataframes from converting to scientific notation? Digits does not seem to solely control this.