Closed by tuscland 1 month ago
I suppose this should be coupled with a tooltip that shows the value at full precision?
Sounds like a great idea 👍 @anasstarfaoui wdyt?
Hey, everything sounds super nice! I just have one concern: what about that specific case where the user wants 3 digits for an element, and 2 for the other? I feel like making it global could be potentially restrictive and will not follow the pattern of flexibility that we tried to initiate in other parts of skore :)
What about maybe having preferences per element? So it would be fully modular.
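To make the per-element idea concrete, here is a minimal sketch of a lookup that falls back to a global default. The names `DEFAULT_PRECISION`, `overrides`, and `precision_for`, as well as the element names, are hypothetical and are not part of skore.

```python
# Hypothetical global default, used when an element has no preference of its own.
DEFAULT_PRECISION = 3

# Hypothetical per-element overrides, keyed by element name.
overrides = {"accuracy": 2}


def precision_for(element: str) -> int:
    """Return the element's own precision, or the global default."""
    return overrides.get(element, DEFAULT_PRECISION)


print(precision_for("accuracy"))  # 2 (per-element override)
print(precision_for("loss"))      # 3 (global default)
```

With this shape, a global setting and per-element flexibility do not exclude each other: the global value is only the fallback.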
At the time, I worked on a solution that tackled this pain point:
@MarieS-WiMLDS can you tell us how it works in popular tools? For example, when you display a DataFrame in Jupyter, what is the default behavior and can it be customized?
In numpy, you can use numpy.set_printoptions:
import numpy as np
np.set_printoptions(precision=4)
np.array([1.123456789])
which returns: array([1.1235])
These are configuration parameters for all arrays, but you can individually round a single array (and not the others):
np.round(my_array, 2)
Note that set_printoptions can also summarize long arrays (related to https://github.com/probabl-ai/skore/issues/393):
np.set_printoptions(threshold=5)
np.arange(10)
which returns: array([0, 1, 2, ..., 7, 8, 9])
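If the goal is a display precision for one array only, without rounding the data or changing the global options, something like the following might work. This is a sketch with made-up values; it uses `np.array2string` and the `np.printoptions` context manager.

```python
import numpy as np

a = np.array([1.123456789, 2.987654321])

# Format a single array with 2 decimals; the underlying data is not modified.
two_digits = np.array2string(a, precision=2)

# Change the print options only inside the `with` block.
with np.printoptions(precision=4):
    four_digits = str(a)

print(two_digits)   # [1.12 2.99]
print(four_digits)  # [1.1235 2.9877]
print(a[0])         # the stored value still has full precision
```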
You have the equivalent in pandas, see Options and settings, for example:
import pandas as pd
pd.set_option("display.max_rows", 999)
pd.set_option("display.precision", 5)
These are configuration parameters for all dataframes, but you can individually round a single dataframe (and not the others):
my_df.round(2)
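The two approaches differ in one important way: the display option only changes how values are rendered, while round() changes the values themselves. A small sketch with made-up data, using `pd.option_context` to keep the option change local:

```python
import pandas as pd

df = pd.DataFrame({"score": [0.123456789, 0.987654321]})

# Display-only: limit the rendered precision inside the block; data is untouched.
with pd.option_context("display.precision", 3):
    rendered = df.to_string()

# Value-changing: round() returns a new DataFrame holding rounded values.
rounded = df.round(2)

print(rendered)
print(rounded["score"].tolist())  # [0.12, 0.99]
print(df["score"].tolist())       # original values keep full precision
```

The display-only route is the one that stays compatible with a tooltip showing the full value, since nothing is lost in the data.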
I think it should be a parameter in the UX before thinking of a programmatic way.
@rouk1 there is an interesting discussion on mlflow, this comment.
Basically:
So I believe this issue should be more specific. Closing until we have more information.
Numbers in DataFrame tables have too many digits. This results in a poor presentation.
I would not recommend relying on the browser locale to format things. It would make things difficult to compare for teams that work in international environments (some will have a comma as the decimal separator, and others will have a dot). The tool we are building is suited to data-savvy users.
So, we can offer a parameter to specify number formatting as project-level settings. The default format should allow for a precision of 3 digits.
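As a rough illustration of locale-independent formatting with a default precision of 3, here is a minimal Python sketch. `NumberFormatSettings` and `format_number` are hypothetical names, not skore APIs, and "precision" is assumed here to mean digits after the decimal point. Only the precision setting is modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumberFormatSettings:
    """Hypothetical project-level settings; the names are illustrative only."""

    precision: int = 3  # assumed to mean digits after the decimal point


def format_number(
    value: float, settings: NumberFormatSettings = NumberFormatSettings()
) -> str:
    # Python's format spec always uses "." as the decimal separator,
    # regardless of the system or browser locale.
    return f"{value:.{settings.precision}f}"


print(format_number(3.14159265))                                     # 3.142
print(format_number(3.14159265, NumberFormatSettings(precision=2)))  # 3.14
```

Because formatting produces a string and leaves the number untouched, the full-precision value remains available for something like a tooltip.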
Here is a suggested approach for formatting numbers:
Which would result in 3 project-level settings:
I'm open to improved or different ideas.